* question about relative control for sync io using bfq
@ 2021-01-11 13:15 yukuai (C)
  2021-01-14 12:24 ` Yu Kuai
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: yukuai (C) @ 2021-01-11 13:15 UTC
  To: axboe, Ming Lei, hch, linux-block, chenzhou, houtao (A)

Hi,

We found a performance problem:

kernel version: 5.10
disk: ssd
scheduler: bfq
arch: arm64 / x86_64
test param: direct=1, ioengine=psync, bs=4k, rw=randread, numjobs=32

We are using 32 threads here, but the test results show that the IOPS is
the same as with a single thread.

After digging into the problem, I found that the root cause is strange:

bfq_add_request
 bfq_bfqq_handle_idle_busy_switch
  bfq_add_bfqq_busy
   bfq_activate_bfqq
    bfq_activate_requeue_entity
     __bfq_activate_requeue_entity
      __bfq_activate_entity
       if (!bfq_entity_to_bfqq(entity))
        if (!entity->in_groups_with_pending_reqs)
         entity->in_groups_with_pending_reqs = true;
         bfqd->num_groups_with_pending_reqs++

If the test process is not in the root cgroup, num_groups_with_pending_reqs
is incremented after a request is inserted into bfq.

bfq_select_queue
 bfq_better_to_idle
  idling_needed_for_service_guarantees
   bfq_asymmetric_scenario
    return varied_queue_weights || multiple_classes_busy ||
           bfqd->num_groups_with_pending_reqs > 0

After issuing IO to the driver, num_groups_with_pending_reqs is guaranteed
to be nonzero, so bfq won't expire the queue. This is the root cause of
the degradation to single-process performance.

On the other hand, if I set slice_idle to zero, bfq_better_to_idle will
return false early and the problem disappears. However, relative
control will then be inactive.

My question is: is this a known flaw in bfq? If not, since cfq doesn't have
this problem, is there a suitable solution?

Thanks!
Yu Kuai
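The two call chains above can be modelled in a few lines of plain C, outside the kernel. This is only a sketch to make the reasoning concrete: struct sched_model and the model_* functions are hypothetical stand-ins, not kernel definitions, and only the three conditions quoted in the message are represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the fields of struct bfq_data
 * that the quoted return statement reads. */
struct sched_model {
	bool varied_queue_weights;
	bool multiple_classes_busy;
	unsigned int num_groups_with_pending_reqs;
};

/* Models the accounting step from the first call chain: the first pending
 * request of a non-root group bumps the counter exactly once. */
static void model_activate_group(struct sched_model *m,
				 bool *in_groups_with_pending_reqs)
{
	assert(m != NULL && in_groups_with_pending_reqs != NULL);
	if (!*in_groups_with_pending_reqs) {
		*in_groups_with_pending_reqs = true;
		m->num_groups_with_pending_reqs++;
	}
}

/* Models the predicate at the end of the second call chain. */
static bool model_asymmetric_scenario(const struct sched_model *m)
{
	assert(m != NULL);
	return m->varied_queue_weights || m->multiple_classes_busy ||
	       m->num_groups_with_pending_reqs > 0;
}
```

In this model, a single call to model_activate_group() is enough to make model_asymmetric_scenario() return true even when all queue weights are equal and only one class is busy, which matches the behaviour described for a test process outside the root cgroup.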
* [PATCH] bfq: don't check active group if bfq.weight is not changed
@ 2021-01-14 12:24 ` Yu Kuai
  0 siblings, 0 replies; 20+ messages in thread
From: Yu Kuai @ 2021-01-14 12:24 UTC
  To: tj, axboe, paolo.valente
  Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

Now the group scheduling in BFQ depends on the check of active group,
but in most cases group scheduling is not used and the checking
of active group will cause bfq_asymmetric_scenario() and its caller
bfq_better_to_idle() to always return true, so the throughput
will be impacted if the workload doesn't need idle (e.g. random rw)

To fix that, adding check in bfq_io_set_weight_legacy() and
bfq_pd_init() to check whether or not group scheduling is used
(a non-default weight is used). If not, there is no need
to check active group.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/bfq-cgroup.c  | 14 ++++++++++++--
 block/bfq-iosched.c |  8 +++-----
 block/bfq-iosched.h | 19 +++++++++++++++++++
 3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index b791e2041e49..b4ac42c4bd9f 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -505,12 +505,18 @@ static struct blkcg_policy_data *bfq_cpd_alloc(gfp_t gfp)
 	return &bgd->pd;
 }
 
+static inline int bfq_dft_weight(void)
+{
+	return cgroup_subsys_on_dfl(io_cgrp_subsys) ?
+	       CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
+
+}
+
 static void bfq_cpd_init(struct blkcg_policy_data *cpd)
 {
 	struct bfq_group_data *d = cpd_to_bfqgd(cpd);
 
-	d->weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
-		CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
+	d->weight = bfq_dft_weight();
 }
 
 static void bfq_cpd_free(struct blkcg_policy_data *cpd)
@@ -554,6 +560,9 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
 	bfqg->bfqd = bfqd;
 	bfqg->active_entities = 0;
 	bfqg->rq_pos_tree = RB_ROOT;
+
+	if (entity->new_weight != bfq_dft_weight())
+		bfqd_enable_active_group_check(bfqd);
 }
 
 static void bfq_pd_free(struct blkg_policy_data *pd)
@@ -1013,6 +1022,7 @@ static void bfq_group_set_weight(struct bfq_group *bfqg, u64 weight, u64 dev_wei
 		 */
 		smp_wmb();
 		bfqg->entity.prio_changed = 1;
+		bfqd_enable_active_group_check(bfqg->bfqd);
 	}
 }
 
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 9e4eb0fc1c16..1b695de1df95 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -699,11 +699,8 @@ static bool bfq_asymmetric_scenario(struct bfq_data *bfqd,
 		(bfqd->busy_queues[0] && bfqd->busy_queues[2]) ||
 		(bfqd->busy_queues[1] && bfqd->busy_queues[2]);
 
-	return varied_queue_weights || multiple_classes_busy
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
-	       || bfqd->num_groups_with_pending_reqs > 0
-#endif
-		;
+	return varied_queue_weights || multiple_classes_busy ||
+	       bfqd_has_active_group(bfqd);
 }
 
 /*
@@ -6472,6 +6469,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 
 	bfqd->queue_weights_tree = RB_ROOT_CACHED;
 	bfqd->num_groups_with_pending_reqs = 0;
+	bfqd->check_active_group = false;
 
 	INIT_LIST_HEAD(&bfqd->active_list);
 	INIT_LIST_HEAD(&bfqd->idle_list);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 703895224562..216509013012 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -524,6 +524,8 @@ struct bfq_data {
 
 	/* true if the device is non rotational and performs queueing */
 	bool nonrot_with_queueing;
+	/* true if need to check num_groups_with_pending_reqs */
+	bool check_active_group;
 
 	/*
 	 * Maximum number of requests in driver in the last
@@ -1066,6 +1068,17 @@ static inline void bfq_pid_to_str(int pid, char *str, int len)
 }
 
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
+static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd)
+{
+	cmpxchg_relaxed(&bfqd->check_active_group, false, true);
+}
+
+static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
+{
+	return bfqd->check_active_group &&
+	       bfqd->num_groups_with_pending_reqs > 0;
+}
+
 struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
 
 #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
@@ -1085,6 +1098,12 @@ struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
 } while (0)
 
 #else /* CONFIG_BFQ_GROUP_IOSCHED */
+static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd) {}
+
+static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
+{
+	return false;
+}
 
 #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
 	char pid_str[MAX_PID_STR_LENGTH];				\
-- 
2.25.4
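The gating idea in the patch above can be sketched in standalone C as follows. This is an illustration under stated assumptions rather than the kernel code: struct gate_model and the model_* helpers are hypothetical names, and the weight comparison only mirrors the bfq_pd_init() hunk (a weight that differs from the default turns the check on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct bfq_data fields the patch combines. */
struct gate_model {
	bool check_active_group;                    /* starts false, is only ever set to true */
	unsigned int num_groups_with_pending_reqs;
};

/* Mirrors the comparison added to bfq_pd_init(): a non-default weight
 * means group scheduling is in use, so the active-group check is enabled. */
static void model_set_group_weight(struct gate_model *m, int new_weight,
				   int default_weight)
{
	assert(m != NULL);
	if (new_weight != default_weight)
		m->check_active_group = true;
}

/* Mirrors bfqd_has_active_group(): the counter only matters once the
 * check has been enabled. */
static bool model_has_active_group(const struct gate_model *m)
{
	assert(m != NULL);
	return m->check_active_group && m->num_groups_with_pending_reqs > 0;
}
```

With every group left at the default weight, model_has_active_group() stays false however many groups have pending requests, so in this model the counter alone no longer forces idling.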
* Re: [PATCH] bfq: don't check active group if bfq.weight is not changed
  2021-01-14 12:24 ` Yu Kuai
  (?)
@ 2021-01-15 13:35 ` kernel test robot
  -1 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2021-01-15 13:35 UTC
  To: Yu Kuai, tj, axboe, paolo.valente
  Cc: kbuild-all, cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

Hi Yu,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on v5.11-rc3 next-20210115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yu-Kuai/bfq-don-t-check-active-group-if-bfq-weight-is-not-changed/20210115-112031
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2a2ab6f73f0608cec85e1f15254edc78a75d0366
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yu-Kuai/bfq-don-t-check-active-group-if-bfq-weight-is-not-changed/20210115-112031
        git checkout 2a2ab6f73f0608cec85e1f15254edc78a75d0366
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "__bad_cmpxchg" [block/bfq.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 78467 bytes --]
* Re: [PATCH] bfq: don't check active group if bfq.weight is not changed
  2021-01-14 12:24 ` Yu Kuai
  (?)
@ 2021-01-15 13:35 ` kernel test robot
  -1 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2021-01-15 13:35 UTC
  To: Yu Kuai, tj, axboe, paolo.valente
  Cc: kbuild-all, cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

[-- Attachment #1: Type: text/plain, Size: 2014 bytes --]

Hi Yu,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on v5.11-rc3 next-20210115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yu-Kuai/bfq-don-t-check-active-group-if-bfq-weight-is-not-changed/20210115-112031
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: arm-allyesconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2a2ab6f73f0608cec85e1f15254edc78a75d0366
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yu-Kuai/bfq-don-t-check-active-group-if-bfq-weight-is-not-changed/20210115-112031
        git checkout 2a2ab6f73f0608cec85e1f15254edc78a75d0366
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arm-linux-gnueabi-ld: block/bfq-cgroup.o: in function `bfq_pd_init':
>> bfq-cgroup.c:(.text+0x2f0): undefined reference to `__bad_cmpxchg'
   arm-linux-gnueabi-ld: block/bfq-cgroup.o: in function `bfq_io_set_weight_legacy':
   bfq-cgroup.c:(.text+0x448): undefined reference to `__bad_cmpxchg'
   arm-linux-gnueabi-ld: block/bfq-cgroup.o: in function `bfq_io_set_weight':
   bfq-cgroup.c:(.text+0x1390): undefined reference to `__bad_cmpxchg'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 77904 bytes --]
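All three link errors above trace back to callers of bfqd_enable_active_group_check(), the only place where the patch uses cmpxchg_relaxed(), and the operand there is a bool field. On 32-bit ARM, __bad_cmpxchg is the deliberately undefined symbol that the cmpxchg helpers reference for operand sizes the build configuration does not support, so the likely cause (an inference from the report, not something the robot states) is the one-byte operand. Since the flag only ever moves from false to true, a plain store would express the same intent. The sketch below shows that idea as a userspace analogy with C11 atomics; it is not a proposed kernel fix, and struct flag_model and the model_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the single flag the patch adds to struct bfq_data. */
struct flag_model {
	atomic_bool check_active_group;
};

/* One-way enable: storing true is idempotent, so no compare-and-exchange
 * is needed to avoid "double enabling" the flag. */
static void model_enable_check(struct flag_model *m)
{
	assert(m != NULL);
	atomic_store_explicit(&m->check_active_group, true,
			      memory_order_relaxed);
}

/* Relaxed read of the flag, matching the relaxed ordering of the store. */
static bool model_check_enabled(struct flag_model *m)
{
	assert(m != NULL);
	return atomic_load_explicit(&m->check_active_group,
				    memory_order_relaxed);
}
```

Calling model_enable_check() any number of times leaves the flag true, which is the only property the patch appears to rely on.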
* Re: [PATCH] bfq: don't check active group if bfq.weight is not changed
@ 2021-01-22  9:46 ` Paolo Valente
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2021-01-22  9:46 UTC
  To: Yu Kuai; +Cc: Tejun Heo, axboe, cgroups, linux-block, linux-kernel, yi.zhang

> On 14 Jan 2021, at 13:24, Yu Kuai <yukuai3@huawei.com> wrote:
> 
> Now the group scheduling in BFQ depends on the check of active group,
> but in most cases group scheduling is not used and the checking
> of active group will cause bfq_asymmetric_scenario() and its caller
> bfq_better_to_idle() to always return true, so the throughput
> will be impacted if the workload doesn't need idle (e.g. random rw)
> 
> To fix that, adding check in bfq_io_set_weight_legacy() and
> bfq_pd_init() to check whether or not group scheduling is used
> (a non-default weight is used). If not, there is no need
> to check active group.
> 

Hi,
I do like the goal you want to attain. Still, I see a problem with
your proposal. Consider two groups, say A and B. Suppose that both
have the same, default weight. Yet, group A generates large I/O
requests, while group B generates small requests. With your change,
idling would not be performed. This would cause group A to steal
bandwidth from group B, in proportion to how large its requests are
compared with those of group B.

As a possible solution, maybe we would also need a varied_rq_size
flag, similar to the varied_weights flag?

Thoughts?

Thanks for your contribution,
Paolo

> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
> block/bfq-cgroup.c  | 14 ++++++++++++--
> block/bfq-iosched.c |  8 +++-----
> block/bfq-iosched.h | 19 +++++++++++++++++++
> 3 files changed, 34 insertions(+), 7 deletions(-)
> 
> diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
> index b791e2041e49..b4ac42c4bd9f 100644
> --- a/block/bfq-cgroup.c
> +++ b/block/bfq-cgroup.c
> @@ -505,12 +505,18 @@ static struct blkcg_policy_data *bfq_cpd_alloc(gfp_t gfp)
> 	return &bgd->pd;
> }
> 
> +static inline int bfq_dft_weight(void)
> +{
> +	return cgroup_subsys_on_dfl(io_cgrp_subsys) ?
> +	       CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
> +
> +}
> +
> static void bfq_cpd_init(struct blkcg_policy_data *cpd)
> {
> 	struct bfq_group_data *d = cpd_to_bfqgd(cpd);
> 
> -	d->weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
> -		CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
> +	d->weight = bfq_dft_weight();
> }
> 
> static void bfq_cpd_free(struct blkcg_policy_data *cpd)
> @@ -554,6 +560,9 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
> 	bfqg->bfqd = bfqd;
> 	bfqg->active_entities = 0;
> 	bfqg->rq_pos_tree = RB_ROOT;
> +
> +	if (entity->new_weight != bfq_dft_weight())
> +		bfqd_enable_active_group_check(bfqd);
> }
> 
> static void bfq_pd_free(struct blkg_policy_data *pd)
> @@ -1013,6 +1022,7 @@ static void bfq_group_set_weight(struct bfq_group *bfqg, u64 weight, u64 dev_wei
> 		 */
> 		smp_wmb();
> 		bfqg->entity.prio_changed = 1;
> +		bfqd_enable_active_group_check(bfqg->bfqd);
> 	}
> }
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 9e4eb0fc1c16..1b695de1df95 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -699,11 +699,8 @@ static bool bfq_asymmetric_scenario(struct bfq_data *bfqd,
> 		(bfqd->busy_queues[0] && bfqd->busy_queues[2]) ||
> 		(bfqd->busy_queues[1] && bfqd->busy_queues[2]);
> 
> -	return varied_queue_weights || multiple_classes_busy
> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
> -	       || bfqd->num_groups_with_pending_reqs > 0
> -#endif
> -		;
> +	return varied_queue_weights || multiple_classes_busy ||
> +	       bfqd_has_active_group(bfqd);
> }
> 
> /*
> @@ -6472,6 +6469,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
> 
> 	bfqd->queue_weights_tree = RB_ROOT_CACHED;
> 	bfqd->num_groups_with_pending_reqs = 0;
> +	bfqd->check_active_group = false;
> 
> 	INIT_LIST_HEAD(&bfqd->active_list);
> 	INIT_LIST_HEAD(&bfqd->idle_list);
> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
> index 703895224562..216509013012 100644
> --- a/block/bfq-iosched.h
> +++ b/block/bfq-iosched.h
> @@ -524,6 +524,8 @@ struct bfq_data {
> 
> 	/* true if the device is non rotational and performs queueing */
> 	bool nonrot_with_queueing;
> +	/* true if need to check num_groups_with_pending_reqs */
> +	bool check_active_group;
> 
> 	/*
> 	 * Maximum number of requests in driver in the last
> @@ -1066,6 +1068,17 @@ static inline void bfq_pid_to_str(int pid, char *str, int len)
> }
> 
> #ifdef CONFIG_BFQ_GROUP_IOSCHED
> +static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd)
> +{
> +	cmpxchg_relaxed(&bfqd->check_active_group, false, true);
> +}
> +
> +static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
> +{
> +	return bfqd->check_active_group &&
> +	       bfqd->num_groups_with_pending_reqs > 0;
> +}
> +
> struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
> 
> #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
> @@ -1085,6 +1098,12 @@ struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
> } while (0)
> 
> #else /* CONFIG_BFQ_GROUP_IOSCHED */
> +static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd) {}
> +
> +static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
> +{
> +	return false;
> +}
> 
> #define bfq_log_bfqq(bfqd, bfqq, fmt, args...)	do {			\
> 	char pid_str[MAX_PID_STR_LENGTH];				\
> -- 
> 2.25.4
> 
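The two-group scenario in the reply above can be put into numbers with a deliberately simplified model. The assumption here, which is mine and not stated in the reply, is that without idling the device ends up serving one request from each group in strict alternation; real dispatch behaviour is more complicated. model_share_a() is a hypothetical helper written only for this illustration.

```c
#include <assert.h>

/* Fraction of the total bandwidth that group A receives if groups A and B
 * are served one request each in strict alternation (simplifying
 * assumption), given the size in bytes of each group's requests. */
static double model_share_a(unsigned int rq_bytes_a, unsigned int rq_bytes_b)
{
	assert(rq_bytes_a > 0 && rq_bytes_b > 0);
	return (double)rq_bytes_a /
	       ((double)rq_bytes_a + (double)rq_bytes_b);
}
```

Under this model, equal request sizes give each group half of the bandwidth, while 64 KiB requests in group A against 4 KiB requests in group B leave group A with 16/17 of it, about 94%, even though both groups have the same weight.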
* Re: [PATCH] bfq: don't check active group if bfq.weight is not changed @ 2021-01-22 9:46 ` Paolo Valente 0 siblings, 0 replies; 20+ messages in thread From: Paolo Valente @ 2021-01-22 9:46 UTC (permalink / raw) To: Yu Kuai Cc: Tejun Heo, axboe-tSWWG44O7X1aa/9Udqfwiw, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-block-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, yi.zhang-hv44wF8Li93QT0dZR+AlfA > Il giorno 14 gen 2021, alle ore 13:24, Yu Kuai <yukuai3-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> ha scritto: > > Now the group scheduling in BFQ depends on the check of active group, > but in most cases group scheduling is not used and the checking > of active group will cause bfq_asymmetric_scenario() and its caller > bfq_better_to_idle() to always return true, so the throughput > will be impacted if the workload doesn't need idle (e.g. random rw) > > To fix that, adding check in bfq_io_set_weight_legacy() and > bfq_pd_init() to check whether or not group scheduling is used > (a non-default weight is used). If not, there is no need > to check active group. > Hi, I do like the goal you want to attain. Still, I see a problem with your proposal. Consider two groups, say A and B. Suppose that both have the same, default weight. Yet, group A generates large I/O requests, while group B generates small requests. With your change, idling would not be performed. This would cause group A to steal bandwidth to group B, in proportion to how large its requests are compared with those of group B. As a possible solution, maybe we would need also a varied_rq_size flag, similar to the varied_weights flag? Thoughts? 
Thanks for your contribution,
Paolo

> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  block/bfq-cgroup.c  | 14 ++++++++++++--
>  block/bfq-iosched.c |  8 +++-----
>  block/bfq-iosched.h | 19 +++++++++++++++++++
>  3 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
> index b791e2041e49..b4ac42c4bd9f 100644
> --- a/block/bfq-cgroup.c
> +++ b/block/bfq-cgroup.c
> @@ -505,12 +505,18 @@ static struct blkcg_policy_data *bfq_cpd_alloc(gfp_t gfp)
>  	return &bgd->pd;
>  }
>
> +static inline int bfq_dft_weight(void)
> +{
> +	return cgroup_subsys_on_dfl(io_cgrp_subsys) ?
> +			CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
> +
> +}
> +
>  static void bfq_cpd_init(struct blkcg_policy_data *cpd)
>  {
>  	struct bfq_group_data *d = cpd_to_bfqgd(cpd);
>
> -	d->weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
> -		CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
> +	d->weight = bfq_dft_weight();
>  }
>
>  static void bfq_cpd_free(struct blkcg_policy_data *cpd)
> @@ -554,6 +560,9 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
>  	bfqg->bfqd = bfqd;
>  	bfqg->active_entities = 0;
>  	bfqg->rq_pos_tree = RB_ROOT;
> +
> +	if (entity->new_weight != bfq_dft_weight())
> +		bfqd_enable_active_group_check(bfqd);
>  }
>
>  static void bfq_pd_free(struct blkg_policy_data *pd)
> @@ -1013,6 +1022,7 @@ static void bfq_group_set_weight(struct bfq_group *bfqg, u64 weight, u64 dev_wei
>  		 */
>  		smp_wmb();
>  		bfqg->entity.prio_changed = 1;
> +		bfqd_enable_active_group_check(bfqg->bfqd);
>  	}
>  }
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 9e4eb0fc1c16..1b695de1df95 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -699,11 +699,8 @@ static bool bfq_asymmetric_scenario(struct bfq_data *bfqd,
>  		(bfqd->busy_queues[0] && bfqd->busy_queues[2]) ||
>  		(bfqd->busy_queues[1] && bfqd->busy_queues[2]);
>
> -	return varied_queue_weights || multiple_classes_busy
> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
> -	       || bfqd->num_groups_with_pending_reqs > 0
> -#endif
> -		;
> +	return varied_queue_weights || multiple_classes_busy ||
> +	       bfqd_has_active_group(bfqd);
>  }
>
>  /*
> @@ -6472,6 +6469,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
>
>  	bfqd->queue_weights_tree = RB_ROOT_CACHED;
>  	bfqd->num_groups_with_pending_reqs = 0;
> +	bfqd->check_active_group = false;
>
>  	INIT_LIST_HEAD(&bfqd->active_list);
>  	INIT_LIST_HEAD(&bfqd->idle_list);
> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
> index 703895224562..216509013012 100644
> --- a/block/bfq-iosched.h
> +++ b/block/bfq-iosched.h
> @@ -524,6 +524,8 @@ struct bfq_data {
>
>  	/* true if the device is non rotational and performs queueing */
>  	bool nonrot_with_queueing;
> +	/* true if need to check num_groups_with_pending_reqs */
> +	bool check_active_group;
>
>  	/*
>  	 * Maximum number of requests in driver in the last
> @@ -1066,6 +1068,17 @@ static inline void bfq_pid_to_str(int pid, char *str, int len)
>  }
>
>  #ifdef CONFIG_BFQ_GROUP_IOSCHED
> +static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd)
> +{
> +	cmpxchg_relaxed(&bfqd->check_active_group, false, true);
> +}
> +
> +static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
> +{
> +	return bfqd->check_active_group &&
> +	       bfqd->num_groups_with_pending_reqs > 0;
> +}
> +
>  struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
>
>  #define bfq_log_bfqq(bfqd, bfqq, fmt, args...) do { \
> @@ -1085,6 +1098,12 @@ struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
>  } while (0)
>
>  #else /* CONFIG_BFQ_GROUP_IOSCHED */
> +static inline void bfqd_enable_active_group_check(struct bfq_data *bfqd) {}
> +
> +static inline bool bfqd_has_active_group(struct bfq_data *bfqd)
> +{
> +	return false;
> +}
>
>  #define bfq_log_bfqq(bfqd, bfqq, fmt, args...) do { \
>  	char pid_str[MAX_PID_STR_LENGTH]; \
> --
> 2.25.4
>
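The two-group scenario raised in the reply above can be illustrated with a small user-space model. This is only a sketch under a stated assumption: with equal weights and no idling, groups A and B are served in strict alternation, one request each per round. The names `struct service` and `serve_alternating` are hypothetical and are not part of BFQ.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes served to each group; a hypothetical helper type for this sketch. */
struct service {
	uint64_t a_bytes;
	uint64_t b_bytes;
};

/*
 * Assumed model (not BFQ code): with equal weights and no idling, groups
 * A and B are served in strict alternation, one request each per round.
 */
static struct service serve_alternating(uint32_t rq_a, uint32_t rq_b,
					uint32_t rounds)
{
	struct service s = { 0, 0 };

	assert(rq_a > 0 && rq_b > 0);
	for (uint32_t i = 0; i < rounds; i++) {
		s.a_bytes += rq_a;	/* one large request from group A */
		s.b_bytes += rq_b;	/* one small request from group B */
	}
	return s;
}
```

Under this model, calling `serve_alternating(256 * 1024, 4 * 1024, 1000)` gives group A 64 times the bytes of group B even though both have the same weight, which is the "in proportion to request size" effect described in the reply.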
* Re: question about relative control for sync io using bfq 2021-01-11 13:15 question about relative control for sync io using bfq yukuai (C) 2021-01-14 12:24 ` Yu Kuai @ 2021-01-16 8:59 ` yukuai (C) 2021-01-16 10:45 ` Paolo Valente 2021-01-22 10:09 ` Paolo Valente 2 siblings, 1 reply; 20+ messages in thread From: yukuai (C) @ 2021-01-16 8:59 UTC (permalink / raw) To: axboe, Ming Lei, hch, linux-block, chenzhou, houtao (A) ping... On 2021/01/11 21:15, yukuai (C) wrote: > Hi, > > We found a performance problem: > > kernel version: 5.10 > disk: ssd > scheduler: bfq > arch: arm64 / x86_64 > test param: direct=1, ioengine=psync, bs=4k, rw=randread, numjobs=32 > > We are using 32 threads here, test results showed that iops is equal > to single thread. > > After digging into the problem, I found root cause of the problem is > strange: > > bfq_add_request > bfq_bfqq_handle_idle_busy_switch > bfq_add_bfqq_busy > bfq_activate_bfq > bfq_activate_requeue_entity > __bfq_activate_requeue_entity > __bfq_activate_entity > if (!bfq_entity_to_bfqq(entity)) > if (!entity->in_groups_with_pending_reqs) > entity->in_groups_with_pending_reqs = true; > bfqd->num_groups_with_pending_reqs++ > > If test process is not in root cgroup, num_groups_with_pending_reqs will > be increased after request was instered to bfq. > > bfq_select_queue > bfq_better_to_idle > idling_needed_for_service_guarantees > bfq_asymmetric_scenario > return varied_queue_weights || multiple_classes_busy || > bfqd->num_groups_with_pending_reqs > 0 > > After issuing IO to driver, num_groups_with_pending_reqs is ensured to > be nonzero, thus bfq won't expire the queue. This is the root cause of > degradating to single-process performance. > > One the other hand, if I set slice_idle to zero, bfq_better_to_idle will > return false early, and the problem will disapear. However, relative > control will be inactive. > > My question is that, is this a known flaw for bfq? 
If not, as cfq don't > have such problem, is there a suitable solution? > > Thanks! > Yu Kuai > such problem, ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: question about relative control for sync io using bfq 2021-01-16 8:59 ` question about relative control for sync io using bfq yukuai (C) @ 2021-01-16 10:45 ` Paolo Valente 0 siblings, 0 replies; 20+ messages in thread From: Paolo Valente @ 2021-01-16 10:45 UTC (permalink / raw) To: yukuai (C); +Cc: Jens Axboe, Ming Lei, hch, linux-block, chenzhou, houtao (A) Hi, give me a few days, unfortunately my time is very limited. Thanks for reporting this interesting problem, Paolo > Il giorno 16 gen 2021, alle ore 09:59, yukuai (C) <yukuai3@huawei.com> ha scritto: > > ping... > > On 2021/01/11 21:15, yukuai (C) wrote: >> Hi, >> We found a performance problem: >> kernel version: 5.10 >> disk: ssd >> scheduler: bfq >> arch: arm64 / x86_64 >> test param: direct=1, ioengine=psync, bs=4k, rw=randread, numjobs=32 >> We are using 32 threads here, test results showed that iops is equal >> to single thread. >> After digging into the problem, I found root cause of the problem is strange: >> bfq_add_request >> bfq_bfqq_handle_idle_busy_switch >> bfq_add_bfqq_busy >> bfq_activate_bfq >> bfq_activate_requeue_entity >> __bfq_activate_requeue_entity >> __bfq_activate_entity >> if (!bfq_entity_to_bfqq(entity)) >> if (!entity->in_groups_with_pending_reqs) >> entity->in_groups_with_pending_reqs = true; >> bfqd->num_groups_with_pending_reqs++ >> If test process is not in root cgroup, num_groups_with_pending_reqs will >> be increased after request was instered to bfq. >> bfq_select_queue >> bfq_better_to_idle >> idling_needed_for_service_guarantees >> bfq_asymmetric_scenario >> return varied_queue_weights || multiple_classes_busy || bfqd->num_groups_with_pending_reqs > 0 >> After issuing IO to driver, num_groups_with_pending_reqs is ensured to >> be nonzero, thus bfq won't expire the queue. This is the root cause of >> degradating to single-process performance. 
>> One the other hand, if I set slice_idle to zero, bfq_better_to_idle will >> return false early, and the problem will disapear. However, relative >> control will be inactive. >> My question is that, is this a known flaw for bfq? If not, as cfq don't >> have such problem, is there a suitable solution? >> Thanks! >> Yu Kuai >> such problem, ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: question about relative control for sync io using bfq 2021-01-11 13:15 question about relative control for sync io using bfq yukuai (C) 2021-01-14 12:24 ` Yu Kuai 2021-01-16 8:59 ` question about relative control for sync io using bfq yukuai (C) @ 2021-01-22 10:09 ` Paolo Valente [not found] ` <7c28a80f-dea9-d701-0399-a22522c4509b@huawei.com> 2 siblings, 1 reply; 20+ messages in thread From: Paolo Valente @ 2021-01-22 10:09 UTC (permalink / raw) To: yukuai (C); +Cc: Jens Axboe, Ming Lei, hch, linux-block, chenzhou, houtao (A) > Il giorno 11 gen 2021, alle ore 14:15, yukuai (C) <yukuai3@huawei.com> ha scritto: > > Hi, > > We found a performance problem: > > kernel version: 5.10 > disk: ssd > scheduler: bfq > arch: arm64 / x86_64 > test param: direct=1, ioengine=psync, bs=4k, rw=randread, numjobs=32 > > We are using 32 threads here, test results showed that iops is equal > to single thread. > > After digging into the problem, I found root cause of the problem is strange: > > bfq_add_request > bfq_bfqq_handle_idle_busy_switch > bfq_add_bfqq_busy > bfq_activate_bfq > bfq_activate_requeue_entity > __bfq_activate_requeue_entity > __bfq_activate_entity > if (!bfq_entity_to_bfqq(entity)) > if (!entity->in_groups_with_pending_reqs) > entity->in_groups_with_pending_reqs = true; > bfqd->num_groups_with_pending_reqs++ > > If test process is not in root cgroup, num_groups_with_pending_reqs will > be increased after request was instered to bfq. > > bfq_select_queue > bfq_better_to_idle > idling_needed_for_service_guarantees > bfq_asymmetric_scenario > return varied_queue_weights || multiple_classes_busy || bfqd->num_groups_with_pending_reqs > 0 > > After issuing IO to driver, num_groups_with_pending_reqs is ensured to > be nonzero, thus bfq won't expire the queue. This is the root cause of > degradating to single-process performance. > > One the other hand, if I set slice_idle to zero, bfq_better_to_idle will > return false early, and the problem will disapear. 
> However, relative control will be inactive.
>
> My question is that, is this a known flaw for bfq? If not, as cfq don't
> have such problem, is there a suitable solution?
>

Hi,
this is a core problem, not of BFQ alone but of any solution that
has to provide bandwidth isolation with sync I/O. One example
is the one I gave you in my other email. At any rate, the problem
that you report seems to occur with just one group. We may think of
simply changing my condition

bfqd->num_groups_with_pending_reqs > 0

to

bfqd->num_groups_with_pending_reqs > 1

If this simple change does solve the problem you report, then I
could run my batch of tests to check whether it causes any
regression.

What do you think?

Thanks,
Paolo

> Thanks!
> Yu Kuai
> such problem,
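The effect of moving the threshold from 0 to 1 can be sketched in user space. The struct below is a simplified stand-in for the three inputs that `bfq_asymmetric_scenario()` combines in the traces quoted above. `struct sched_state` and the `group_threshold` parameter are assumptions made for illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the state consulted by the check (hypothetical). */
struct sched_state {
	bool varied_queue_weights;
	bool multiple_classes_busy;
	unsigned int num_groups_with_pending_reqs;
};

/*
 * group_threshold = 0 models the current condition (> 0);
 * group_threshold = 1 models the proposed condition (> 1).
 */
static bool asymmetric_scenario(const struct sched_state *s,
				unsigned int group_threshold)
{
	assert(s != NULL);
	return s->varied_queue_weights || s->multiple_classes_busy ||
	       s->num_groups_with_pending_reqs > group_threshold;
}
```

With a single active group and no other source of asymmetry, the model reports an asymmetric scenario for threshold 0 but not for threshold 1, so idling would no longer be forced in the single-group case. With two active groups both thresholds still report asymmetry.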
[parent not found: <7c28a80f-dea9-d701-0399-a22522c4509b@huawei.com>]
* Re: question about relative control for sync io using bfq
From: Paolo Valente @ 2021-02-05 7:49 UTC
To: yukuai (C)
Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A)

> On 29 Jan 2021, at 09:28, yukuai (C) <yukuai3@huawei.com> wrote:
>
> Hi,
>
> Thanks for your response, and my apologize for the delay, my tmie
> is very limited recently.
>

I do know that problem ...

> On 2021/01/22 18:09, Paolo Valente wrote:
>> Hi,
>> this is a core problem, not of BFQ but of any possible solution that
>> has to provide bandwidth isolation with sync I/O. One of the examples
>
> I'm not sure about this, so I test it with iocost in mq and cfq in sq,
> result shows that they do can provide bandwidth isolation with sync I/O
> without significant performance degradation.

Yep, that just means that, with your specific workload, bandwidth
isolation gets guaranteed without idling. So that's exactly one of
the workloads for which I'm suggesting my handling of a special case.

>> is the one I made for you in my other email. At any rate, the problem
>> that you report seems to occur with just one group. We may think of
>> simply changing my condition
>> bfqd->num_groups_with_pending_reqs > 0
>> to
>> bfqd->num_groups_with_pending_reqs > 1
>
> We aredy tried this, the problem will dispeare if only one group is
> active. And I think this modification is reasonable because
> bandwidth isolation is not necessary in this case.
>

Thanks for your feedback. I'll consider submitting this change.

> However, considering the common case, when more than one
> group is active, and one of the group is issuing sync IO, I think
> we need to find a way to prevent the preformance degradation.

I agree. What do you think of my suggestion for solving the problem?
Might you help with that?

Thanks,
Paolo

>> If this simple solution does solve the problem you report, then I
>> could run my batch of tests to check whether it causes some
>> regression.
>> What do you think?
>> Thanks.
>> Paolo
>
> Thanks
> Yu Kuai
>> .
* Re: question about relative control for sync io using bfq 2021-02-05 7:49 ` Paolo Valente @ 2021-02-07 12:49 ` yukuai (C) 2021-02-08 19:05 ` Paolo Valente 0 siblings, 1 reply; 20+ messages in thread From: yukuai (C) @ 2021-02-07 12:49 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A) On 2021/02/05 15:49, Paolo Valente wrote: > > >> Il giorno 29 gen 2021, alle ore 09:28, yukuai (C) <yukuai3@huawei.com> ha scritto: >> >> Hi, >> >> Thanks for your response, and my apologize for the delay, my tmie >> is very limited recently. >> > > I do know that problem ... > >> On 2021/01/22 18:09, Paolo Valente wrote: >>> Hi, >>> this is a core problem, not of BFQ but of any possible solution that >>> has to provide bandwidth isolation with sync I/O. One of the examples >> >> I'm not sure about this, so I test it with iocost in mq and cfq in sq, >> result shows that they do can provide bandwidth isolation with sync I/O >> without significant performance degradation. > > Yep, that means just that, with your specific workload, bandwidth > isolation gets guaranteed without idling. So that's exactly one of > the workloads for which I'm suggesting my handling of a special case. > > >>> is the one I made for you in my other email. At any rate, the problem >>> that you report seems to occur with just one group. We may think of >>> simply changing my condition >>> bfqd->num_groups_with_pending_reqs > 0 >>> to >>> bfqd->num_groups_with_pending_reqs > 1 >> >> We aredy tried this, the problem will dispeare if only one group is >> active. And I think this modification is reasonable because >> bandwidth isolation is not necessary in this case. >> > > Thanks for your feedback. I'll consider submitting this change. > >> However, considering the common case, when more than one >> group is active, and one of the group is issuing sync IO, I think >> we need to find a way to prevent the preformance degradation. > > I agree. 
> What do you think of my suggestion for solving the problem?
> Might you help with that?

Hi,

Do you mean the suggestion that you mentioned in another email:
"a varied_rq_size flag, similar to the varied_weights flag"?
I'm afraid that's just a circumvention plan, not a solution to the
special case.

By the way, I'm glad to help with anything I can; however, it will
have to wait a few days because the Spring Festival is coming.

Thanks,
Yu Kuai

>
> Thanks,
> Paolo
>
>>> If this simple solution does solve the problem you report, then I
>>> could run my batch of tests to check whether it causes some
>>> regression.
>>> What do you think?
>>> Thanks.
>>> Paolo
>>
>> Thanks
>> Yu Kuai
>>> .
>
> .
>
* Re: question about relative control for sync io using bfq 2021-02-07 12:49 ` yukuai (C) @ 2021-02-08 19:05 ` Paolo Valente 2021-02-19 12:03 ` yukuai (C) 0 siblings, 1 reply; 20+ messages in thread From: Paolo Valente @ 2021-02-08 19:05 UTC (permalink / raw) To: yukuai (C) Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A) > Il giorno 7 feb 2021, alle ore 13:49, yukuai (C) <yukuai3@huawei.com> ha scritto: > > > On 2021/02/05 15:49, Paolo Valente wrote: >>> Il giorno 29 gen 2021, alle ore 09:28, yukuai (C) <yukuai3@huawei.com> ha scritto: >>> >>> Hi, >>> >>> Thanks for your response, and my apologize for the delay, my tmie >>> is very limited recently. >>> >> I do know that problem ... >>> On 2021/01/22 18:09, Paolo Valente wrote: >>>> Hi, >>>> this is a core problem, not of BFQ but of any possible solution that >>>> has to provide bandwidth isolation with sync I/O. One of the examples >>> >>> I'm not sure about this, so I test it with iocost in mq and cfq in sq, >>> result shows that they do can provide bandwidth isolation with sync I/O >>> without significant performance degradation. >> Yep, that means just that, with your specific workload, bandwidth >> isolation gets guaranteed without idling. So that's exactly one of >> the workloads for which I'm suggesting my handling of a special case. >>>> is the one I made for you in my other email. At any rate, the problem >>>> that you report seems to occur with just one group. We may think of >>>> simply changing my condition >>>> bfqd->num_groups_with_pending_reqs > 0 >>>> to >>>> bfqd->num_groups_with_pending_reqs > 1 >>> >>> We aredy tried this, the problem will dispeare if only one group is >>> active. And I think this modification is reasonable because >>> bandwidth isolation is not necessary in this case. >>> >> Thanks for your feedback. I'll consider submitting this change. 
>>> However, considering the common case, when more than one
>>> group is active, and one of the group is issuing sync IO, I think
>>> we need to find a way to prevent the preformance degradation.
>> I agree. What do you think of my suggestion for solving the problem?
>> Might you help with that?
>
> Hi
>
> Do you mead the suggestion that you mentioned in another email:
> "a varied_rq_size flag, similar to the varied_weights flag" ?
> I'm afraid that's just a circumvention plan, not a solution to the
> special case.
>

I'm a little confused. Could you explain why you think this is a
circumvention plan? Maybe even better, could you describe in detail
the special case you have in mind? We could start from there, to think
of a possible, satisfactory solution.

> By the way, I'm glad if there is anything I can help, however it'll
> wait for a few days cause the Spring Festival is coming.
>

Ok, Happy Spring Festival then.

Thanks.
Paolo

> Thanks,
> Yu Kuai
>
>> Thanks,
>> Paolo
>>>> If this simple solution does solve the problem you report, then I
>>>> could run my batch of tests to check whether it causes some
>>>> regression.
>>>> What do you think?
>>>> Thanks.
>>>> Paolo
>>>
>>> Thanks
>>> Yu Kuai
>>>> .
>> .
* Re: question about relative control for sync io using bfq 2021-02-08 19:05 ` Paolo Valente @ 2021-02-19 12:03 ` yukuai (C) 2021-02-23 10:52 ` Paolo Valente 0 siblings, 1 reply; 20+ messages in thread From: yukuai (C) @ 2021-02-19 12:03 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A) On 2021/02/09 3:05, Paolo Valente wrote: > > >> Il giorno 7 feb 2021, alle ore 13:49, yukuai (C) <yukuai3@huawei.com> ha scritto: >> >> >> On 2021/02/05 15:49, Paolo Valente wrote: >>>> Il giorno 29 gen 2021, alle ore 09:28, yukuai (C) <yukuai3@huawei.com> ha scritto: >>>> >>>> Hi, >>>> >>>> Thanks for your response, and my apologize for the delay, my tmie >>>> is very limited recently. >>>> >>> I do know that problem ... >>>> On 2021/01/22 18:09, Paolo Valente wrote: >>>>> Hi, >>>>> this is a core problem, not of BFQ but of any possible solution that >>>>> has to provide bandwidth isolation with sync I/O. One of the examples >>>> >>>> I'm not sure about this, so I test it with iocost in mq and cfq in sq, >>>> result shows that they do can provide bandwidth isolation with sync I/O >>>> without significant performance degradation. >>> Yep, that means just that, with your specific workload, bandwidth >>> isolation gets guaranteed without idling. So that's exactly one of >>> the workloads for which I'm suggesting my handling of a special case. >>>>> is the one I made for you in my other email. At any rate, the problem >>>>> that you report seems to occur with just one group. We may think of >>>>> simply changing my condition >>>>> bfqd->num_groups_with_pending_reqs > 0 >>>>> to >>>>> bfqd->num_groups_with_pending_reqs > 1 >>>> >>>> We aredy tried this, the problem will dispeare if only one group is >>>> active. And I think this modification is reasonable because >>>> bandwidth isolation is not necessary in this case. >>>> >>> Thanks for your feedback. I'll consider submitting this change. 
>>>> However, considering the common case, when more than one
>>>> group is active, and one of the group is issuing sync IO, I think
>>>> we need to find a way to prevent the preformance degradation.
>>> I agree. What do you think of my suggestion for solving the problem?
>>> Might you help with that?
>>
>> Hi
>>
>> Do you mead the suggestion that you mentioned in another email:
>> "a varied_rq_size flag, similar to the varied_weights flag" ?
>> I'm afraid that's just a circumvention plan, not a solution to the
>> special case.
>>
>
> I'm a little confused. Could you explain why you think this is a
> circumvention plan? Maybe even better, could you describe in detail
> the special case you have in mind? We could start from there, to think
> of a possible, satisfactory solution.
>

Hi,

First of all, there are two conditions that trigger the problem in bfq:
a. sync IO is issued concurrently. (I was testing in one cgroup, and
   I think multiple cgroups behave the same.)
b. the IO is not issued from the root cgroup.

The symptom is that performance degrades to that of a single
process. The reason is that the bfq_queue never expires until
BUDGET_TIMEOUT, since num_groups_with_pending_reqs is always
nonzero after IO has been issued to the driver, which means that only
one request is in progress (during D2C) at any given moment.

I was trying to skip the check of active groups if bfq.weight is not
changed. This is implemented by adding a variable 'check_active_group',
which is only set to true if bfq.weight is changed.

This approach works fine if the weight stays unchanged, but it is
no longer effective once the weight has ever been changed. This is why
I said it's a circumvention plan.

By the way, while testing with cfq, I found that the currently active
cfq queue can be preempted in this special case. I wonder if it's worth
adopting this in bfq.

Thanks,
Yu Kuai

>
>> By the way, I'm glad if there is anything I can help, however it'll
>> wait for a few days cause the Spring Festival is coming.
>> > > Ok, Happy Spring Festival then. > > Thanks. > Paolo > >> Thanks, >> Yu Kuai >> >>> Thanks, >>> Paolo >>>>> If this simple solution does solve the problem you report, then I >>>>> could run my batch of tests to check whether it causes some >>>>> regression. >>>>> What do you think? >>>>> Thanks. >>>>> Paolo >>>> >>>> Thanks >>>> Yu Kuai >>>>> . >>> . > > . > ^ permalink raw reply [flat|nested] 20+ messages in thread
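The limitation described in the message above (the flag is set on the first weight change and never cleared) can be shown with a small user-space model of the patch's `check_active_group` logic. This is a sketch, not the kernel code: `struct bfqd_model`, `struct group_model`, `group_set_weight` and the `DEFAULT_WEIGHT` value are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_WEIGHT 100	/* assumed default weight for this sketch */

/* Hypothetical per-device state, reduced to the two fields involved. */
struct bfqd_model {
	bool check_active_group;
	unsigned int num_groups_with_pending_reqs;
};

/* Hypothetical group with a back-pointer to its device state. */
struct group_model {
	struct bfqd_model *bfqd;
	unsigned int weight;
};

/* Mirrors the idea of the patch: a weight change turns the check on. */
static void group_set_weight(struct group_model *g, unsigned int weight)
{
	assert(g != NULL && g->bfqd != NULL);
	if (weight != g->weight) {
		g->weight = weight;
		/* Sticky: nothing in this scheme ever clears the flag. */
		g->bfqd->check_active_group = true;
	}
}

static bool has_active_group(const struct bfqd_model *d)
{
	return d->check_active_group &&
	       d->num_groups_with_pending_reqs > 0;
}
```

In this model, a device with one busy group reports no active group until some weight is changed. After that, restoring the default weight does not switch the check off again, so the forced idling returns for good. That is the sense in which the approach only circumvents the problem.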
* Re: question about relative control for sync io using bfq 2021-02-19 12:03 ` yukuai (C) @ 2021-02-23 10:52 ` Paolo Valente 2021-02-24 9:30 ` yukuai (C) 0 siblings, 1 reply; 20+ messages in thread From: Paolo Valente @ 2021-02-23 10:52 UTC (permalink / raw) To: yukuai (C) Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A) > Il giorno 19 feb 2021, alle ore 13:03, yukuai (C) <yukuai3@huawei.com> ha scritto: > > > On 2021/02/09 3:05, Paolo Valente wrote: >>> Il giorno 7 feb 2021, alle ore 13:49, yukuai (C) <yukuai3@huawei.com> ha scritto: >>> >>> >>> On 2021/02/05 15:49, Paolo Valente wrote: >>>>> Il giorno 29 gen 2021, alle ore 09:28, yukuai (C) <yukuai3@huawei.com> ha scritto: >>>>> >>>>> Hi, >>>>> >>>>> Thanks for your response, and my apologize for the delay, my tmie >>>>> is very limited recently. >>>>> >>>> I do know that problem ... >>>>> On 2021/01/22 18:09, Paolo Valente wrote: >>>>>> Hi, >>>>>> this is a core problem, not of BFQ but of any possible solution that >>>>>> has to provide bandwidth isolation with sync I/O. One of the examples >>>>> >>>>> I'm not sure about this, so I test it with iocost in mq and cfq in sq, >>>>> result shows that they do can provide bandwidth isolation with sync I/O >>>>> without significant performance degradation. >>>> Yep, that means just that, with your specific workload, bandwidth >>>> isolation gets guaranteed without idling. So that's exactly one of >>>> the workloads for which I'm suggesting my handling of a special case. >>>>>> is the one I made for you in my other email. At any rate, the problem >>>>>> that you report seems to occur with just one group. We may think of >>>>>> simply changing my condition >>>>>> bfqd->num_groups_with_pending_reqs > 0 >>>>>> to >>>>>> bfqd->num_groups_with_pending_reqs > 1 >>>>> >>>>> We aredy tried this, the problem will dispeare if only one group is >>>>> active. 
And I think this modification is reasonable because >>>>> bandwidth isolation is not necessary in this case. >>>>> >>>> Thanks for your feedback. I'll consider submitting this change. >>>>> However, considering the common case, when more than one >>>>> group is active, and one of the group is issuing sync IO, I think >>>>> we need to find a way to prevent the preformance degradation. >>>> I agree. What do you think of my suggestion for solving the problem? >>>> Might you help with that? >>> >>> Hi >>> >>> Do you mead the suggestion that you mentioned in another email: >>> "a varied_rq_size flag, similar to the varied_weights flag" ? >>> I'm afraid that's just a circumvention plan, not a solution to the >>> special case. >>> >> I'm a little confused. Could you explain why you think this is a >> circumvention plan? Maybe even better, could you describe in detail >> the special case you have in mind? We could start from there, to think >> of a possible, satisfactory solution. > Hi, > > First of all, there are two conditions to trigger the problem in bfq: > a. issuing sync IO concurrently. (I was testing in one cgroup, and > I think multiple cgroups is the same.) > b. not issuing in root cgroup. > > The phenomenon is that the performance will degradated to single > process. The reason is that bfq_queue will never expired untill > BUDGET_TIMEOUT since num_groups_with_pending_reqs will always be > nonzero after issuing io to driver, which means that there is only > one request in progress(during D2C) at any given moment. > > I was trying to skip the checking of active group if bfq.weight is not > changed, and it's implemented by adding a varible 'check_active_group', > it'll only set to true if bfq.weight is changed. > > This approach will work fine is the weight stay unchanged, and will > not be effective if the weight ever changed. This is why I said it's > a circumvention plan. 
>
> By the way, while testing with cfq, I found that the current active cfq
> queue can be preempted in such special case. I wonder if it's worth
> referring to bfq.
>

Hi,
thank you very much for this information. You confirm my suspicion:
your case is one of those for which idling is not needed. It only
kills throughput. Unfortunately, the preemption mechanism of cfq that
you cite breaks service guarantees in asymmetric cases. That's why I
have never added it to bfq.

Turning to solutions, now I also understand why you speak about a
circumvention plan. Yet the solution I described is not yours, and is
effective with any history of weight changes.

In a sense, my solution is an evolution of your initial solution. It
is conceptually very simple: just track whether all weights and all
I/O sizes are equal. When this condition holds true, no idling is
needed. When it does not hold, idling is needed. Your case would be
solved, service guarantees would be always preserved.

Thanks.
Paolo

> Thanks,
> Yu Kuai
>
>>> By the way, I'm glad if there is anything I can help, however it'll
>>> wait for a few days cause the Spring Festival is coming.
>>>
>> Ok, Happy Spring Festival then.
>> Thanks.
>> Paolo
>>> Thanks,
>>> Yu Kuai
>>>
>>>> Thanks,
>>>> Paolo
>>>>>> If this simple solution does solve the problem you report, then I
>>>>>> could run my batch of tests to check whether it causes some
>>>>>> regression.
>>>>>> What do you think?
>>>>>> Thanks.
>>>>>> Paolo
>>>>>
>>>>> Thanks
>>>>> Yu Kuai
>>>>>> .
>>>> .
>> .
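The rule proposed in the message above (idle only when weights or I/O sizes differ) can be sketched as a predicate over the currently busy entities. This is an illustration only. The thread does not say how the condition would be tracked, and a real scheduler would presumably maintain it incrementally rather than scanning an array. `struct busy_entity` and `idling_needed` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one busy queue or group. */
struct busy_entity {
	unsigned int weight;
	unsigned int rq_size;	/* typical request size, in bytes */
};

/*
 * Returns true if any busy entity differs from the first one in weight
 * or request size, i.e. the scenario is asymmetric and idling is needed.
 * With zero or one busy entity there is nothing to compare.
 */
static bool idling_needed(const struct busy_entity *e, size_t n)
{
	assert(e != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		if (e[i].weight != e[0].weight ||
		    e[i].rq_size != e[0].rq_size)
			return true;
	}
	return false;
}
```

For the reported workload (many readers, same weight, all 4k requests) the predicate returns false, so no idling would be forced. For the earlier two-group example with equal weights but different request sizes it returns true, so the service guarantee is kept.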
* Re: question about relative control for sync io using bfq
From: yukuai (C) @ 2021-02-24 9:30 UTC
To: Paolo Valente
Cc: Jens Axboe, Ming Lei, Christoph Hellwig, linux-block, chenzhou, houtao (A)

On 2021/02/23 18:52, Paolo Valente wrote:

> In a sense, my solution is an evolution of your initial solution. It
> is conceptually very simple: just track whether all weights and all
> I/O sizes are equal. When this condition holds true, no idling is
> needed. When it does not hold, idling is needed. Your case would be
> solved, service guarantees would be always preserved.

Hi,

The solution sounds good, and it's true that my case would be solved.

Thank you very much!
Yu Kuai
end of thread, other threads:[~2021-02-24 9:31 UTC | newest]

Thread overview: 20+ messages
2021-01-11 13:15 question about relative control for sync io using bfq yukuai (C)
2021-01-14 12:24 ` [PATCH] bfq: don't check active group if bfq.weight is not changed Yu Kuai
2021-01-14 12:24 ` Yu Kuai
2021-01-15 13:35 ` kernel test robot
2021-01-15 13:35 ` kernel test robot
2021-01-15 13:35 ` kernel test robot
2021-01-15 13:35 ` kernel test robot
2021-01-15 13:35 ` kernel test robot
2021-01-15 13:35 ` kernel test robot
2021-01-22 9:46 ` Paolo Valente
2021-01-22 9:46 ` Paolo Valente
2021-01-16 8:59 ` question about relative control for sync io using bfq yukuai (C)
2021-01-16 10:45 ` Paolo Valente
2021-01-22 10:09 ` Paolo Valente
[not found] ` <7c28a80f-dea9-d701-0399-a22522c4509b@huawei.com>
2021-02-05 7:49 ` Paolo Valente
2021-02-07 12:49 ` yukuai (C)
2021-02-08 19:05 ` Paolo Valente
2021-02-19 12:03 ` yukuai (C)
2021-02-23 10:52 ` Paolo Valente
2021-02-24 9:30 ` yukuai (C)