* [PATCH v4] Provide USF for the portable equipment. @ 2020-08-04 7:50 Dongdong Yang 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang 0 siblings, 1 reply; 8+ messages in thread From: Dongdong Yang @ 2020-08-04 7:50 UTC (permalink / raw) To: gregkh, rjw, viresh.kumar, mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman Cc: linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking From: Dongdong Yang <yangdongdong@xiaomi.com> This patch provides USF(User Sensitive Feedback factor) auxiliary cpufreq governor to support high level layer sysfs inodes setting for utils adjustment purpose from the identified scenario on portable equipment. Because the power consumption and UI response are more cared for by portable equipment users. And the "screen off" status stands for no request from the user, however, the kernel is still expected to notify the user in time on modem, network or powerkey events occur. USF provides "sched_usf_non_ux_r" sysfs inode to cut down the utils from user space tasks according to high level scenario. In addition, it usually hints more cpufreq demand that the preemptive counts of the tasks on the cpu burst and over the user expecting completed time such as the ratio sysctl_sched_latency to sysctl_sched_min_granularity on "screen on" status, which more likely with more UI. The sysfs inodes "sched_usf_up_l0_r" and "sched_usf_down_r" have been provided to adjust the utils according to high level identified scenario to alloc the cpufreq in time. Changes in v4 Based on comments from Greg, Randy and Viresh - Add USF sysfs to ABI - Remove kobj field from usf. - Clean Kconfig left at staging. Changes in v3 Based on comments from Greg, Dietmar, Christoph and Randy - Move usf.c to kernel/sched from staging. - Remove trace_printk and debugfs. - Add document draft. - Update comments. Changes in v2 Based on comments from Steven, Greg, Peter and Dan: - Add adjust_task_pred_set switch. - Move adjust_task_pred_demand declaration into sched.h - Update comments. - Clean usf structure. Changes in v1 Initial USF Dongdong Yang (1): sched: Provide USF for the portable equipment. Documentation/ABI/testing/sysfs-devices-system-cpu | 48 ++++ drivers/cpufreq/Kconfig | 11 + include/trace/events/sched.h | 35 +++ kernel/sched/Makefile | 1 + kernel/sched/cpufreq_schedutil.c | 3 + kernel/sched/sched.h | 10 + kernel/sched/usf.c | 294 +++++++++++++++++++++ 7 files changed, 402 insertions(+) create mode 100644 kernel/sched/usf.c -- 2.7.4 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v4] sched: Provide USF for the portable equipment. 2020-08-04 7:50 [PATCH v4] Provide USF for the portable equipment Dongdong Yang @ 2020-08-04 7:50 ` Dongdong Yang 2020-08-04 8:03 ` Greg KH ` (3 more replies) 0 siblings, 4 replies; 8+ messages in thread From: Dongdong Yang @ 2020-08-04 7:50 UTC (permalink / raw) To: gregkh, rjw, viresh.kumar, mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman Cc: linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking From: Dongdong Yang <yangdongdong@xiaomi.com> The power consumption and UI response are more cared for by the portable equipment users. USF(User Sensitive Feedback factor) auxiliary cpufreq governor is providing more utils adjustment settings to the high level by scenario identification. From the view of portable equipment, screen off status usually stands for no request from the user, however, the kernel is still expected to notify the user in time on modem, network or powerkey events occur. In some scenarios, such as listening to music, low power processors, such as DSP, take more actions and CPU load requirements cut down. It would bring more power consumption benefit if high level have interfaces to adjust utils according to the current scenario and load. In addition, the portable equipment user usually heavily interact with devices by touch, and other peripherals. The boost preemptive counts are marking the load requirement urgent, vice versa. If such feedback factor could be set to high level according to the scenario, it would contribute to the power consumption and UI response. If no USF sysfs inode is set, and no screen on or off event, adjust_task_pred_demand shall not be invoked. Once sched_usf_up_l0_r sched_usf_down_r or sched_usf_non_ux_r be set, adjust_task_pred_demand shall be called back to update settings according to high level scenario identification. We can get about 17% mean power consumption save at listening to music with speaker on "screen off" scenario, as below statistical data from 7766 XiaoMi devices for two weeks with sched_usf_non_ux_r be set: day1 day2 day3 day4 count 7766.000000 7766.000000 7766.000000 7766.000000 mean 88.035525 85.500282 83.829305 86.054997 std 111.049980 108.258834 107.562583 108.558240 min 0.099000 0.037000 0.067000 0.045000 25% 34.765500 34.021750 34.101500 34.423000 50% 54.950000 55.286500 54.189500 54.248500 75% 95.954000 93.942000 91.738000 94.0592500 80% 114.675000 107.430000 106.378000 108.673000 85% 137.851000 129.511000 127.156500 131.750750 90% 179.669000 170.208500 164.027000 172.348000 95% 272.395000 257.845500 247.750500 263.275750 98% 399.034500 412.170400 391.484000 402.835600 day5 day6 day7 day8 count 7766.000000 7766.00000 7766.000000 7766.000000 mean 82.532677 79.21923 77.611380 81.075081 std 104.870079 101.34819 103.140037 97.506221 min 0.051000 0.02900 0.007000 0.068000 25% 32.873000 33.44400 31.965500 33.863500 50% 52.180500 51.56550 50.806500 53.080000 75% 90.905750 86.82625 83.859250 89.973000 80% 105.455000 99.64700 97.271000 104.225000 85% 128.300000 118.47825 116.570250 126.648250 90% 166.647500 149.18000 150.649500 161.087000 95% 247.208500 224.36050 226.380000 245.291250 98% 393.002000 347.92060 369.791800 378.778600 day9 day10 day11 day12 count 7766.000000 7766.000000 7766.000000 7766.000000 mean 79.989170 83.859417 78.032930 77.060542 std 104.226122 108.893043 102.561715 99.844276 min 0.118000 0.017000 0.028000 0.039000 25% 32.056250 33.454500 31.176250 30.897750 50% 51.506000 54.056000 48.969500 49.069000 75% 88.513500 92.953500 83.506750 84.096000 80% 102.876000 107.845000 97.717000 98.073000 85% 124.363000 128.288000 118.366500 116.869250 90% 160.557000 167.084000 154.342500 148.187500 95% 231.149000 242.925750 236.759000 228.131250 98% 367.206600 388.619100 385.269100 376.541600 day13 day14 count 7766.000000 7766.000000 mean 75.528036 73.702878 std 90.750594 86.796016 min 0.066000 0.054000 25% 31.170500 31.608500 50% 48.758500 49.215000 75% 84.522750 83.053000 80% 97.879000 94.875000 85% 116.680250 113.573750 90% 149.083500 144.089500 95% 226.177750 211.488750 98% 347.011100 331.317100 Signed-off-by: Dongdong Yang <yangdongdong@xiaomi.com> Co-developed-by: Jun Tao <taojun@xiaomi.com> Co-developed-by: Qiwu Huang <huangqiwu@xiaomi.com> Co-developed-by: Peng Wang <rocking@linux.alibaba.com> --- Documentation/ABI/testing/sysfs-devices-system-cpu | 48 ++++ drivers/cpufreq/Kconfig | 11 + include/trace/events/sched.h | 35 +++ kernel/sched/Makefile | 1 + kernel/sched/cpufreq_schedutil.c | 3 + kernel/sched/sched.h | 10 + kernel/sched/usf.c | 294 +++++++++++++++++++++ 7 files changed, 402 insertions(+) create mode 100644 kernel/sched/usf.c diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index b555df8..e299418 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -614,3 +614,51 @@ Description: SPURR ticks for cpuX when it was idle. This sysfs interface exposes the number of SPURR ticks for cpuX when it was idle. + +What: /sys/devices/system/cpu/sched_usf + /sys/devices/system/cpu/sched_usf/sched_usf_non_ux_r + /sys/devices/system/cpu/sched_usf/sched_usf_up_l0_r + /sys/devices/system/cpu/sched_usf/sched_usf_down_r +Date: Aug 2020 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: User Sensitive Feedback factor auxiliary scheduling which + is providing more utils adjustment settings to the high level + by scenario identification. + sched_usf_non_ux_r: + The ratio of utils is cut down on screen off. The + default value is 0, which no util is adjusted on sugov + calculating utils to select cpufreq. Its range is + [-100 , 0]. If its value falls into [-50, 0), the half + of utils, which calculates cpufreq, shall be cut down. + If its value falls into [-100, -50), only a quarter of + utils are left to continue to calculate cpufreq. + It is expected to be set [-100, 0) once enter into the + identificated scenario, such as listen to music on + screen off, and recover to 0 on out of the scenario, + such as screen on. + + sched_usf_up_l0_r: + The ratio of utils is boost up on screen on. The + default value is 0, which no util is adjusted on sugov + calculates utils to select cpufreq. Its range is [0 , 100]. + If its value falls into (0, 50], a quarter of extra utils, + which calculate cpufreq, shall be added. If its value + falls into (50, 100], the half of extra utils are added + to continue to calculate cpufreq. + It is expected to be set (0, 100] once enter into the + identificated scenario, such as browsing videolet on + screen on, and recover to 0 on out of the scenario, + such as screen off or videolet into background. + + sched_usf_down_r: + The ratio of utils is cut down on screen on. The + default value is 0, which no util is adjusted on sugov + calculating utils to select cpufreq. Its range is + [-100 , 0]. If its value falls into [-50, 0), the half + of utils, which calculate cpufreq, shall be cut down. + If its value falls into [-100, -50), only a quarter of + utils are left to continue to calculate cpufreq. + It is expected to be set [-100, 0) once enter into the + identificated scenario, such as browsing videolet on + screen on, and recover to 0 on out of the scenario, + such as screen off or vidolet into background. diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig index e917501..a21c6ad 100644 --- a/drivers/cpufreq/Kconfig +++ b/drivers/cpufreq/Kconfig @@ -224,6 +224,17 @@ config CPUFREQ_DT_PLATDEV If in doubt, say N. +config SCHED_USF + bool "User Sensitive Factors for Scheduler" + depends on CPU_FREQ_GOV_SCHEDUTIL && FB + help + Select this option to enable the adjustment on the cpufreq with + the user sensitive factors on schedule. It is special for mobile + devices which more power care and quick response requirement on + screen on. + + If unsure, say N. + if X86 source "drivers/cpufreq/Kconfig.x86" endif diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h index ed168b0..d5e20b7 100644 --- a/include/trace/events/sched.h +++ b/include/trace/events/sched.h @@ -488,6 +488,41 @@ TRACE_EVENT(sched_process_hang, #endif /* CONFIG_DETECT_HUNG_TASK */ /* + * Tracepoint for tracking tuils be adjusted by USF: + */ +#ifdef CONFIG_SCHED_USF +TRACE_EVENT(sched_usf_adjust_utils, + + TP_PROTO(int cpu_id, int up, int down, int nonux, unsigned long utils), + + TP_ARGS(cpu_id, up, down, nonux, utils), + + TP_STRUCT__entry( + __field(int, cpu_id) + __field(int, up) + __field(int, down) + __field(int, nonux) + __field(unsigned long, utils) + ), + + TP_fast_assign( + __entry->cpu_id = cpu_id; + __entry->up = up; + __entry->down = down; + __entry->nonux = nonux; + __entry->utils = utils; + ), + + TP_printk("cpu_id=%d up=%d down=%d nonux=%d utils=%lu", + __entry->cpu_id, + __entry->up, + __entry->down, + __entry->nonux, + __entry->utils) +); +#endif /* CONFIG_SCHED_USF */ + +/* * Tracks migration of tasks from one runqueue to another. Can be used to * detect if automatic NUMA balancing is bouncing between nodes. */ diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile index 5fc9c9b..58a0e7b 100644 --- a/kernel/sched/Makefile +++ b/kernel/sched/Makefile @@ -36,3 +36,4 @@ obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o obj-$(CONFIG_MEMBARRIER) += membarrier.o obj-$(CONFIG_CPU_ISOLATION) += isolation.o obj-$(CONFIG_PSI) += psi.o +obj-$(CONFIG_SCHED_USF) += usf.o diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 7fbaee2..79a0040 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -289,12 +289,15 @@ unsigned long schedutil_cpu_util(int cpu, unsigned long util_cfs, return min(max, util); } +DEFINE_STATIC_KEY_FALSE(adjust_task_pred_set); static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu) { struct rq *rq = cpu_rq(sg_cpu->cpu); unsigned long util = cpu_util_cfs(rq); unsigned long max = arch_scale_cpu_capacity(sg_cpu->cpu); + if (static_branch_unlikely(&adjust_task_pred_set)) + adjust_task_pred_demand(sg_cpu->cpu, &util, rq); sg_cpu->max = max; sg_cpu->bw_dl = cpu_bw_dl(rq); diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 877fb08..496130b 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2540,5 +2540,15 @@ static inline bool is_per_cpu_kthread(struct task_struct *p) } #endif +DECLARE_STATIC_KEY_FALSE(adjust_task_pred_set); +#ifdef CONFIG_SCHED_USF +void adjust_task_pred_demand(int cpuid, unsigned long *util, + struct rq *rq); +#else +static inline void adjust_task_pred_demand(int cpuid, + unsigned long *util, struct rq *rq) +{ } +#endif + void swake_up_all_locked(struct swait_queue_head *q); void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait); diff --git a/kernel/sched/usf.c b/kernel/sched/usf.c new file mode 100644 index 0000000..d4d7998 --- /dev/null +++ b/kernel/sched/usf.c @@ -0,0 +1,294 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2020 XiaoMi Inc. + * Author: Yang Dongdong <yangdongdong@xiaomi.com> + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See http://www.gnu.org/licenses/gpl-2.0.html for more details. + */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/platform_device.h> +#include <linux/kthread.h> +#include <linux/cpu.h> +#include <linux/sysfs.h> +#include <linux/kthread.h> +#include <linux/kobject.h> +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/kallsyms.h> +#include <linux/fb.h> +#include <linux/notifier.h> +#include <trace/events/sched.h> +#include "sched.h" + +#define BOOST_MIN_V -100 +#define BOOST_MAX_V 100 +#define LEVEL_TOP 3 + +#define USF_TAG "[usf_sched]" + +DEFINE_PER_CPU(unsigned long[PID_MAX_DEFAULT], task_hist_nivcsw); + +static struct { + bool is_sched_usf_enabled; + bool is_screen_on; + int sysctl_sched_usf_up_l0; + int sysctl_sched_usf_down; + int sysctl_sched_usf_non_ux; + int usf_up_l0; + int usf_down; + int usf_non_ux; +} usf_vdev; + +void adjust_task_pred_demand(int cpuid, + unsigned long *util, + struct rq *rq) +{ + /* sysctl_sched_latency/sysctl_sched_min_granularity */ + u32 bl_sw_num = 3; + + if (!usf_vdev.is_sched_usf_enabled || !rq || !rq->curr || + (rq->curr->pid >= PID_MAX_DEFAULT)) + return; + + if (usf_vdev.is_screen_on) { + if (rq->curr->nivcsw > + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] + + bl_sw_num + 1)) { + (*util) += (*util) >> usf_vdev.usf_up_l0; + } else if (rq->curr->nivcsw < + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] + + bl_sw_num - 1) && (rq->nr_running < bl_sw_num)) { + (*util) >>= usf_vdev.usf_down; + } + per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] = + rq->curr->nivcsw; + } else if (rq->curr->mm) { + (*util) >>= usf_vdev.usf_non_ux; + } + + trace_sched_usf_adjust_utils(cpuid, usf_vdev.usf_up_l0, + usf_vdev.usf_down, + usf_vdev.usf_non_ux, *util); +} + +static int usf_lcd_notifier(struct notifier_block *nb, + unsigned long val, void *data) +{ + struct fb_event *evdata = data; + unsigned int blank; + + if (!evdata) + return 0; + + if (val != FB_EVENT_BLANK) + return 0; + + if (evdata->data && val == FB_EVENT_BLANK) { + blank = *(int *)(evdata->data); + + switch (blank) { + case FB_BLANK_POWERDOWN: + usf_vdev.is_screen_on = false; + if (usf_vdev.sysctl_sched_usf_non_ux != 0) + static_branch_enable(&adjust_task_pred_set); + else + static_branch_disable(&adjust_task_pred_set); + + break; + + case FB_BLANK_UNBLANK: + usf_vdev.is_screen_on = true; + if (usf_vdev.sysctl_sched_usf_up_l0 != 0 || + usf_vdev.sysctl_sched_usf_down != 0) + static_branch_enable(&adjust_task_pred_set); + else + static_branch_disable(&adjust_task_pred_set); + break; + default: + break; + } + + usf_vdev.is_sched_usf_enabled = true; + pr_info("%s : usf_vdev.is_screen_on:%b\n", + __func__, usf_vdev.is_screen_on); + } + return NOTIFY_OK; +} + +static struct notifier_block usf_lcd_nb = { + .notifier_call = usf_lcd_notifier, + .priority = INT_MAX, +}; + +static ssize_t store_sched_usf_up_l0_r(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int val = 0; + int ret = 0; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (val == 0) { + usf_vdev.sysctl_sched_usf_up_l0 = val; + usf_vdev.usf_up_l0 = 0; + } else if ((val > 0) && (val <= BOOST_MAX_V)) { + usf_vdev.sysctl_sched_usf_up_l0 = val; + usf_vdev.usf_up_l0 = LEVEL_TOP - + DIV_ROUND_UP(val, BOOST_MAX_V / 2); + ret = count; + } else { + pr_err(USF_TAG "%d should fall into [%d %d]", + val, 0, BOOST_MAX_V); + ret = -EINVAL; + } + if ((usf_vdev.sysctl_sched_usf_up_l0 == 0) && + (usf_vdev.sysctl_sched_usf_down == 0)) + static_branch_disable(&adjust_task_pred_set); + else + static_branch_enable(&adjust_task_pred_set); + + return ret; +} + +static ssize_t store_sched_usf_down_r(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int val = 0; + int ret = 0; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if ((val >= BOOST_MIN_V) && (val <= 0)) { + usf_vdev.sysctl_sched_usf_down = val; + usf_vdev.usf_down = DIV_ROUND_UP(-val, -BOOST_MIN_V / 2); + ret = count; + } else { + pr_err(USF_TAG "%d should fall into [%d %d]", + val, BOOST_MIN_V, 0); + ret = -EINVAL; + } + if ((usf_vdev.sysctl_sched_usf_up_l0 == 0) && + (usf_vdev.sysctl_sched_usf_down == 0)) + static_branch_disable(&adjust_task_pred_set); + else + static_branch_enable(&adjust_task_pred_set); + + return ret; +} + +static ssize_t store_sched_usf_non_ux_r(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int val = 0; + int ret = 0; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if ((val >= BOOST_MIN_V) && (val <= 0)) { + usf_vdev.sysctl_sched_usf_non_ux = val; + usf_vdev.usf_non_ux = DIV_ROUND_UP(-val, -BOOST_MIN_V / 2); + ret = count; + } else { + pr_err(USF_TAG "%d should fall into [%d %d]", + val, BOOST_MIN_V, 0); + ret = -EINVAL; + } + if (usf_vdev.sysctl_sched_usf_non_ux == 0) + static_branch_disable(&adjust_task_pred_set); + else + static_branch_enable(&adjust_task_pred_set); + + return ret; +} + +#define usf_attr_rw(_name) \ +static struct kobj_attribute _name = \ +__ATTR(_name, 0664, show_##_name, store_##_name) + +#define usf_show_node(_name, _value) \ +static ssize_t show_##_name \ +(struct kobject *kobj, struct kobj_attribute *attr, char *buf) \ +{ \ + return sprintf(buf, "%d", usf_vdev.sysctl_##_value); \ +} + +usf_show_node(sched_usf_up_l0_r, sched_usf_up_l0); +usf_show_node(sched_usf_down_r, sched_usf_down); +usf_show_node(sched_usf_non_ux_r, sched_usf_non_ux); + +usf_attr_rw(sched_usf_up_l0_r); +usf_attr_rw(sched_usf_down_r); +usf_attr_rw(sched_usf_non_ux_r); + +static struct attribute *sched_attrs[] = { + &sched_usf_up_l0_r.attr, + &sched_usf_down_r.attr, + &sched_usf_non_ux_r.attr, + NULL, +}; + +static struct attribute_group sched_attr_group = { + .attrs = sched_attrs, +}; + +static int __init intera_monitor_init(void) +{ + int res = -1; + struct device *dev; + + res = fb_register_client(&usf_lcd_nb); + if (res < 0) { + pr_err("Failed to register usf_lcd_nb!\n"); + return res; + } + + /* + * create a sched_usf in cpu_subsys: + * /sys/devices/system/cpu/sched_usf/... + */ + dev = cpu_subsys.dev_root; + res = sysfs_create_group(&dev->kobj, &sched_attr_group); + if (res) { + fb_unregister_client(&usf_lcd_nb); + return res; + } + static_branch_disable(&adjust_task_pred_set); + + return res; +} + +module_init(intera_monitor_init); + +static void __exit intera_monitor_exit(void) +{ + struct device *dev; + + dev = cpu_subsys.dev_root; + sysfs_remove_group(&dev->kobj, &sched_attr_group); + fb_unregister_client(&usf_lcd_nb); + static_branch_disable(&adjust_task_pred_set); +} + +module_exit(intera_monitor_exit); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("XiaoMi USF SCHED"); +MODULE_AUTHOR("Yang Dongdong <yangdongdong@xiaomi.com>"); -- 2.7.4 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v4] sched: Provide USF for the portable equipment. 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang @ 2020-08-04 8:03 ` Greg KH 2020-08-04 9:05 ` peterz ` (2 subsequent siblings) 3 siblings, 0 replies; 8+ messages in thread From: Greg KH @ 2020-08-04 8:03 UTC (permalink / raw) To: Dongdong Yang Cc: rjw, viresh.kumar, mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, devel, linux-pm, linux-kernel, yangdongdong, yanziily, rocking On Tue, Aug 04, 2020 at 03:50:35PM +0800, Dongdong Yang wrote: Comments on code stuff only, not if this is actually a valid thing to be doing at all: > --- /dev/null > +++ b/kernel/sched/usf.c > @@ -0,0 +1,294 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2020 XiaoMi Inc. > + * Author: Yang Dongdong <yangdongdong@xiaomi.com> > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > + * See http://www.gnu.org/licenses/gpl-2.0.html for more details. No need for the two paragraph "boiler plate" license text now that you have a SPDX line, please remove them. > + */ > + > +#include <linux/module.h> > +#include <linux/init.h> > +#include <linux/platform_device.h> > +#include <linux/kthread.h> > +#include <linux/cpu.h> > +#include <linux/sysfs.h> > +#include <linux/kthread.h> > +#include <linux/kobject.h> > +#include <linux/module.h> > +#include <linux/kernel.h> > +#include <linux/init.h> > +#include <linux/kallsyms.h> > +#include <linux/fb.h> > +#include <linux/notifier.h> > +#include <trace/events/sched.h> > +#include "sched.h" > + > +#define BOOST_MIN_V -100 > +#define BOOST_MAX_V 100 > +#define LEVEL_TOP 3 > + > +#define USF_TAG "[usf_sched]" Please pr_fmt instead. > + > +DEFINE_PER_CPU(unsigned long[PID_MAX_DEFAULT], task_hist_nivcsw); > + > +static struct { > + bool is_sched_usf_enabled; > + bool is_screen_on; > + int sysctl_sched_usf_up_l0; > + int sysctl_sched_usf_down; > + int sysctl_sched_usf_non_ux; > + int usf_up_l0; > + int usf_down; > + int usf_non_ux; > +} usf_vdev; > + > +void adjust_task_pred_demand(int cpuid, > + unsigned long *util, > + struct rq *rq) > +{ > + /* sysctl_sched_latency/sysctl_sched_min_granularity */ > + u32 bl_sw_num = 3; > + > + if (!usf_vdev.is_sched_usf_enabled || !rq || !rq->curr || > + (rq->curr->pid >= PID_MAX_DEFAULT)) > + return; > + > + if (usf_vdev.is_screen_on) { > + if (rq->curr->nivcsw > > + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] > + + bl_sw_num + 1)) { > + (*util) += (*util) >> usf_vdev.usf_up_l0; > + } else if (rq->curr->nivcsw < > + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] > + + bl_sw_num - 1) && (rq->nr_running < bl_sw_num)) { > + (*util) >>= usf_vdev.usf_down; > + } > + per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] = > + rq->curr->nivcsw; > + } else if (rq->curr->mm) { > + (*util) >>= usf_vdev.usf_non_ux; > + } > + > + trace_sched_usf_adjust_utils(cpuid, usf_vdev.usf_up_l0, > + usf_vdev.usf_down, > + usf_vdev.usf_non_ux, *util); > +} > + > +static int usf_lcd_notifier(struct notifier_block *nb, > + unsigned long val, void *data) > +{ > + struct fb_event *evdata = data; > + unsigned int blank; > + > + if (!evdata) > + return 0; > + > + if (val != FB_EVENT_BLANK) > + return 0; > + > + if (evdata->data && val == FB_EVENT_BLANK) { > + blank = *(int *)(evdata->data); > + > + switch (blank) { > + case FB_BLANK_POWERDOWN: > + usf_vdev.is_screen_on = false; > + if (usf_vdev.sysctl_sched_usf_non_ux != 0) > + static_branch_enable(&adjust_task_pred_set); > + else > + static_branch_disable(&adjust_task_pred_set); > + > + break; > + > + case FB_BLANK_UNBLANK: > + usf_vdev.is_screen_on = true; > + if (usf_vdev.sysctl_sched_usf_up_l0 != 0 || > + usf_vdev.sysctl_sched_usf_down != 0) > + static_branch_enable(&adjust_task_pred_set); > + else > + static_branch_disable(&adjust_task_pred_set); > + break; > + default: > + break; > + } > + > + usf_vdev.is_sched_usf_enabled = true; > + pr_info("%s : usf_vdev.is_screen_on:%b\n", > + __func__, usf_vdev.is_screen_on); > + } > + return NOTIFY_OK; > +} > + > +static struct notifier_block usf_lcd_nb = { > + .notifier_call = usf_lcd_notifier, > + .priority = INT_MAX, > +}; > + > +static ssize_t store_sched_usf_up_l0_r(struct kobject *kobj, > + struct kobj_attribute *attr, > + const char *buf, size_t count) > +{ > + int val = 0; > + int ret = 0; > + > + ret = kstrtoint(buf, 0, &val); > + if (ret) > + return ret; > + > + if (val == 0) { > + usf_vdev.sysctl_sched_usf_up_l0 = val; > + usf_vdev.usf_up_l0 = 0; > + } else if ((val > 0) && (val <= BOOST_MAX_V)) { > + usf_vdev.sysctl_sched_usf_up_l0 = val; > + usf_vdev.usf_up_l0 = LEVEL_TOP - > + DIV_ROUND_UP(val, BOOST_MAX_V / 2); > + ret = count; > + } else { > + pr_err(USF_TAG "%d should fall into [%d %d]", > + val, 0, BOOST_MAX_V); > + ret = -EINVAL; > + } > + if ((usf_vdev.sysctl_sched_usf_up_l0 == 0) && > + (usf_vdev.sysctl_sched_usf_down == 0)) > + static_branch_disable(&adjust_task_pred_set); > + else > + static_branch_enable(&adjust_task_pred_set); > + > + return ret; > +} > + > +static ssize_t store_sched_usf_down_r(struct kobject *kobj, > + struct kobj_attribute *attr, > + const char *buf, size_t count) > +{ > + int val = 0; > + int ret = 0; > + > + ret = kstrtoint(buf, 0, &val); > + if (ret) > + return ret; > + > + if ((val >= BOOST_MIN_V) && (val <= 0)) { > + usf_vdev.sysctl_sched_usf_down = val; > + usf_vdev.usf_down = DIV_ROUND_UP(-val, -BOOST_MIN_V / 2); > + ret = count; > + } else { > + pr_err(USF_TAG "%d should fall into [%d %d]", > + val, BOOST_MIN_V, 0); > + ret = -EINVAL; > + } > + if ((usf_vdev.sysctl_sched_usf_up_l0 == 0) && > + (usf_vdev.sysctl_sched_usf_down == 0)) > + static_branch_disable(&adjust_task_pred_set); > + else > + static_branch_enable(&adjust_task_pred_set); > + > + return ret; > +} > + > +static ssize_t store_sched_usf_non_ux_r(struct kobject *kobj, > + struct kobj_attribute *attr, > + const char *buf, size_t count) > +{ > + int val = 0; > + int ret = 0; > + > + ret = kstrtoint(buf, 0, &val); > + if (ret) > + return ret; > + > + if ((val >= BOOST_MIN_V) && (val <= 0)) { > + usf_vdev.sysctl_sched_usf_non_ux = val; > + usf_vdev.usf_non_ux = DIV_ROUND_UP(-val, -BOOST_MIN_V / 2); > + ret = count; > + } else { > + pr_err(USF_TAG "%d should fall into [%d %d]", > + val, BOOST_MIN_V, 0); > + ret = -EINVAL; > + } > + if (usf_vdev.sysctl_sched_usf_non_ux == 0) > + static_branch_disable(&adjust_task_pred_set); > + else > + static_branch_enable(&adjust_task_pred_set); > + > + return ret; > +} > + > +#define usf_attr_rw(_name) \ > +static struct kobj_attribute _name = \ > +__ATTR(_name, 0664, show_##_name, store_##_name) __ATTR_RW()? > + > +#define usf_show_node(_name, _value) \ > +static ssize_t show_##_name \ > +(struct kobject *kobj, struct kobj_attribute *attr, char *buf) \ > +{ \ > + return sprintf(buf, "%d", usf_vdev.sysctl_##_value); \ > +} Again do NOT use raw kobjects. > + > +usf_show_node(sched_usf_up_l0_r, sched_usf_up_l0); > +usf_show_node(sched_usf_down_r, sched_usf_down); > +usf_show_node(sched_usf_non_ux_r, sched_usf_non_ux); > + > +usf_attr_rw(sched_usf_up_l0_r); > +usf_attr_rw(sched_usf_down_r); > +usf_attr_rw(sched_usf_non_ux_r); > + > +static struct attribute *sched_attrs[] = { > + &sched_usf_up_l0_r.attr, > + &sched_usf_down_r.attr, > + &sched_usf_non_ux_r.attr, > + NULL, > +}; > + > +static struct attribute_group sched_attr_group = { > + .attrs = sched_attrs, > +}; ATTRIBUTE_GROUPS()? > + > +static int __init intera_monitor_init(void) > +{ > + int res = -1; > + struct device *dev; > + > + res = fb_register_client(&usf_lcd_nb); > + if (res < 0) { > + pr_err("Failed to register usf_lcd_nb!\n"); > + return res; > + } > + > + /* > + * create a sched_usf in cpu_subsys: > + * /sys/devices/system/cpu/sched_usf/... > + */ > + dev = cpu_subsys.dev_root; > + res = sysfs_create_group(&dev->kobj, &sched_attr_group); Do not just tack on random sysfs files to a random struct device that you do not own. That's ripe for big problems. Ugh, that seems to be how others do it too, not nice. Ok, but at the very least, use DEVICE_ATTR_RW() and do not use kobjects, as you will get into problems there. How does userspace know that these new sysfs files have shown up? You never told it about them, so does it just "guess"? thanks, greg k-h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4] sched: Provide USF for the portable equipment. 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang 2020-08-04 8:03 ` Greg KH @ 2020-08-04 9:05 ` peterz 2020-08-04 10:43 ` Qais Yousef 2020-08-04 15:41 ` Randy Dunlap 3 siblings, 0 replies; 8+ messages in thread From: peterz @ 2020-08-04 9:05 UTC (permalink / raw) To: Dongdong Yang Cc: gregkh, rjw, viresh.kumar, mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking On Tue, Aug 04, 2020 at 03:50:35PM +0800, Dongdong Yang wrote: > +What: /sys/devices/system/cpu/sched_usf > + /sys/devices/system/cpu/sched_usf/sched_usf_non_ux_r > + /sys/devices/system/cpu/sched_usf/sched_usf_up_l0_r > + /sys/devices/system/cpu/sched_usf/sched_usf_down_r > +Date: Aug 2020 > +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> > +Description: User Sensitive Feedback factor auxiliary scheduling which > + is providing more utils adjustment settings to the high level > + by scenario identification. > + sched_usf_non_ux_r: > + The ratio of utils is cut down on screen off. The > + default value is 0, which no util is adjusted on sugov > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculates cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as listen to music on > + screen off, and recover to 0 on out of the scenario, > + such as screen on. > + > + sched_usf_up_l0_r: > + The ratio of utils is boost up on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculates utils to select cpufreq. Its range is [0 , 100]. > + If its value falls into (0, 50], a quarter of extra utils, > + which calculate cpufreq, shall be added. If its value > + falls into (50, 100], the half of extra utils are added > + to continue to calculate cpufreq. > + It is expected to be set (0, 100] once enter into the > + identificated scenario, such as browsing videolet on > + screen on, and recover to 0 on out of the scenario, > + such as screen off or videolet into background. > + > + sched_usf_down_r: > + The ratio of utils is cut down on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculate cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as browsing videolet on > + screen on, and recover to 0 on out of the scenario, > + such as screen off or vidolet into background. I hate the names, they're a bunch of random letters. Maybe if you strip the 'sched_usf' prefix, on account on these files being in a directory by the same name, you'll have additional budget for readable names? Also, I detest 100, 100 sucks. Use something sane, like 128. Also, I'm not at all sure I understand the text. Also, nothing explains how any of this is supposed to be working on an SMP system. Please advice how this makes sense if you have 64 CPUs. My desktop has a screen, so giving feedback should work, no? > diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig > index e917501..a21c6ad 100644 > --- a/drivers/cpufreq/Kconfig > +++ b/drivers/cpufreq/Kconfig > @@ -224,6 +224,17 @@ config CPUFREQ_DT_PLATDEV > > If in doubt, say N. > > +config SCHED_USF > + bool "User Sensitive Factors for Scheduler" 'bool' means this cannot be a module, yet see below. > + depends on CPU_FREQ_GOV_SCHEDUTIL && FB I really hate how this is a special purpose hack for framebuffer. We don't do special purpose hacks like that. So you have something that only 'works' for small machines, small number of tasks, and requires FB. NAK, we don't do that upstream. > + help > + Select this option to enable the adjustment on the cpufreq with > + the user sensitive factors on schedule. It is special for mobile > + devices which more power care and quick response requirement on > + screen on. > + > + If unsure, say N. > + > if X86 > source "drivers/cpufreq/Kconfig.x86" > endif > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h > index ed168b0..d5e20b7 100644 > --- a/include/trace/events/sched.h > +++ b/include/trace/events/sched.h > @@ -488,6 +488,41 @@ TRACE_EVENT(sched_process_hang, > #endif /* CONFIG_DETECT_HUNG_TASK */ > > /* > + * Tracepoint for tracking tuils be adjusted by USF: I can't find 'tuils' in my local dictionary, did you mean tools? And what does the comment try to day? Are USF users supposed to use this tracepoint to close a feedback loop or what? > + */ > +#ifdef CONFIG_SCHED_USF > +TRACE_EVENT(sched_usf_adjust_utils, > + > + TP_PROTO(int cpu_id, int up, int down, int nonux, unsigned long utils), You're using tracepoints wrong, pass in that usf_dev pointer and have the assign do all the work. > + > + TP_ARGS(cpu_id, up, down, nonux, utils), > + > + TP_STRUCT__entry( > + __field(int, cpu_id) > + __field(int, up) > + __field(int, down) > + __field(int, nonux) > + __field(unsigned long, utils) > + ), > + > + TP_fast_assign( > + __entry->cpu_id = cpu_id; > + __entry->up = up; > + __entry->down = down; > + __entry->nonux = nonux; > + __entry->utils = utils; > + ), > + > + TP_printk("cpu_id=%d up=%d down=%d nonux=%d utils=%lu", > + __entry->cpu_id, > + __entry->up, > + __entry->down, > + __entry->nonux, > + __entry->utils) > +); > +#endif /* CONFIG_SCHED_USF */ > + > +/* > * Tracks migration of tasks from one runqueue to another. Can be used to > * detect if automatic NUMA balancing is bouncing between nodes. > */ > diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile > index 5fc9c9b..58a0e7b 100644 > --- a/kernel/sched/Makefile > +++ b/kernel/sched/Makefile > @@ -36,3 +36,4 @@ obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o > obj-$(CONFIG_MEMBARRIER) += membarrier.o > obj-$(CONFIG_CPU_ISOLATION) += isolation.o > obj-$(CONFIG_PSI) += psi.o > +obj-$(CONFIG_SCHED_USF) += usf.o > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c > index 7fbaee2..79a0040 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -289,12 +289,15 @@ unsigned long schedutil_cpu_util(int cpu, unsigned long util_cfs, > return min(max, util); > } > > +DEFINE_STATIC_KEY_FALSE(adjust_task_pred_set); Missing whitespace, bad name. Also wrong file, all the actual control is in that usf nonsense. > static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu) > { > struct rq *rq = cpu_rq(sg_cpu->cpu); > unsigned long util = cpu_util_cfs(rq); > unsigned long max = arch_scale_cpu_capacity(sg_cpu->cpu); > > + if (static_branch_unlikely(&adjust_task_pred_set)) > + adjust_task_pred_demand(sg_cpu->cpu, &util, rq); Missing whitespace, horrible naming, there is nothing task here, Please explain how it makes sense to have that static branch when !SCHED_USF. > sg_cpu->max = max; > sg_cpu->bw_dl = cpu_bw_dl(rq); > > diff --git a/kernel/sched/usf.c b/kernel/sched/usf.c > new file mode 100644 > index 0000000..d4d7998 > --- /dev/null > +++ b/kernel/sched/usf.c > @@ -0,0 +1,294 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2020 XiaoMi Inc. > + * Author: Yang Dongdong <yangdongdong@xiaomi.com> > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > + * See http://www.gnu.org/licenses/gpl-2.0.html for more details. > + */ > + > +#include <linux/module.h> > +#include <linux/init.h> > +#include <linux/platform_device.h> > +#include <linux/kthread.h> > +#include <linux/cpu.h> > +#include <linux/sysfs.h> > +#include <linux/kthread.h> > +#include <linux/kobject.h> > +#include <linux/module.h> > +#include <linux/kernel.h> > +#include <linux/init.h> > +#include <linux/kallsyms.h> > +#include <linux/fb.h> > +#include <linux/notifier.h> > +#include <trace/events/sched.h> > +#include "sched.h" > + > +#define BOOST_MIN_V -100 > +#define BOOST_MAX_V 100 > +#define LEVEL_TOP 3 > + > +#define USF_TAG "[usf_sched]" > + > +DEFINE_PER_CPU(unsigned long[PID_MAX_DEFAULT], task_hist_nivcsw); Hahahaha, I think not. > + > +static struct { > + bool is_sched_usf_enabled; > + bool is_screen_on; > + int sysctl_sched_usf_up_l0; > + int sysctl_sched_usf_down; > + int sysctl_sched_usf_non_ux; They're not sysctl's. Also letter soup. > + int usf_up_l0; > + int usf_down; > + int usf_non_ux; > +} usf_vdev; > + > +void adjust_task_pred_demand(int cpuid, > + unsigned long *util, > + struct rq *rq) Still a horrible name. > +{ > + /* sysctl_sched_latency/sysctl_sched_min_granularity */ What does this comment want to tell us? > + u32 bl_sw_num = 3; > + > + if (!usf_vdev.is_sched_usf_enabled || !rq || !rq->curr || > + (rq->curr->pid >= PID_MAX_DEFAULT)) Everything after is_sched_usf_enable is nonsense. And even that is dodgy since you had that static branch. You shouldn't be getting here if you're not enabled. > + return; > + > + if (usf_vdev.is_screen_on) { > + if (rq->curr->nivcsw > > + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] > + + bl_sw_num + 1)) { That's horrible style, if you'd made a helper function with a descriptive name, you'd have readable code. > + (*util) += (*util) >> usf_vdev.usf_up_l0; So when 'current' has more non-voluntary context switches than it last had on this cpu, we add something to the util. That's dodgy as heck. Esp. without comments. This util is not for current. The best you can say is that current contributes to it. > + } else if (rq->curr->nivcsw < > + (per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] > + + bl_sw_num - 1) && (rq->nr_running < bl_sw_num)) { Another unreadable expression with style issues, and how we're reducing util. > + (*util) >>= usf_vdev.usf_down; > + } > + per_cpu(task_hist_nivcsw, cpuid)[rq->curr->pid] = > + rq->curr->nivcsw; That's hardly a histogram is it. Or did you mean history? > + } else if (rq->curr->mm) { > + (*util) >>= usf_vdev.usf_non_ux; > + } > + > + trace_sched_usf_adjust_utils(cpuid, usf_vdev.usf_up_l0, > + usf_vdev.usf_down, > + usf_vdev.usf_non_ux, *util); > +} > + > +static int usf_lcd_notifier(struct notifier_block *nb, > + unsigned long val, void *data) > +{ > + struct fb_event *evdata = data; > + unsigned int blank; > + > + if (!evdata) > + return 0; > + > + if (val != FB_EVENT_BLANK) > + return 0; > + > + if (evdata->data && val == FB_EVENT_BLANK) { > + blank = *(int *)(evdata->data); > + > + switch (blank) { > + case FB_BLANK_POWERDOWN: > + usf_vdev.is_screen_on = false; > + if (usf_vdev.sysctl_sched_usf_non_ux != 0) > + static_branch_enable(&adjust_task_pred_set); > + else > + static_branch_disable(&adjust_task_pred_set); > + > + break; > + > + case FB_BLANK_UNBLANK: > + usf_vdev.is_screen_on = true; > + if (usf_vdev.sysctl_sched_usf_up_l0 != 0 || > + usf_vdev.sysctl_sched_usf_down != 0) > + static_branch_enable(&adjust_task_pred_set); > + else > + static_branch_disable(&adjust_task_pred_set); > + break; > + default: > + break; > + } > + > + usf_vdev.is_sched_usf_enabled = true; > + pr_info("%s : usf_vdev.is_screen_on:%b\n", > + __func__, usf_vdev.is_screen_on); > + } > + return NOTIFY_OK; > +} *groan*... this is horrific. Flipping the static branch is _very_ expensive. > + > +static int __init intera_monitor_init(void) > +{ > + int res = -1; > + struct device *dev; > + > + res = fb_register_client(&usf_lcd_nb); > + if (res < 0) { > + pr_err("Failed to register usf_lcd_nb!\n"); > + return res; > + } > + > + /* > + * create a sched_usf in cpu_subsys: > + * /sys/devices/system/cpu/sched_usf/... > + */ > + dev = cpu_subsys.dev_root; > + res = sysfs_create_group(&dev->kobj, &sched_attr_group); > + if (res) { > + fb_unregister_client(&usf_lcd_nb); > + return res; > + } > + static_branch_disable(&adjust_task_pred_set); You used DEFINE_STATIC_BRANCH_FALSE, it is already disabled. > + > + return res; > +} > + > +module_init(intera_monitor_init); > + > +static void __exit intera_monitor_exit(void) > +{ > + struct device *dev; > + > + dev = cpu_subsys.dev_root; > + sysfs_remove_group(&dev->kobj, &sched_attr_group); > + fb_unregister_client(&usf_lcd_nb); > + static_branch_disable(&adjust_task_pred_set); > +} > + > +module_exit(intera_monitor_exit); > + > +MODULE_LICENSE("GPL"); > +MODULE_DESCRIPTION("XiaoMi USF SCHED"); > +MODULE_AUTHOR("Yang Dongdong <yangdongdong@xiaomi.com>"); -ENOTAMODULE ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4] sched: Provide USF for the portable equipment. 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang 2020-08-04 8:03 ` Greg KH 2020-08-04 9:05 ` peterz @ 2020-08-04 10:43 ` Qais Yousef [not found] ` <CADhdXfpxWqYEL_cWVtEAxg-3OWgVZuZ5sLzNm6G+k484kG3HRw@mail.gmail.com> 2020-08-04 15:41 ` Randy Dunlap 3 siblings, 1 reply; 8+ messages in thread From: Qais Yousef @ 2020-08-04 10:43 UTC (permalink / raw) To: Dongdong Yang Cc: gregkh, rjw, viresh.kumar, mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking Hi Dongdong On 08/04/20 15:50, Dongdong Yang wrote: > +What: /sys/devices/system/cpu/sched_usf > + /sys/devices/system/cpu/sched_usf/sched_usf_non_ux_r > + /sys/devices/system/cpu/sched_usf/sched_usf_up_l0_r > + /sys/devices/system/cpu/sched_usf/sched_usf_down_r > +Date: Aug 2020 > +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> > +Description: User Sensitive Feedback factor auxiliary scheduling which > + is providing more utils adjustment settings to the high level > + by scenario identification. > + sched_usf_non_ux_r: > + The ratio of utils is cut down on screen off. The > + default value is 0, which no util is adjusted on sugov > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculates cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as listen to music on > + screen off, and recover to 0 on out of the scenario, > + such as screen on. > + > + sched_usf_up_l0_r: > + The ratio of utils is boost up on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculates utils to select cpufreq. Its range is [0 , 100]. > + If its value falls into (0, 50], a quarter of extra utils, > + which calculate cpufreq, shall be added. If its value > + falls into (50, 100], the half of extra utils are added > + to continue to calculate cpufreq. > + It is expected to be set (0, 100] once enter into the > + identificated scenario, such as browsing videolet on > + screen on, and recover to 0 on out of the scenario, > + such as screen off or videolet into background. > + > + sched_usf_down_r: > + The ratio of utils is cut down on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculate cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as browsing videolet on > + screen on, and recover to 0 on out of the scenario, > + such as screen off or vidolet into background. AFACS you're duplicating util clamp functionality here. You can already use util clamp to boost tasks on screen on, and cap them on screen off. And extra brownie points; you can already use that on android 4.19 and 5.4 kernels (I'm assuming the battery device is android based, sorry). Any reason why util clamp isn't giving you what you want? To cap the system on screen off you need to # Don't allow the util to go above 512 echo 512 > /proc/sys/kernel/sched_util_clamp_min echo 512 > /proc/sys/kernel/sched_util_clamp_max To boost the system on screen on, you need first to lift the capping done above # Allow util to use the full range again echo 1024 > /proc/sys/kernel/sched_util_clamp_min echo 1024 > /proc/sys/kernel/sched_util_clamp_max # This is pseudo C code for_each_important_task(p) { /* * boost the task utilization to start from 512. */ sched_attr attr = { .util_min = 512, .util_max = 1024 }; sched_setattr(p, attr); } /* undo boosting once system has settled down */ for_each_important_task(p) { /* * reset util_min back to 0, or whatever value you want. */ sched_attr attr = { .util_min = 0, .util_max = 1024 }; sched_setattr(p, attr); } There's a cgroup API for util clamp too. Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <CADhdXfpxWqYEL_cWVtEAxg-3OWgVZuZ5sLzNm6G+k484kG3HRw@mail.gmail.com>]
* Re: [PATCH v4] sched: Provide USF for the portable equipment. [not found] ` <CADhdXfpxWqYEL_cWVtEAxg-3OWgVZuZ5sLzNm6G+k484kG3HRw@mail.gmail.com> @ 2020-08-05 9:53 ` Qais Yousef [not found] ` <CADhdXfqJhE6F9q2dhnRhZGQvgJ0GSWXG1AmgL9i+rYqAwxvZDw@mail.gmail.com> 0 siblings, 1 reply; 8+ messages in thread From: Qais Yousef @ 2020-08-05 9:53 UTC (permalink / raw) To: Dongdong Yang Cc: Greg KH, rjw, Viresh Kumar, mingo, peterz, juri.lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Benjamin Segall, mgorman, linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking On 08/05/20 03:33, Dongdong Yang wrote: > Appreciate Qais for your above comments. I believe the clamp is very good for > terminal devices per pid or cgroup setting. I really hope it works for the > extended scenario, "screen off", although it has a potential side effect on > "screen on" response because it needs to be recovered at high level with > latency. I set "512" to sched_util_clamp_min and max on screen off for our > developing device with android kernel5.4. However, it still could not > replace sched_usf_non_ux_r from the test result as attachment. The cpufreq > could not go down in time. > Screenshot at 2020-08-05 09:56:38.png Please fix your email client so that it doesn't send in HTML. LKML will reject HTML emails. I can't interpret the numbers in the pictures. Can you help explain what am I looking at? I did see an issue with frequency not capped immediately when the system was busy. I am still trying to debug that. I already fixed one problem related to iowait boost not honouring uclamp requests, I will be posting a patch for this soon. If you have IO heavy workload, then iowait boost will cause schedutil to run at high frequency, and uclamp capping is not applied in that path. Can you trace what happens inside uclamp_rq_util_with() when it's called from sched_cpu_util()? The clamp should be applied quickly, so it's a bug we need to fix. In my case I noticed if I ctrl+Z then `fg`, the cap is applied. My hands are full to look at this soon. So if you can trace it, that'd be great. Can you expand more on your worry for "screen on"? The only latency I see is userspace not being able to set uclamp values quickly. But since it seems you already can set sched_usf_non_ux_r from userspace with acceptable results, then uclamp should be able to cover the same functionality. What am I missing? Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <CADhdXfqJhE6F9q2dhnRhZGQvgJ0GSWXG1AmgL9i+rYqAwxvZDw@mail.gmail.com>]
* Re: [PATCH v4] sched: Provide USF for the portable equipment. [not found] ` <CADhdXfqJhE6F9q2dhnRhZGQvgJ0GSWXG1AmgL9i+rYqAwxvZDw@mail.gmail.com> @ 2020-08-05 12:41 ` Qais Yousef 0 siblings, 0 replies; 8+ messages in thread From: Qais Yousef @ 2020-08-05 12:41 UTC (permalink / raw) To: Dongdong Yang Cc: Greg KH, rjw, Viresh Kumar, mingo, peterz, juri.lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Benjamin Segall, mgorman, linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking On 08/05/20 19:13, Dongdong Yang wrote: > Appreciate Qais for your clamp implementation. I would like to add traces > for uclamp_rq_util_with and feedback you if I run into any issues. Thanks. FYI, top posting in LKML is frowned upon. Please put your answer underneath the quoted text. > > The util would not be adjusted as soon as FB screen on notification be > received by USF from kernel level if it is set by sched_usf_non_ux, no > matter whether screen on or off. However, sched_util_clamp_min/max have not > been recovered until user space screen on detection. The screen on response > would not be in time for the sensitive user when many background tasks are > running. Whether the kernel module could also > set sched_util_clamp_min/max? For boosting, are you just changing the sysctl or are you actively using sched_setattr() to boost tasks too? Please have a look at the documentation for the sysctl interface. https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/tree/Documentation/admin-guide/sysctl/kernel.rst?h=sched/core#n1065 In summary, they just control the _allowed_ levels. So you can use it to cap/throttle the maximum performance level the system is running at. But you can't use it to boost the whole system. You must use the sched_setattr() to boost important tasks individually or if all the tasks are in a cgroup you can use that. For cgroup interface there's a caveat. If you want to use it let me know so I can explain how boosting would work there. I advise to use the sched_setattr() interface to target and boost those important tasks only. You can as well be smart and target all the background tasks to cap them via sched_setattr(). In this case you wouldn't have to modify the sysctl_sched_util_clamp_min/max. I don't see uclamp being a suitable interface for in-kernel users. PM_QOS is more suitable in my opinion for in-kernel users if you want to impact the overall system performance. I might have misunderstood what you were saying above. If so, can you please rephrase? Thanks -- Qais Yousef ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4] sched: Provide USF for the portable equipment. 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang ` (2 preceding siblings ...) 2020-08-04 10:43 ` Qais Yousef @ 2020-08-04 15:41 ` Randy Dunlap 3 siblings, 0 replies; 8+ messages in thread From: Randy Dunlap @ 2020-08-04 15:41 UTC (permalink / raw) To: Dongdong Yang, gregkh, rjw, viresh.kumar, mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman Cc: linux-kernel, devel, linux-pm, yangdongdong, yanziily, rocking On 8/4/20 12:50 AM, Dongdong Yang wrote: > From: Dongdong Yang <yangdongdong@xiaomi.com> > > --- > > diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu > index b555df8..e299418 100644 > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu > @@ -614,3 +614,51 @@ Description: SPURR ticks for cpuX when it was idle. > > This sysfs interface exposes the number of SPURR ticks > for cpuX when it was idle. > + > +What: /sys/devices/system/cpu/sched_usf > + /sys/devices/system/cpu/sched_usf/sched_usf_non_ux_r > + /sys/devices/system/cpu/sched_usf/sched_usf_up_l0_r > + /sys/devices/system/cpu/sched_usf/sched_usf_down_r > +Date: Aug 2020 > +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> > +Description: User Sensitive Feedback factor auxiliary scheduling which > + is providing more utils adjustment settings to the high level > + by scenario identification. what is "utils"? > + sched_usf_non_ux_r: > + The ratio of utils is cut down on screen off. The same question. > + default value is 0, which no util is adjusted on sugov what is "sugov"? > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculates cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as listen to music on ^^^^^^^^^^^^^ not a word. > + screen off, and recover to 0 on out of the scenario, > + such as screen on. > + > + sched_usf_up_l0_r: > + The ratio of utils is boost up on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculates utils to select cpufreq. Its range is [0 , 100]. > + If its value falls into (0, 50], a quarter of extra utils, > + which calculate cpufreq, shall be added. If its value > + falls into (50, 100], the half of extra utils are added > + to continue to calculate cpufreq. > + It is expected to be set (0, 100] once enter into the > + identificated scenario, such as browsing videolet on what is "videolet"? > + screen on, and recover to 0 on out of the scenario, > + such as screen off or videolet into background. > + > + sched_usf_down_r: > + The ratio of utils is cut down on screen on. The > + default value is 0, which no util is adjusted on sugov > + calculating utils to select cpufreq. Its range is > + [-100 , 0]. If its value falls into [-50, 0), the half > + of utils, which calculate cpufreq, shall be cut down. > + If its value falls into [-100, -50), only a quarter of > + utils are left to continue to calculate cpufreq. > + It is expected to be set [-100, 0) once enter into the > + identificated scenario, such as browsing videolet on > + screen on, and recover to 0 on out of the scenario, > + such as screen off or vidolet into background. -- ~Randy ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-08-05 17:35 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-04 7:50 [PATCH v4] Provide USF for the portable equipment Dongdong Yang 2020-08-04 7:50 ` [PATCH v4] sched: " Dongdong Yang 2020-08-04 8:03 ` Greg KH 2020-08-04 9:05 ` peterz 2020-08-04 10:43 ` Qais Yousef [not found] ` <CADhdXfpxWqYEL_cWVtEAxg-3OWgVZuZ5sLzNm6G+k484kG3HRw@mail.gmail.com> 2020-08-05 9:53 ` Qais Yousef [not found] ` <CADhdXfqJhE6F9q2dhnRhZGQvgJ0GSWXG1AmgL9i+rYqAwxvZDw@mail.gmail.com> 2020-08-05 12:41 ` Qais Yousef 2020-08-04 15:41 ` Randy Dunlap
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).