From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
linux-api@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Paul Turner <pjt@google.com>,
Quentin Perret <quentin.perret@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>,
Joel Fernandes <joelaf@google.com>,
Steve Muckle <smuckle@google.com>,
Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v6 07/16] sched/core: uclamp: Add system default clamps
Date: Tue, 15 Jan 2019 10:15:04 +0000
Message-ID: <20190115101513.2822-8-patrick.bellasi@arm.com>
In-Reply-To: <20190115101513.2822-1-patrick.bellasi@arm.com>
Tasks without a user-defined clamp value are considered not clamped
and by default their utilization can have any value in the
[0..SCHED_CAPACITY_SCALE] range.
Tasks with a user-defined clamp value are allowed to request any value
in that range, and we unconditionally enforce the required clamps.
However, a "System Management Software" could be interested in limiting
the range of clamp values allowed for all tasks.
Add a privileged interface to define a system default configuration via:
/proc/sys/kernel/sched_uclamp_util_{min,max}
which works as an unconditional clamp range restriction for all tasks.
If a task-specific value is not compliant with the system default range,
it will be forced to the corresponding system default value.
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
The current restriction could be too aggressive since, for example, if a
task has a util_min which is higher than the system default max, it
will be forced to the system default min unconditionally.
Let's say we have:
Task Clamp: min=30, max=40
System Clamps: min=10, max=20
In principle we should set the task's min=20, since the system allows
boosts up to 20%. In the current implementation, however, since the
task's min exceeds the system max, we just go for task min=10.
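For illustration only (not part of this patch), the current rule can be
modelled in plain userspace C. The helper name effective_clamp() and the
array layout are made up for this sketch; it just mirrors the check done
in uclamp_effective_get(), using the numbers of the example above rather
than the [0..SCHED_CAPACITY_SCALE] scale:

```c
/* Same clamp indexes as used by the patch */
enum { UCLAMP_MIN, UCLAMP_MAX, UCLAMP_CNT };

/*
 * Hypothetical userspace model of the current restriction: a task value
 * outside the [system min, system max] range is replaced by the system
 * default of the *same* clamp_id, whichever bound was violated.
 */
static unsigned int effective_clamp(const unsigned int task[UCLAMP_CNT],
				    const unsigned int sys[UCLAMP_CNT],
				    int clamp_id)
{
	unsigned int value = task[clamp_id];

	/* Keep it simple: unconditionally enforce system defaults */
	if (value < sys[UCLAMP_MIN] || value > sys[UCLAMP_MAX])
		value = sys[clamp_id];

	return value;
}
```

With Task Clamp min=30, max=40 and System Clamps min=10, max=20, this
model returns 10 for UCLAMP_MIN and 20 for UCLAMP_MAX.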
It would probably be better to restrict util_min to the system default
max value, but that would make the code more complex since it would
require tracking a cross clamp_id dependency.
Let's keep this as a possible future extension, should we really see
the need for it.
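A rough sketch of what such an extension could look like, again as
standalone userspace C and purely hypothetical (restrict_clamp() does
not exist in this series): each task value is moved to the nearest
system bound, which is exactly why both system defaults are needed for
every clamp_id:

```c
/*
 * Hypothetical alternative: restrict a task clamp value to the
 * [sys_min, sys_max] range by saturating at the violated bound,
 * instead of falling back to the default of the same clamp_id.
 */
static unsigned int restrict_clamp(unsigned int value,
				   unsigned int sys_min,
				   unsigned int sys_max)
{
	if (value < sys_min)
		return sys_min;
	if (value > sys_max)
		return sys_max;

	return value;
}
```

With System Clamps min=10, max=20, a task min=30 would become 20 rather
than 10, and a task max=40 would become 20 as well.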
Changes in v6:
Others:
- wholesale s/group/bucket/
- make use of the bits_per() macro
---
include/linux/sched.h | 5 ++
include/linux/sched/sysctl.h | 11 +++
kernel/sched/core.c | 137 ++++++++++++++++++++++++++++++++++-
kernel/sysctl.c | 16 ++++
4 files changed, 166 insertions(+), 3 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 84294925d006..c8f391d1cdc5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -625,6 +625,11 @@ struct uclamp_se {
unsigned int bucket_id : bits_per(UCLAMP_BUCKETS);
unsigned int mapped : 1;
unsigned int active : 1;
+ /* Clamp bucket and value actually used by a RUNNABLE task */
+ struct {
+ unsigned int value : bits_per(SCHED_CAPACITY_SCALE);
+ unsigned int bucket_id : bits_per(UCLAMP_BUCKETS);
+ } effective;
};
#endif /* CONFIG_UCLAMP_TASK */
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index a9c32daeb9d8..445fb54eaeff 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -56,6 +56,11 @@ int sched_proc_update_handler(struct ctl_table *table, int write,
extern unsigned int sysctl_sched_rt_period;
extern int sysctl_sched_rt_runtime;
+#ifdef CONFIG_UCLAMP_TASK
+extern unsigned int sysctl_sched_uclamp_util_min;
+extern unsigned int sysctl_sched_uclamp_util_max;
+#endif
+
#ifdef CONFIG_CFS_BANDWIDTH
extern unsigned int sysctl_sched_cfs_bandwidth_slice;
#endif
@@ -75,6 +80,12 @@ extern int sched_rt_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);
+#ifdef CONFIG_UCLAMP_TASK
+extern int sched_uclamp_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos);
+#endif
+
extern int sysctl_numa_balancing(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7ac516a70be..d1ea5825501a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -731,6 +731,23 @@ static void set_load_weight(struct task_struct *p, bool update_load)
static DEFINE_MUTEX(uclamp_mutex);
/*
+ * Minimum utilization for FAIR tasks
+ * default: 0
+ */
+unsigned int sysctl_sched_uclamp_util_min;
+
+/*
+ * Maximum utilization for FAIR tasks
+ * default: 1024
+ */
+unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
+
+/*
+ * Task-specific clamp values are required to be within this range
+ */
+static struct uclamp_se uclamp_default[UCLAMP_CNT];
+
+/**
* Reference count utilization clamp buckets
* @value: the utilization "clamp value" tracked by this clamp bucket
* @se_count: the number of scheduling entities using this "clamp value"
@@ -827,6 +844,72 @@ static inline void uclamp_cpu_update(struct rq *rq, unsigned int clamp_id,
WRITE_ONCE(rq->uclamp[clamp_id].value, max_value);
}
+/*
+ * The effective clamp bucket index of a task depends on, by increasing
+ * priority:
+ * - the task specific clamp value, explicitly requested from userspace
+ * - the system default clamp value, defined by the sysadmin
+ *
+ * As a side effect, update the task's effective value:
+ * task_struct::uclamp::effective::value
+ * to represent the clamp value of the task effective bucket index.
+ */
+static inline void
+uclamp_effective_get(struct task_struct *p, unsigned int clamp_id,
+ unsigned int *clamp_value, unsigned int *bucket_id)
+{
+ /* Task specific clamp value */
+ *clamp_value = p->uclamp[clamp_id].value;
+ *bucket_id = p->uclamp[clamp_id].bucket_id;
+
+ /* System default restriction */
+ if (unlikely(*clamp_value < uclamp_default[UCLAMP_MIN].value ||
+ *clamp_value > uclamp_default[UCLAMP_MAX].value)) {
+ /* Keep it simple: unconditionally enforce system defaults */
+ *clamp_value = uclamp_default[clamp_id].value;
+ *bucket_id = uclamp_default[clamp_id].bucket_id;
+ }
+}
+
+static inline void
+uclamp_effective_assign(struct task_struct *p, unsigned int clamp_id)
+{
+ unsigned int clamp_value, bucket_id;
+
+ uclamp_effective_get(p, clamp_id, &clamp_value, &bucket_id);
+
+ p->uclamp[clamp_id].effective.value = clamp_value;
+ p->uclamp[clamp_id].effective.bucket_id = bucket_id;
+}
+
+static inline unsigned int uclamp_effective_bucket_id(struct task_struct *p,
+ unsigned int clamp_id)
+{
+ unsigned int clamp_value, bucket_id;
+
+ /* Task currently refcounted: use the back-annotated effective value */
+ if (p->uclamp[clamp_id].active)
+ return p->uclamp[clamp_id].effective.bucket_id;
+
+ uclamp_effective_get(p, clamp_id, &clamp_value, &bucket_id);
+
+ return bucket_id;
+}
+
+static unsigned int uclamp_effective_value(struct task_struct *p,
+ unsigned int clamp_id)
+{
+ unsigned int clamp_value, bucket_id;
+
+ /* Task currently refcounted: use the back-annotated effective value */
+ if (p->uclamp[clamp_id].active)
+ return p->uclamp[clamp_id].effective.value;
+
+ uclamp_effective_get(p, clamp_id, &clamp_value, &bucket_id);
+
+ return clamp_value;
+}
+
/*
* When a task is enqueued on a CPU's rq, the clamp bucket currently defined by
* the task's uclamp::bucket_id is reference counted on that CPU. This also
@@ -843,14 +926,15 @@ static inline void uclamp_cpu_inc_id(struct task_struct *p, struct rq *rq,
if (unlikely(!p->uclamp[clamp_id].mapped))
return;
+ uclamp_effective_assign(p, clamp_id);
- bucket_id = p->uclamp[clamp_id].bucket_id;
+ bucket_id = uclamp_effective_bucket_id(p, clamp_id);
p->uclamp[clamp_id].active = true;
rq->uclamp[clamp_id].bucket[bucket_id].tasks++;
/* Reset clamp holds on idle exit */
- tsk_clamp = p->uclamp[clamp_id].value;
+ tsk_clamp = uclamp_effective_value(p, clamp_id);
uclamp_idle_reset(rq, clamp_id, tsk_clamp);
/* CPU's clamp buckets track the max effective clamp value */
@@ -880,7 +964,7 @@ static inline void uclamp_cpu_dec_id(struct task_struct *p, struct rq *rq,
if (unlikely(!p->uclamp[clamp_id].mapped))
return;
- bucket_id = p->uclamp[clamp_id].bucket_id;
+ bucket_id = uclamp_effective_bucket_id(p, clamp_id);
p->uclamp[clamp_id].active = false;
SCHED_WARN_ON(!rq->uclamp[clamp_id].bucket[bucket_id].tasks);
@@ -1068,6 +1152,50 @@ static void uclamp_bucket_inc(struct task_struct *p, struct uclamp_se *uc_se,
uc_se->mapped = true;
}
+int sched_uclamp_handler(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ int old_min, old_max;
+ int result = 0;
+
+ mutex_lock(&uclamp_mutex);
+
+ old_min = sysctl_sched_uclamp_util_min;
+ old_max = sysctl_sched_uclamp_util_max;
+
+ result = proc_dointvec(table, write, buffer, lenp, ppos);
+ if (result)
+ goto undo;
+ if (!write)
+ goto done;
+
+ if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
+ sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
+ result = -EINVAL;
+ goto undo;
+ }
+
+ if (old_min != sysctl_sched_uclamp_util_min) {
+ uclamp_bucket_inc(NULL, &uclamp_default[UCLAMP_MIN],
+ UCLAMP_MIN, sysctl_sched_uclamp_util_min);
+ }
+ if (old_max != sysctl_sched_uclamp_util_max) {
+ uclamp_bucket_inc(NULL, &uclamp_default[UCLAMP_MAX],
+ UCLAMP_MAX, sysctl_sched_uclamp_util_max);
+ }
+ goto done;
+
+undo:
+ sysctl_sched_uclamp_util_min = old_min;
+ sysctl_sched_uclamp_util_max = old_max;
+
+done:
+ mutex_unlock(&uclamp_mutex);
+
+ return result;
+}
+
static int __setscheduler_uclamp(struct task_struct *p,
const struct sched_attr *attr)
{
@@ -1151,6 +1279,9 @@ static void __init init_uclamp(void)
for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
uc_se = &init_task.uclamp[clamp_id];
uclamp_bucket_inc(NULL, uc_se, clamp_id, uclamp_none(clamp_id));
+
+ uc_se = &uclamp_default[clamp_id];
+ uclamp_bucket_inc(NULL, uc_se, clamp_id, uclamp_none(clamp_id));
}
}
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ba4d9e85feb8..b0fa4a883999 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -446,6 +446,22 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = sched_rr_handler,
},
+#ifdef CONFIG_UCLAMP_TASK
+ {
+ .procname = "sched_uclamp_util_min",
+ .data = &sysctl_sched_uclamp_util_min,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = sched_uclamp_handler,
+ },
+ {
+ .procname = "sched_uclamp_util_max",
+ .data = &sysctl_sched_uclamp_util_max,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = sched_uclamp_handler,
+ },
+#endif
#ifdef CONFIG_SCHED_AUTOGROUP
{
.procname = "sched_autogroup_enabled",
--
2.19.2