LKML Archive on lore.kernel.org
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Paul Turner <pjt@google.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>,
	Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v4 16/16] sched/cpufreq: uclamp: add utilization clamping for RT tasks
Date: Tue, 28 Aug 2018 14:53:24 +0100
Message-ID: <20180828135324.21976-17-patrick.bellasi@arm.com>
In-Reply-To: <20180828135324.21976-1-patrick.bellasi@arm.com>

Currently schedutil enforces the maximum frequency whenever RT tasks are
RUNNABLE. This mandatory policy can be made tunable from userspace,
which allows, for example, defining a max frequency that is still
reasonable for the execution of a specific RT workload. This helps make
the RT class friendlier to power/energy sensitive use-cases.

This patch extends the usage of util_{min,max} to the RT scheduling
class. Whenever a task in this class is RUNNABLE, the utilization
required is defined by its task specific clamp value. However, we still
want to run at maximum capacity those RT tasks which:
 - do not have task specific clamp values
 - run either in the root task group or an autogroup

Let's add uclamp_default_perf, a special set of clamp values to be used
for tasks that require maximum performance. This set of clamps is then
used whenever the above conditions match for an RT task being enqueued
on a CPU.
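
The selection described above can be sketched in plain C as follows.
This is only an illustration, not the kernel code: default_clamp() is a
hypothetical helper, and the values assume that uclamp_none() yields 0
for UCLAMP_MIN and SCHED_CAPACITY_SCALE (1024) for UCLAMP_MAX.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024U

enum { UCLAMP_MIN, UCLAMP_MAX, UCLAMP_CNT };

/* Assumed system defaults: no boost (min = 0), no cap (max = 1024). */
static const unsigned int uclamp_default[UCLAMP_CNT] = {
	[UCLAMP_MIN] = 0,
	[UCLAMP_MAX] = SCHED_CAPACITY_SCALE,
};

/* Max performance defaults: both clamps sit at the maximum value. */
static const unsigned int uclamp_default_perf[UCLAMP_CNT] = {
	[UCLAMP_MIN] = SCHED_CAPACITY_SCALE,
	[UCLAMP_MAX] = SCHED_CAPACITY_SCALE,
};

/*
 * Hypothetical helper: default clamp for a task which has no task
 * specific clamp and runs in the root task group or an autogroup.
 */
static unsigned int default_clamp(bool rt_policy, int clamp_id)
{
	return rt_policy ? uclamp_default_perf[clamp_id]
			 : uclamp_default[clamp_id];
}
```

Under these assumptions an unclamped RT task gets a minimum clamp of
1024, which is what keeps it running at maximum capacity, while an
unclamped CFS task keeps a minimum clamp of 0.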

Since utilization clamping now applies to both CFS and RT tasks, we
clamp the combined utilization of these two classes.
This approach, as opposed to summing individually clamped utilizations,
is more power efficient: it selects lower frequencies when both RT and
CFS clamped tasks are RUNNABLE.
However, it can also affect the performance of the lower priority CFS
class, since the CFS minimum utilization clamp can be completely
eclipsed by the RT workload.
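
To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) comparing the two approaches. clamp_util(),
combined_clamp() and individual_clamp() are hypothetical helpers; in the
patch the clamping is done by uclamp_util() against the clamps currently
active on the CPU. The same clamp pair is used for both classes, since
this series has no per-sched_class clamp tracking.

```c
#define SCHED_CAPACITY_SCALE 1024UL

/* Hypothetical stand-in for uclamp_util(): clamp util into [min, max]. */
static unsigned long clamp_util(unsigned long util,
				unsigned long util_min,
				unsigned long util_max)
{
	if (util < util_min)
		return util_min;
	if (util > util_max)
		return util_max;
	return util;
}

/* Approach used by this patch: clamp the sum of RT and CFS utilization. */
static unsigned long combined_clamp(unsigned long rt, unsigned long cfs,
				    unsigned long util_min,
				    unsigned long util_max)
{
	return clamp_util(rt + cfs, util_min, util_max);
}

/* Alternative: clamp each class on its own, then add the results. */
static unsigned long individual_clamp(unsigned long rt, unsigned long cfs,
				      unsigned long util_min,
				      unsigned long util_max)
{
	unsigned long util = clamp_util(rt, util_min, util_max) +
			     clamp_util(cfs, util_min, util_max);

	/* The sum cannot exceed the full capacity scale. */
	return util > SCHED_CAPACITY_SCALE ? SCHED_CAPACITY_SCALE : util;
}
```

With made-up values of RT util 100, CFS util 50 and clamps [200, 512],
combined_clamp() returns 200 while individual_clamp() returns 400, so
the combined approach asks for a lower frequency. With RT util 300
instead, the combined value is 350: it is already above the minimum
clamp, so the CFS tasks get no boost of their own, whereas the
individual approach would return 500.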

The IO wait boost value is also subject to clamping for RT tasks.
This ensures that RT tasks, as well as CFS ones, are always subject
to the current set of utilization clamping constraints.
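
The change to the IO wait boost ceiling can be sketched as below. This
is a simplified userspace model with hypothetical helper names; the
actual code calls uclamp_util() on the CPU's runqueue, as shown in the
sugov_iowait_boost() hunk of the diff.

```c
/* Hypothetical stand-in for uclamp_util(): clamp util into [min, max]. */
static unsigned long clamp_util(unsigned long util,
				unsigned long util_min,
				unsigned long util_max)
{
	if (util < util_min)
		return util_min;
	if (util > util_max)
		return util_max;
	return util;
}

/* Before: the ceiling is clamped only when there is no RT utilization. */
static unsigned long max_boost_before(unsigned long boost_max,
				      unsigned long rt_util,
				      unsigned long util_min,
				      unsigned long util_max)
{
	if (!rt_util)
		return clamp_util(boost_max, util_min, util_max);
	return boost_max;
}

/* After: the ceiling is always clamped, whether or not RT is present. */
static unsigned long max_boost_after(unsigned long boost_max,
				     unsigned long util_min,
				     unsigned long util_max)
{
	return clamp_util(boost_max, util_min, util_max);
}
```

For example, with a boost ceiling of 1024, clamps [0, 512] and a
non-zero RT utilization, the old logic keeps the ceiling at 1024 while
the new logic lowers it to 512.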

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---
Changes in v4:
 Message-ID: <20180813150112.GE2605@e110439-lin>
 - remove UCLAMP_SCHED_CLASS policy since we do not have in the current
   implementation a proper per-sched_class clamp tracking support
 Message-ID: <20180809155551.bp46sixk4u3ilcnh@queper01-lin>
 - add default boost for not clamped RT tasks
 Others:
 - rebased on v4.19-rc1

Changes in v3:
 - rebased on tip/sched/core
Changes in v2:
 - rebased on v4.18-rc4
---
 kernel/sched/core.c              | 30 ++++++++++++++++++++++++------
 kernel/sched/cpufreq_schedutil.c | 22 ++++++++++++----------
 kernel/sched/rt.c                |  4 ++++
 3 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f71e15eaf152..9761457af1ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -741,6 +741,7 @@ unsigned int sysctl_sched_uclamp_util_min;
 unsigned int sysctl_sched_uclamp_util_max = 100;
 
 static struct uclamp_se uclamp_default[UCLAMP_CNT];
+static struct uclamp_se uclamp_default_perf[UCLAMP_CNT];
 
 /**
  * uclamp_map: reference counts a utilization "clamp value"
@@ -1052,10 +1053,15 @@ static inline int uclamp_task_group_id(struct task_struct *p, int clamp_id)
 	 */
 	if (unclamped && (task_group_is_autogroup(task_group(p)) ||
 			  task_group(p) == &root_task_group)) {
-		p->uclamp[clamp_id].effective.value =
-			uclamp_default[clamp_id].value;
 
-		return uclamp_default[clamp_id].group_id;
+		/* Unclamped RT tasks: max performance by default */
+		uc_se = task_has_rt_policy(p)
+			? &uclamp_default_perf[clamp_id]
+			: &uclamp_default[clamp_id];
+
+		p->uclamp[clamp_id].effective.value = uc_se->value;
+
+		return uc_se->group_id;
 	}
 
 	/* Use TG's clamp value to limit task specific values */
@@ -1069,10 +1075,15 @@ static inline int uclamp_task_group_id(struct task_struct *p, int clamp_id)
 #else
 	/* By default, all tasks get the system default clamp value */
 	if (unclamped) {
-		p->uclamp[clamp_id].effective.value =
-			uclamp_default[clamp_id].value;
 
-		return uclamp_default[clamp_id].group_id;
+		/* Unclamped RT tasks: max performance by default */
+		uc_se = task_has_rt_policy(p)
+			? &uclamp_default_perf[clamp_id]
+			: &uclamp_default[clamp_id];
+
+		p->uclamp[clamp_id].effective.value = uc_se->value;
+
+		return uc_se->group_id;
 	}
 #endif
 
@@ -1761,6 +1772,13 @@ static void __init init_uclamp(void)
 		uc_se->group_id = UCLAMP_NOT_VALID;
 		uclamp_group_get(NULL, NULL, clamp_id, 0, uc_se,
 				 uclamp_none(clamp_id));
+
+		/* Init max perf clamps: default for RT tasks */
+		uc_se = &uclamp_default_perf[clamp_id];
+		uc_se->group_id = UCLAMP_NOT_VALID;
+		uclamp_group_get(NULL, NULL, clamp_id, 0, uc_se,
+				 uclamp_none(UCLAMP_MAX));
+
 	}
 }
 
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 949082555ee8..8a2d12a691eb 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -205,7 +205,10 @@ static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
 	sg_cpu->max = max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
 	sg_cpu->bw_dl = cpu_bw_dl(rq);
 
-	if (rt_rq_is_runnable(&rq->rt))
+	util = rt_rq_is_runnable(&rq->rt)
+		? uclamp_util(rq, SCHED_CAPACITY_SCALE)
+		: cpu_util_rt(rq);
+	if (unlikely(util >= max))
 		return max;
 
 	/*
@@ -223,13 +226,14 @@ static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
 	 * utilization (PELT windows are synchronized) we can directly add them
 	 * to obtain the CPU's actual utilization.
 	 *
-	 * CFS utilization can be boosted or capped, depending on utilization
-	 * clamp constraints configured for currently RUNNABLE tasks.
+	 * CFS and RT utilizations can be boosted or capped, depending on
+	 * utilization constraints enforced by currently RUNNABLE tasks.
 	 */
-	util = cpu_util_cfs(rq);
+	util += cpu_util_cfs(rq);
 	if (util)
 		util = uclamp_util(rq, util);
-	util += cpu_util_rt(rq);
+	if (unlikely(util >= max))
+		return max;
 
 	/*
 	 * We do not make cpu_util_dl() a permanent part of this sum because we
@@ -333,13 +337,11 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
 	 *
 	 * Since DL tasks have a much more advanced bandwidth control, it's
 	 * safe to assume that IO boost does not apply to those tasks.
-	 * Instead, since RT tasks are not utiliation clamped, we don't want
-	 * to apply clamping on IO boost while there is blocked RT
-	 * utilization.
+	 * Instead, for CFS and RT tasks we clamp the IO boost max value
+	 * considering the current constraints for the CPU.
 	 */
 	max_boost = sg_cpu->iowait_boost_max;
-	if (!cpu_util_rt(cpu_rq(sg_cpu->cpu)))
-		max_boost = uclamp_util(cpu_rq(sg_cpu->cpu), max_boost);
+	max_boost = uclamp_util(cpu_rq(sg_cpu->cpu), max_boost);
 
 	/* Double the boost at each request */
 	if (sg_cpu->iowait_boost) {
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2e2955a8cf8f..06ec33467dd9 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2404,6 +2404,10 @@ const struct sched_class rt_sched_class = {
 	.switched_to		= switched_to_rt,
 
 	.update_curr		= update_curr_rt,
+
+#ifdef CONFIG_UCLAMP_TASK
+	.uclamp_enabled		= 1,
+#endif
 };
 
 #ifdef CONFIG_RT_GROUP_SCHED
-- 
2.18.0

