From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Paul Turner <pjt@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@android.com>,
	Joel Fernandes <joelaf@google.com>
Subject: [PATCH v2 3/4] sched/fair: use util_est in LB and WU paths
Date: Tue,  5 Dec 2017 17:10:17 +0000
Message-ID: <20171205171018.9203-4-patrick.bellasi@arm.com>
In-Reply-To: <20171205171018.9203-1-patrick.bellasi@arm.com>

When the scheduler looks at the CPU utilization, the current PELT value
for a CPU is returned straight away. In certain scenarios this can have
undesired side effects on task placement.

For example, since the task utilization is decayed at wakeup time, when
a long-sleeping big task is enqueued it does not immediately add a
significant contribution to the target CPU.
As a result we generate a race condition where other tasks can be placed
on the same CPU while it is still considered relatively empty.

In order to reduce this kind of race condition, this patch introduces the
required support to integrate the usage of the CPU's estimated utilization
in cpu_util_wake as well as in update_sg_lb_stats.

The estimated utilization of a CPU is defined to be the maximum between
its PELT's utilization and the sum of the estimated utilization of the
tasks currently RUNNABLE on that CPU.
This makes it possible to properly represent the expected utilization of
a CPU which, for example, has just started running a big task after a
long sleep period.
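
As an illustration of the definition above, here is a minimal userspace
sketch. It is not the kernel code from this patch: struct cpu_model and
model_cpu_util_est() are hypothetical names, and the three fields only
stand in for cfs_rq->avg.util_avg, cfs_rq->util_est_runnable and
capacity_orig_of(cpu).

```c
/* Hypothetical userspace model; names are illustrative, not kernel API. */
struct cpu_model {
	unsigned long util_avg;          /* stands in for the CPU's PELT utilization */
	unsigned long util_est_runnable; /* sum of util_est of RUNNABLE tasks */
	unsigned long capacity_orig;     /* stands in for capacity_orig_of(cpu) */
};

/* Estimated utilization: max(PELT, sum of estimates), clamped to capacity. */
static unsigned long model_cpu_util_est(const struct cpu_model *c)
{
	unsigned long est = c->util_avg;

	if (c->util_est_runnable > est)
		est = c->util_est_runnable;

	return est < c->capacity_orig ? est : c->capacity_orig;
}
```

With a decayed PELT value of 50 and a just-enqueued task whose estimate
is 600, the model reports 600 rather than 50, so the CPU no longer looks
almost empty. With no runnable tasks the PELT value, which includes the
blocked utilization, is still the one reported.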

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reviewed-by: Brendan Jackman <brendan.jackman@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Paul Turner <pjt@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org

---
Changes v1->v2:
 - rebase on top of v4.15-rc2
 - tested that overhauled PELT code does not affect the util_est
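
For reviewers, the three cases listed in the cpu_util_wake comment can be
modelled in userspace roughly as follows. This is only a sketch under
stated assumptions: model_cpu_util_wake() is a hypothetical helper, its
arguments stand in for cpu_util(cpu), task_util(p), the CPU's
util_est_runnable and sched_feat(UTIL_EST), and the early return for a
task that has no contribution on this CPU is left out.

```c
/* Hypothetical userspace model of the wakeup-path computation. */
static long model_cpu_util_wake(long cpu_util, long task_util,
				long util_est_runnable, int util_est_enabled)
{
	/* Discount the waking task's blocked utilization from the CPU. */
	long util = cpu_util - task_util;

	if (util < 0)
		util = 0;

	/* Without the feature, only the discounted PELT value is used. */
	if (!util_est_enabled)
		return util;

	/* Otherwise take the more restrictive of the two views. */
	return util > util_est_runnable ? util : util_est_runnable;
}
```

For example, a task that was the only one sleeping on the CPU
(cpu_util == task_util, nothing runnable) gives 0, while a CPU with
runnable tasks whose estimates sum to 500 gives 500 even if the
discounted PELT value is only 100.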
---
 kernel/sched/fair.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 68 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d8f3ed71010b..373d631efa91 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6306,6 +6306,41 @@ static unsigned long cpu_util(int cpu)
 	return (util >= capacity) ? capacity : util;
 }
 
+/**
+ * cpu_util_est: estimated utilization for the specified CPU
+ * @cpu: the CPU to get the estimated utilization for
+ *
+ * The estimated utilization of a CPU is defined to be the maximum between its
+ * PELT's utilization and the sum of the estimated utilization of the tasks
+ * currently RUNNABLE on that CPU.
+ *
+ * This properly represents the expected utilization of a CPU which has just
+ * started running a big task after a long sleep period. At the same time
+ * however it preserves the benefits of the "blocked load" in describing the
+ * potential for other tasks waking up on the same CPU.
+ *
+ * Return: the estimated utilization for the specified CPU
+ */
+static inline unsigned long cpu_util_est(int cpu)
+{
+	unsigned long util, util_est;
+	unsigned long capacity;
+	struct cfs_rq *cfs_rq;
+
+	if (!sched_feat(UTIL_EST))
+		return cpu_util(cpu);
+
+	cfs_rq = &cpu_rq(cpu)->cfs;
+	util = cfs_rq->avg.util_avg;
+	util_est = cfs_rq->util_est_runnable;
+	util_est = max(util, util_est);
+
+	capacity = capacity_orig_of(cpu);
+	util_est = min(util_est, capacity);
+
+	return util_est;
+}
+
 static inline unsigned long task_util(struct task_struct *p)
 {
 	return p->se.avg.util_avg;
@@ -6322,16 +6357,43 @@ static inline unsigned long task_util_est(struct task_struct *p)
  */
 static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
 {
-	unsigned long util, capacity;
+	long util, util_est;
 
 	/* Task has no contribution or is new */
 	if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
-		return cpu_util(cpu);
+		return cpu_util_est(cpu);
 
-	capacity = capacity_orig_of(cpu);
-	util = max_t(long, cpu_rq(cpu)->cfs.avg.util_avg - task_util(p), 0);
+	/* Discount task's blocked util from CPU's util */
+	util = cpu_util(cpu) - task_util(p);
+	util = max(util, 0L);
 
-	return (util >= capacity) ? capacity : util;
+	if (!sched_feat(UTIL_EST))
+		return util;
+
+	/*
+	 * These are the main cases covered:
+	 * - if *p is the only task sleeping on this CPU, then:
+	 *      cpu_util (== task_util) > util_est (== 0)
+	 *   and thus we return:
+	 *      cpu_util_wake = (cpu_util - task_util) = 0
+	 *
+	 * - if other tasks are SLEEPING on the same CPU, which is just waking
+	 *   up, then:
+	 *      cpu_util >= task_util
+	 *      cpu_util > util_est (== 0)
+	 *   and thus we discount *p's blocked utilization to return:
+	 *      cpu_util_wake = (cpu_util - task_util) >= 0
+	 *
+	 * - if other tasks are RUNNABLE on that CPU and
+	 *      util_est > cpu_util
+	 *   then we use util_est since it returns a more restrictive
+	 *   estimation of the spare capacity on that CPU, by just considering
+	 *   the expected utilization of tasks already runnable on that CPU.
+	 */
+	util_est = cpu_rq(cpu)->cfs.util_est_runnable;
+	util = max(util, util_est);
+
+	return util;
 }
 
 /*
@@ -7857,7 +7919,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 			load = source_load(i, load_idx);
 
 		sgs->group_load += load;
-		sgs->group_util += cpu_util(i);
+		sgs->group_util += cpu_util_est(i);
 		sgs->sum_nr_running += rq->cfs.h_nr_running;
 
 		nr_running = rq->nr_running;
-- 
2.14.1
