linux-kernel.vger.kernel.org archive mirror
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Paul Turner <pjt@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@android.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>
Subject: [PATCH v5 4/4] sched/fair: update util_est only on util_avg updates
Date: Thu, 22 Feb 2018 17:01:53 +0000
Message-ID: <20180222170153.673-5-patrick.bellasi@arm.com>
In-Reply-To: <20180222170153.673-1-patrick.bellasi@arm.com>

The estimated utilization of a task is currently updated every time the
task is dequeued. However, to keep overheads under control, PELT signals
are effectively updated at most once every 1ms.

Thus, for really short running tasks, it can happen that their util_avg
value has not been updated since their last enqueue. If such tasks are
also frequently running tasks (e.g. the kind of workload generated by
hackbench), it can also happen that their util_avg is updated only every
few activations.

This means that updating util_est at every dequeue potentially introduces
unnecessary overhead, and it is also conceptually wrong if the util_avg
signal has never been updated during a task activation.

Let's introduce a throttling mechanism on a task's util_est updates
to sync them with util_avg updates. To make the solution memory
efficient, both in terms of space and load/store operations, we encode a
synchronization flag into the LSB of util_est.enqueued.
This makes util_est an even-values-only metric, which is still
considered good enough for its purpose.
The synchronization bit is reset (i.e. cleared) by __update_load_avg_se()
once the PELT signal of a task has been updated during its last
activation.
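
As a rough illustration of this encoding, here is a minimal user-space
sketch (not kernel code). The helper names ue_pack(), ue_clear_flag() and
ue_needs_update() are hypothetical and are not part of this patch; only
the flag value mirrors UTIL_EST_NEED_UPDATE_FLAG:

```c
#define NEED_UPDATE_FLAG 0x1u	/* same value as UTIL_EST_NEED_UPDATE_FLAG */

/* Hypothetical helper: store a sample with the flag set in the LSB. */
static unsigned int ue_pack(unsigned int util)
{
	return util | NEED_UPDATE_FLAG;
}

/* Hypothetical helper: clear the flag, i.e. "util_avg has been updated". */
static unsigned int ue_clear_flag(unsigned int enqueued)
{
	return enqueued & ~NEED_UPDATE_FLAG;
}

/* Hypothetical helper: non-zero while the flag is still set. */
static int ue_needs_update(unsigned int enqueued)
{
	return (enqueued & NEED_UPDATE_FLAG) != 0;
}
```

With these helpers, ue_pack(300) yields 301 and ue_clear_flag(301) yields
300. A sample of 301 also comes back as 300 once the flag is cleared,
which is the one-unit loss of resolution mentioned above.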

Such a throttling mechanism keeps util_est overheads in the wakeup hot
path under control, which makes it suitable to be enabled also on
systems running high-intensity workloads.
Thus, this patch now switches on by default the utilization estimation
scheduler feature (UTIL_EST).
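
The resulting dequeue-time behaviour can be sketched with a simplified
user-space model. This is only an illustration under stated assumptions:
struct model_task, model_pelt_update() and model_dequeue_sleep() are
hypothetical names, and the model leaves out the EWMA and the ~1% margin
check that the real util_est_dequeue() also performs:

```c
#define NEED_UPDATE_FLAG 0x1u	/* same value as UTIL_EST_NEED_UPDATE_FLAG */

/* Hypothetical, simplified stand-in for a task's util_est state. */
struct model_task {
	unsigned int util_avg;	/* stand-in for the PELT util_avg signal */
	unsigned int enqueued;	/* stand-in for util_est.enqueued, flag in LSB */
};

/* PELT updated util_avg: record the new value and clear the flag. */
static void model_pelt_update(struct model_task *t, unsigned int util_avg)
{
	t->util_avg = util_avg;
	t->enqueued &= ~NEED_UPDATE_FLAG;
}

/*
 * Task goes to sleep. Returns 1 if the estimate was refreshed, 0 if the
 * update was skipped because util_avg was not updated since the flag was
 * last set.
 */
static int model_dequeue_sleep(struct model_task *t)
{
	if (t->enqueued & NEED_UPDATE_FLAG)
		return 0;

	/* Save the new sample and set the flag again for the next activation. */
	t->enqueued = t->util_avg | NEED_UPDATE_FLAG;
	return 1;
}
```

In this model, a task that sleeps twice without a PELT update in between
refreshes its estimate at most once; the second dequeue returns early
after a single load and a bit test.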

Suggested-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>

---
Changes in v5:
 - set SCHED_FEAT(UTIL_EST, true) as default (Peter)
---
 kernel/sched/fair.c     | 39 +++++++++++++++++++++++++++++++++++----
 kernel/sched/features.h |  2 +-
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8364771f7301..1bf9a86ebc39 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3047,6 +3047,29 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
 	}
 }
 
+/*
+ * When a task is dequeued, its estimated utilization should not be updated if
+ * its util_avg has not been updated at least once.
+ * This flag is used to synchronize util_avg updates with util_est updates.
+ * We map this information into the LSB of the utilization saved at
+ * dequeue time (i.e. util_est.enqueued).
+ */
+#define UTIL_EST_NEED_UPDATE_FLAG 0x1
+
+static inline void cfs_se_util_change(struct sched_avg *avg)
+{
+	if (sched_feat(UTIL_EST)) {
+		struct util_est ue = READ_ONCE(avg->util_est);
+
+		if (!(ue.enqueued & UTIL_EST_NEED_UPDATE_FLAG))
+			return;
+
+		/* Reset flag to report util_avg has been updated */
+		ue.enqueued &= ~UTIL_EST_NEED_UPDATE_FLAG;
+		WRITE_ONCE(avg->util_est, ue);
+	}
+}
+
 #ifdef CONFIG_SMP
 /*
  * Approximate:
@@ -3308,6 +3331,7 @@ __update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entit
 				cfs_rq->curr == se)) {
 
 		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
+		cfs_se_util_change(&se->avg);
 		return 1;
 	}
 
@@ -5218,7 +5242,7 @@ static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
 
 	/* Update root cfs_rq's estimated utilization */
 	enqueued  = READ_ONCE(cfs_rq->avg.util_est.enqueued);
-	enqueued += _task_util_est(p);
+	enqueued += (_task_util_est(p) | UTIL_EST_NEED_UPDATE_FLAG);
 	WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
 }
 
@@ -5310,7 +5334,7 @@ static inline void util_est_dequeue(struct cfs_rq *cfs_rq,
 	if (cfs_rq->nr_running) {
 		ue.enqueued  = READ_ONCE(cfs_rq->avg.util_est.enqueued);
 		ue.enqueued -= min_t(unsigned int, ue.enqueued,
-				     _task_util_est(p));
+				     (_task_util_est(p) | UTIL_EST_NEED_UPDATE_FLAG));
 	}
 	WRITE_ONCE(cfs_rq->avg.util_est.enqueued, ue.enqueued);
 
@@ -5321,12 +5345,19 @@ static inline void util_est_dequeue(struct cfs_rq *cfs_rq,
 	if (!task_sleep)
 		return;
 
+	/*
+	 * Skip update of task's estimated utilization if the PELT signal has
+	 * not been updated at least once since the last enqueue.
+	 */
+	ue = READ_ONCE(p->se.avg.util_est);
+	if (ue.enqueued & UTIL_EST_NEED_UPDATE_FLAG)
+		return;
+
 	/*
 	 * Skip update of task's estimated utilization when its EWMA is
 	 * already ~1% close to its last activation value.
 	 */
-	ue = READ_ONCE(p->se.avg.util_est);
-	ue.enqueued = task_util(p);
+	ue.enqueued = (task_util(p) | UTIL_EST_NEED_UPDATE_FLAG);
 	last_ewma_diff = ue.enqueued - ue.ewma;
 	if (within_margin(last_ewma_diff, (SCHED_CAPACITY_SCALE / 100)))
 		return;
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index c459a4b61544..85ae8488039c 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -89,4 +89,4 @@ SCHED_FEAT(WA_BIAS, true)
 /*
  * UtilEstimation. Use estimated CPU utilization.
  */
-SCHED_FEAT(UTIL_EST, false)
+SCHED_FEAT(UTIL_EST, true)
-- 
2.15.1
