linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] sched/cfs: make util/load_avg more stable
@ 2017-04-26  6:27 Vincent Guittot
  2017-05-15  8:55 ` [tip:sched/core] sched/cfs: Make " tip-bot for Vincent Guittot
  0 siblings, 1 reply; 2+ messages in thread
From: Vincent Guittot @ 2017-04-26  6:27 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: dietmar.eggemann, Morten.Rasmussen, yuyang.du, pjt, bsegall,
	Vincent Guittot

In the current implementation of load/util_avg, we assume that the ongoing
time segment has fully elapsed, and util/load_sum is divided by LOAD_AVG_MAX,
even if part of the time segment still remains to run. As a consequence, the
remaining part is treated as idle time, which generates unexpected variations
of a busy CPU's util_avg in the range [1002..1024[ whereas util_avg should
stay at 1023.

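For illustration, here is a minimal standalone sketch (not kernel code) that
evaluates the old formula at the two extremes of a segment for a 100% busy
CPU, assuming LOAD_AVG_MAX = 47742, the value used by the PELT code at the
time:

  /* Rough user-space model of the old "divide by LOAD_AVG_MAX" behaviour. */
  #include <stdio.h>

  #define LOAD_AVG_MAX 47742ULL  /* ~ 1024 * (1 + y + y^2 + ...), y^32 = 0.5 */

  int main(void)
  {
          /*
           * Just after a new 1024us segment starts, util_sum of a fully busy
           * CPU has been decayed by y: roughly (LOAD_AVG_MAX - 1024) * 1024.
           */
          unsigned long long sum_start = (LOAD_AVG_MAX - 1024) * 1024;
          /* Just before the segment ends: roughly LOAD_AVG_MAX * 1024. */
          unsigned long long sum_end = LOAD_AVG_MAX * 1024;

          printf("old util_avg at segment start: %llu\n", sum_start / LOAD_AVG_MAX);
          printf("old util_avg at segment end:   %llu\n", sum_end / LOAD_AVG_MAX);
          return 0;
  }

The first value is 1002 and the second is 1024 (1023 in practice, since
util_sum never quite reaches its asymptote), which is exactly the
[1002..1024[ oscillation described above.
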
To keep the metric stable, we should not consider the ongoing time segment
when computing load/util_avg, but only the segments that have already fully
elapsed. However, ignoring the current time segment adds unwanted latency to
the load/util_avg responsiveness, especially when time is scaled instead of
the contribution. So, instead of waiting for the current time segment to
fully elapse before accounting for it in load/util_avg, we can account for
the already-elapsed part now and change the range used to compute
load/util_avg accordingly.

At the very beginning of a new time segment, the past segments have been
decayed and the max value is LOAD_AVG_MAX*y. At the very end of the current
time segment, the max value becomes 1024(us) + LOAD_AVG_MAX*y, which is equal
to LOAD_AVG_MAX. More generally, the max value is
sa->period_contrib + LOAD_AVG_MAX*y at any point within the time segment.

Taking advantage of the fact that LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024
(LOAD_AVG_MAX is the sum of the geometric series 1024*(1 + y + y^2 + ...), so
multiplying it by y simply drops the leading 1024 term), the range becomes
[0..LOAD_AVG_MAX-1024+sa->period_contrib].

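Under the same assumption as the sketch above (LOAD_AVG_MAX = 47742, 100%
busy CPU), a quick user-space check shows that dividing util_sum by the
upper bound of this position-dependent range keeps util_avg constant across
the whole segment:

  /* Sketch: the new divisor tracks util_sum through the 1024us segment. */
  #include <stdio.h>

  #define LOAD_AVG_MAX 47742ULL

  int main(void)
  {
          unsigned long long contrib;

          for (contrib = 0; contrib < 1024; contrib += 256) {
                  /* util_sum: decayed history plus the elapsed part of this segment. */
                  unsigned long long sum = (LOAD_AVG_MAX - 1024 + contrib) * 1024;
                  unsigned long long divider = LOAD_AVG_MAX - 1024 + contrib;

                  printf("period_contrib=%4llu  util_avg=%llu\n",
                         contrib, sum / divider);
          }
          return 0;
  }

Every iteration prints 1024 in this asymptotic model (1023 in practice),
whereas dividing the same sums by the old fixed LOAD_AVG_MAX would sweep the
result from 1002 toward 1024 as period_contrib grows.
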
As the elapsed part is already accounted in load/util_sum, we update the max
value according to the current position in the time segment instead of
removing its contribution.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---

Changes:
-Correct typo in commit message: s/MAX_LOAD_AVG/LOAD_AVG_MAX/ and square bracket

 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a903276..3531fa1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
-- 
2.7.4

* [tip:sched/core] sched/cfs: Make util/load_avg more stable
  2017-04-26  6:27 [PATCH v3] sched/cfs: make util/load_avg more stable Vincent Guittot
@ 2017-05-15  8:55 ` tip-bot for Vincent Guittot
  0 siblings, 0 replies; 2+ messages in thread
From: tip-bot for Vincent Guittot @ 2017-05-15  8:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, peterz, vincent.guittot, hpa, torvalds, efault, tglx,
	linux-kernel

Commit-ID:  625ed2bf049d5a352c1bcca962d6e133454eaaff
Gitweb:     http://git.kernel.org/tip/625ed2bf049d5a352c1bcca962d6e133454eaaff
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Wed, 26 Apr 2017 08:27:56 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 15 May 2017 10:15:13 +0200

sched/cfs: Make util/load_avg more stable

In the current implementation of load/util_avg, we assume that the
ongoing time segment has fully elapsed, and util/load_sum is divided
by LOAD_AVG_MAX, even if part of the time segment still remains to
run. As a consequence, the remaining part is treated as idle time,
which generates unexpected variations of a busy CPU's util_avg in the
range [1002..1024[ whereas util_avg should stay at 1023.

To keep the metric stable, we should not consider the ongoing time
segment when computing load/util_avg, but only the segments that have
already fully elapsed. However, ignoring the current time segment adds
unwanted latency to the load/util_avg responsiveness, especially when
time is scaled instead of the contribution.

Instead of waiting for the current time segment to fully elapse before
accounting for it in load/util_avg, we can account for the
already-elapsed part now and change the range used to compute
load/util_avg accordingly.

At the very beginning of a new time segment, the past segments have
been decayed and the max value is LOAD_AVG_MAX*y. At the very end of
the current time segment, the max value becomes:

  LOAD_AVG_MAX*y + 1024(us)  (== LOAD_AVG_MAX)

More generally, the max value is:

  LOAD_AVG_MAX*y + sa->period_contrib

at any point within the time segment.

Taking advantage of the fact that:

  LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024

the range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib].

As the elapsed part is already accounted in load/util_sum, we update
the max value according to the current position in the time segment
instead of removing its contribution.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Morten.Rasmussen@arm.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: pjt@google.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1493188076-2767-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d711093..4f1825d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
