All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
	linux-kernel@vger.kernel.org, rickyiu@google.com, odin@uged.al
Cc: sachinp@linux.vnet.ibm.com, naresh.kamboju@linaro.org,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH 3 1/4] sched/pelt: Relax the sync of util_sum with util_avg
Date: Tue, 11 Jan 2022 14:46:56 +0100	[thread overview]
Message-ID: <20220111134659.24961-2-vincent.guittot@linaro.org> (raw)
In-Reply-To: <20220111134659.24961-1-vincent.guittot@linaro.org>

Rick reported performance regressions in bugzilla because of cpu frequency
being lower than before:
    https://bugzilla.kernel.org/show_bug.cgi?id=215045

He bisected the problem to:
commit 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")

This commit forces util_sum to be synced with the new util_avg after
removing the contribution of a task and before the next periodic sync. By
doing so util_sum is rounded to its lower bound and might lost up to
LOAD_AVG_MAX-1 of accumulated contribution which has not yet been
reflected in util_avg.

Instead of always setting util_sum to the low bound of util_avg, which can
significantly lower the utilization of root cfs_rq after propagating the
change down into the hierarchy, we revert the change of util_sum and
propagate the difference.

In addition, we also check that cfs's util_sum always stays above the
lower bound for a given util_avg as it has been observed that
sched_entity's util_sum is sometimes above cfs one.

Reported-by: Rick Yiu <rickyiu@google.com>
Fixes: 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 095b0aa378df..ed35255fdb85 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3381,6 +3381,7 @@ void set_task_rq_fair(struct sched_entity *se,
 	se->avg.last_update_time = n_last_update_time;
 }
 
+#define MIN_DIVIDER (LOAD_AVG_MAX - 1024)
 
 /*
  * When on migration a sched_entity joins/leaves the PELT hierarchy, we need to
@@ -3449,7 +3450,6 @@ void set_task_rq_fair(struct sched_entity *se,
  * XXX: only do this for the part of runnable > running ?
  *
  */
-
 static inline void
 update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
@@ -3681,7 +3681,19 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 
 		r = removed_util;
 		sub_positive(&sa->util_avg, r);
-		sa->util_sum = sa->util_avg * divider;
+		sub_positive(&sa->util_sum, r * divider);
+		/*
+		 * Because of rounding, se->util_sum might ends up being +1 more than
+		 * cfs->util_sum. Although this is not a problem by itself, detaching
+		 * a lot of tasks with the rounding problem between 2 updates of
+		 * util_avg (~1ms) can make cfs->util_sum becoming null whereas
+		 * cfs_util_avg is not.
+		 * Check that util_sum is still above its lower bound for the new
+		 * util_avg. Given that period_contrib might have moved since the last
+		 * sync, we are only sure that util_sum must be above or equal to
+		 *    util_avg * minimum possible divider
+		 */
+		sa->util_sum = max_t(u32, sa->util_sum, sa->util_avg * MIN_DIVIDER);
 
 		r = removed_runnable;
 		sub_positive(&sa->runnable_avg, r);
-- 
2.17.1


  reply	other threads:[~2022-01-11 13:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-11 13:46 [PATCH v3 0/4] sched/pelt: Relax the sync of *_sum with *_avg Vincent Guittot
2022-01-11 13:46 ` Vincent Guittot [this message]
2022-01-12 15:26   ` [PATCH 3 1/4] sched/pelt: Relax the sync of util_sum with util_avg Dietmar Eggemann
2022-01-12 16:04     ` Vincent Guittot
2022-01-18 11:18   ` [tip: sched/urgent] " tip-bot2 for Vincent Guittot
2022-01-11 13:46 ` [PATCH v3 2/4] sched/pelt: Continue to relax " Vincent Guittot
2022-01-18 11:18   ` [tip: sched/urgent] " tip-bot2 for Vincent Guittot
2022-01-11 13:46 ` [PATCH v3 3/4] sched/pelt: Relax the sync of runnable_sum with runnable_avg Vincent Guittot
2022-01-18 11:18   ` [tip: sched/urgent] " tip-bot2 for Vincent Guittot
2022-01-11 13:46 ` [PATCH v3 4/4] sched/pelt: Relax the sync of load_sum with load_avg Vincent Guittot
2022-01-18 11:18   ` [tip: sched/urgent] " tip-bot2 for Vincent Guittot
2022-01-12 11:23 ` [PATCH v3 0/4] sched/pelt: Relax the sync of *_sum with *_avg Sachin Sant
2022-01-12 13:17   ` Vincent Guittot
2022-01-12 15:26 ` Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220111134659.24961-2-vincent.guittot@linaro.org \
    --to=vincent.guittot@linaro.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=naresh.kamboju@linaro.org \
    --cc=odin@uged.al \
    --cc=peterz@infradead.org \
    --cc=rickyiu@google.com \
    --cc=rostedt@goodmis.org \
    --cc=sachinp@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.