linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michal Koutný" <mkoutny@suse.com>
To: linux-kernel@vger.kernel.org
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Phil Auld <pauld@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Odin Ugedal <odin@uged.al>, Rik van Riel <riel@surriel.com>,
	Giovanni Gherdovich <ggherdovich@suse.cz>
Subject: [RFC PATCH v2 4/5] sched/fair: Simplify load_cfs_rq_list maintenance
Date: Thu, 19 Aug 2021 19:50:33 +0200	[thread overview]
Message-ID: <20210819175034.4577-5-mkoutny@suse.com> (raw)
In-Reply-To: <20210819175034.4577-1-mkoutny@suse.com>

load_cfs_rq_list should contain cfs_rqs that have runnable entities in
them.  When we're operating under a throttled hierarchy we always
update the load_cfs_rq_list in order not to break list_add_load_cfs_rq
invariant of adding complete branches.

This caused troubles when an entity became runnable (enqueue_entity)
under a throttled hierarchy (see commit b34cb07dde7c ("sched/fair: Fix
enqueue_task_fair() warning some more")). (Basically when we add to the
load list, we have to ensure all the ancestors are added and when
deleting we have to delete whole subtree.)

This patch simplifies the code by no updates of load_cfs_rq_list when
we're operating under the throttled hierarchy and defers the
load_cfs_rq_list update to the point when whole hierarchy is unthrottled
(tg_unthrottle_up). Specifically, subtrees of a throttled cfs_rq are not
decaying their PELT when they're being throttled (but the parent of the
throttled cfs_rq is decayed).

The code is now simpler and load_cfs_rq_list contains only cfs_rqs that
have load to decay.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sched/fair.c | 58 ++++++++++-----------------------------------
 1 file changed, 12 insertions(+), 46 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6f4d5d4dcdd9..9978485334ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3086,6 +3086,8 @@ void reweight_task(struct task_struct *p, int prio)
 	load->inv_weight = sched_prio_to_wmult[prio];
 }
 
+static inline int throttled_hierarchy(struct cfs_rq *cfs_rq);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 #ifdef CONFIG_SMP
 /*
@@ -3196,8 +3198,6 @@ static long calc_group_shares(struct cfs_rq *cfs_rq)
 }
 #endif /* CONFIG_SMP */
 
-static inline int throttled_hierarchy(struct cfs_rq *cfs_rq);
-
 /*
  * Recomputes the group entity based on the current state of its group
  * runqueue.
@@ -4294,10 +4294,11 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 
 	/*
 	 * When bandwidth control is enabled, cfs might have been removed
-	 * because of a parent been throttled but cfs->nr_running > 1. Try to
-	 * add it unconditionally.
+	 * because of a parent been throttled. We'll add it later (with
+	 * complete branch up to se->on_rq/cfs_eq->on_list) in
+	 * tg_unthrottle_up() and unthrottle_cfs_rq().
 	 */
-	if (cfs_rq->nr_running == 1 || cfs_bandwidth_used())
+	if (cfs_rq->nr_running == 1 && !throttled_hierarchy(cfs_rq))
 		list_add_load_cfs_rq(cfs_rq);
 
 	if (cfs_rq->nr_running == 1)
@@ -4936,31 +4937,13 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
 			goto unthrottle_throttle;
-
-		/*
-		 * One parent has been throttled and cfs_rq removed from the
-		 * list. Add it back to not break the load list.
-		 */
-		if (throttled_hierarchy(cfs_rq))
-			list_add_load_cfs_rq(cfs_rq);
 	}
 
 	/* At this point se is NULL and we are at root level*/
 	add_nr_running(rq, task_delta);
 
 unthrottle_throttle:
-	/*
-	 * The cfs_rq_throttled() breaks in the above iteration can result in
-	 * incomplete load list maintenance, resulting in triggering the
-	 * assertion below.
-	 */
-	for_each_sched_entity(se) {
-		cfs_rq = cfs_rq_of(se);
-
-		if (list_add_load_cfs_rq(cfs_rq))
-			break;
-	}
-
+	/* See enqueue_task_fair:enqueue_throttle */
 	assert_list_load_cfs_rq(rq);
 
 	/* Determine whether we need to wake up potentially idle CPU: */
@@ -5600,13 +5583,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
 			goto enqueue_throttle;
-
-		/*
-		 * One parent has been throttled and cfs_rq removed from the
-		 * list. Add it back to not break the load list.
-		 */
-		if (throttled_hierarchy(cfs_rq))
-			list_add_load_cfs_rq(cfs_rq);
 	}
 
 	/* At this point se is NULL and we are at root level*/
@@ -5630,21 +5606,11 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		update_overutilized_status(rq);
 
 enqueue_throttle:
-	if (cfs_bandwidth_used()) {
-		/*
-		 * When bandwidth control is enabled; the cfs_rq_throttled()
-		 * breaks in the above iteration can result in incomplete
-		 * load list maintenance, resulting in triggering the assertion
-		 * below.
-		 */
-		for_each_sched_entity(se) {
-			cfs_rq = cfs_rq_of(se);
-
-			if (list_add_load_cfs_rq(cfs_rq))
-				break;
-		}
-	}
-
+	/*
+	 * If we got here, subtree of a cfs_rq must have been throttled and
+	 * therefore we did not modify load list or we climbed up to root (or
+	 * joined to an ancestor cfs_rq with on_rq == 1 => on_list).
+	 */
 	assert_list_load_cfs_rq(rq);
 
 	hrtick_update(rq);
-- 
2.32.0


  parent reply	other threads:[~2021-08-19 17:51 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-19 17:50 [RFC PATCH v2 0/5] leaf_cfs_rq_list cleanups and fix Michal Koutný
2021-08-19 17:50 ` [RFC PATCH v2 1/5] sched/fair: Add ancestors of unthrottled undecayed cfs_rq Michal Koutný
2021-09-09 13:57   ` Vincent Guittot
2021-09-10 11:35     ` Michal Koutný
2021-09-10 13:17       ` Vincent Guittot
2021-08-19 17:50 ` [RFC PATCH v2 2/5] sched: Add group_se() helper Michal Koutný
2021-08-19 17:50 ` [RFC PATCH v2 3/5] sched/fair: Rename leaf_list to more fitting load_list Michal Koutný
2021-09-21 19:06   ` Odin Ugedal
2021-08-19 17:50 ` Michal Koutný [this message]
2021-09-10 14:19   ` [RFC PATCH v2 4/5] sched/fair: Simplify load_cfs_rq_list maintenance Vincent Guittot
2021-09-14  9:22     ` Michal Koutný
2021-09-14  9:45       ` Vincent Guittot
2021-09-15 12:30         ` Vincent Guittot
2021-09-15 18:59           ` Odin Ugedal
2021-09-21 19:21             ` Odin Ugedal
2021-09-21 21:40               ` Michal Koutný
2021-08-19 17:50 ` [RFC PATCH v2 5/5] sched/fair: Simplify ancestor enqueue loops Michal Koutný
2021-09-09 14:04   ` Vincent Guittot
2021-09-10 11:35     ` Michal Koutný
2021-09-10 11:43       ` Vincent Guittot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210819175034.4577-5-mkoutny@suse.com \
    --to=mkoutny@suse.com \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=ggherdovich@suse.cz \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=odin@uged.al \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).