linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
@ 2020-05-13 13:55 Vincent Guittot
  2020-05-13 18:25 ` bsegall
  2020-05-19 18:44 ` [tip: sched/urgent] sched/fair: Fix unthrottle_cfs_rq() " tip-bot2 for Vincent Guittot
  0 siblings, 2 replies; 13+ messages in thread
From: Vincent Guittot @ 2020-05-13 13:55 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel
  Cc: pauld, ouwen210, pkondeti, Vincent Guittot

Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair()
are quite close and follow the same sequence for enqueuing an entity in the
cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as
enqueue_task_fair(). This fixes a problem already faced with the latter and
adds an optimization in the last for_each_sched_entity loop.

Reported-by: Tao Zhou <zohooouoto@zoho.com.cn>
Reviewed-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---

v3 changes:
  - remove the unused enqueue variable

 kernel/sched/fair.c | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4e12ba882663..9a58874ef104 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4792,7 +4792,6 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	struct rq *rq = rq_of(cfs_rq);
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se;
-	int enqueue = 1;
 	long task_delta, idle_task_delta;
 
 	se = cfs_rq->tg->se[cpu_of(rq)];
@@ -4816,26 +4815,44 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	idle_task_delta = cfs_rq->idle_h_nr_running;
 	for_each_sched_entity(se) {
 		if (se->on_rq)
-			enqueue = 0;
+			break;
+		cfs_rq = cfs_rq_of(se);
+		enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
 
+		cfs_rq->h_nr_running += task_delta;
+		cfs_rq->idle_h_nr_running += idle_task_delta;
+
+		/* end evaluation on encountering a throttled cfs_rq */
+		if (cfs_rq_throttled(cfs_rq))
+			goto unthrottle_throttle;
+	}
+
+	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
-		if (enqueue) {
-			enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
-		} else {
-			update_load_avg(cfs_rq, se, 0);
-			se_update_runnable(se);
-		}
+
+		update_load_avg(cfs_rq, se, UPDATE_TG);
+		se_update_runnable(se);
 
 		cfs_rq->h_nr_running += task_delta;
 		cfs_rq->idle_h_nr_running += idle_task_delta;
 
+
+		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			break;
+			goto unthrottle_throttle;
+
+		/*
+		 * One parent has been throttled and cfs_rq removed from the
+		 * list. Add it back to not break the leaf list.
+		 */
+		if (throttled_hierarchy(cfs_rq))
+			list_add_leaf_cfs_rq(cfs_rq);
 	}
 
-	if (!se)
-		add_nr_running(rq, task_delta);
+	/* At this point se is NULL and we are at root level*/
+	add_nr_running(rq, task_delta);
 
+unthrottle_throttle:
 	/*
 	 * The cfs_rq_throttled() breaks in the above iteration can result in
 	 * incomplete leaf list maintenance, resulting in triggering the
@@ -4844,7 +4861,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
 
-		list_add_leaf_cfs_rq(cfs_rq);
+		if (list_add_leaf_cfs_rq(cfs_rq))
+			break;
 	}
 
 	assert_list_leaf_cfs_rq(rq);
-- 
2.17.1
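
The two-pass walk that the patch gives unthrottle_cfs_rq() can be modelled
in userspace. The sketch below is only an illustration under stated
assumptions: the model_* types and functions are hypothetical stand-ins, not
kernel code, and load tracking (update_load_avg()), idle counters and the
leaf_cfs_rq list handling are deliberately left out. It shows only how the
first loop stops at the first entity that is already queued, how the second
loop then updates the remaining ancestors, and how a throttled level ends
the walk before the root is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, heavily simplified stand-ins for struct cfs_rq and
 * struct sched_entity; only the fields the walk needs are modelled.
 */
struct model_cfs_rq {
	long h_nr_running;	/* tasks queued in this subtree */
	bool throttled;		/* models cfs_rq_throttled() */
};

struct model_se {
	bool on_rq;			/* already enqueued on its cfs_rq? */
	struct model_cfs_rq *cfs_rq;	/* models cfs_rq_of(se) */
	struct model_se *parent;	/* NULL at the root level */
};

/*
 * Walk from se towards the root in two passes, as the patch does.
 * Returns true if the walk reached the root (where the kernel would
 * call add_nr_running()), false if a throttled level stopped it early.
 */
static bool model_unthrottle_walk(struct model_se *se, long task_delta)
{
	/* Pass 1: enqueue every entity that is not queued yet. */
	for (; se; se = se->parent) {
		if (se->on_rq)
			break;
		se->on_rq = true;	/* stands in for enqueue_entity() */
		se->cfs_rq->h_nr_running += task_delta;
		if (se->cfs_rq->throttled)
			return false;
	}

	/*
	 * Pass 2: the remaining ancestors are already queued, so only
	 * their counters need updating.
	 */
	for (; se; se = se->parent) {
		se->cfs_rq->h_nr_running += task_delta;
		if (se->cfs_rq->throttled)
			return false;
	}

	return true;	/* se is NULL here: root level reached */
}

/*
 * Build a three-level hierarchy whose top entity is already queued,
 * run the walk once with task_delta = 2, and report the counters.
 */
static bool model_demo(bool throttle_middle, long counts[3])
{
	struct model_cfs_rq rq[3] = {
		{ 0, false }, { 0, throttle_middle }, { 0, false }
	};
	struct model_se top = { true, &rq[2], NULL };
	struct model_se mid = { false, &rq[1], &top };
	struct model_se leaf = { false, &rq[0], &mid };
	bool reached_root = model_unthrottle_walk(&leaf, 2);

	/* Pass 1 always enqueues the starting entity. */
	assert(leaf.on_rq);

	for (int i = 0; i < 3; i++)
		counts[i] = rq[i].h_nr_running;
	return reached_root;
}
```

With no throttled level, model_demo() updates all three queues and reports
that the root was reached. With the middle queue throttled, the walk stops
there and the top queue is left untouched, which is the case where the
kernel jumps to the unthrottle_throttle label instead of calling
add_nr_running().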


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
@ 2020-11-18 22:56 Guilherme G. Piccoli
  2020-11-18 23:30 ` Tao Zhou
  2020-11-18 23:50 ` Tao Zhou
  0 siblings, 2 replies; 13+ messages in thread
From: Guilherme G. Piccoli @ 2020-11-18 22:56 UTC (permalink / raw)
  To: vincent.guittot
  Cc: bsegall, dietmar.eggemann, juri.lelli, zohooouoto, mgorman,
	mingo, ouwen210, pauld, peterz, pkondeti, rostedt, Jay Vosburgh,
	Gavin Guo, halves, nivedita.singhvi, linux-kernel, gpiccoli

Hi Vincent (and all CCed), I'm sorry to ping about such an "old" patch, but
we experienced a condition similar to the one this patch addresses. It was
on an older kernel (4.15.x), but when suggesting that the users move to an
updated 5.4.x kernel, we noticed that this patch is not there, although
similar ones are (like [0] and [1]).

So, I'd like to ask if there's any particular reason not to backport
this fix to stable kernels, especially the longterm 5.4. The main reason
behind the question is that the code is very complex for developers
inexperienced with the scheduler, and I'm afraid that suggesting such a
backport to 5.4 could introduce complex-to-debug issues.

Let me know your thoughts Vincent (and all CCed), thanks in advance.
Cheers,


Guilherme


P.S. For those that deleted this thread from the email client, here's a
link:
https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/


[0]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb

[1]
https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
<- great thread BTW!



Thread overview: 13+ messages
2020-05-13 13:55 [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list Vincent Guittot
2020-05-13 18:25 ` bsegall
2020-05-19 18:44 ` [tip: sched/urgent] sched/fair: Fix unthrottle_cfs_rq() " tip-bot2 for Vincent Guittot
2020-11-18 22:56 [PATCH v3] sched/fair: fix unthrottle_cfs_rq " Guilherme G. Piccoli
2020-11-18 23:30 ` Tao Zhou
2020-11-18 23:50 ` Tao Zhou
2020-11-19  0:33   ` Tao Zhou
2020-11-19  8:36     ` Vincent Guittot
2020-11-19 11:34       ` Guilherme G. Piccoli
2020-11-19 13:25         ` Vincent Guittot
2020-11-19 14:07           ` Guilherme Piccoli
2021-06-24 10:29           ` Po-Hsu Lin
2021-06-24 12:31             ` Vincent Guittot
