* [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup
@ 2022-08-18 12:47 Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 1/9] sched/fair: maintain task se depth in set_task_rq() Chengming Zhou
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:47 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

Hi all,

This patch series contains optimizations and cleanups for task load tracking
when a task migrates between CPUs/cgroups or is switched_from/to_fair(),
based on tip/sched/core.

There are three cases of detach/attach_entity_load_avg (besides fork and exit)
for a fair task (rough call paths are sketched below):
1. task migrates between CPUs (on_rq migration or wakeup migration)
2. task migrates between cgroups (detach then attach)
3. task is switched_from/to_fair (detach, attach later)
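
For reference, a rough sketch (not taken from the series) of where these
three cases enter the detach/attach paths in mainline:

  1. CPU migration:
       set_task_cpu()
         migrate_task_rq_fair()            --> detach side
       enqueue_task_fair()
         enqueue_entity()
           update_load_avg(DO_ATTACH)      --> attach side

  2. cgroup migration:
       sched_move_task()
         sched_change_group()
           task_change_group_fair()        --> detach then attach

  3. sched class switch:
       switched_from_fair()
         detach_task_cfs_rq()
       switched_to_fair()
         attach_task_cfs_rq()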

Patches 1-3 clean up the task cgroup change case by removing
cpu_cgrp_subsys->fork(), since we already do the same thing in
sched_cgroup_fork().

Patch 5/9 optimizes the CPU migration case by combining the detach into
dequeue.

Patch 6/9 fixes another detach on unattached task case, where the task has
been woken up by try_to_wake_up() but is still waiting to actually be woken
up by sched_ttwu_pending().

Patch 7/9 removes the unnecessary limitation that changing the cgroup of a
forked task which hasn't been woken up by wake_up_new_task() would fail.

Patches 8-9 optimize post_init_entity_util_avg() for fair tasks and skip
setting util_avg and runnable_avg for !fair tasks at fork time.

Thanks!


Changes in v6:
 - Use TASK_NEW to check for a newly forked task which hasn't been woken up
   by wake_up_new_task(), as suggested by Peter Zijlstra. Thanks!
 - Update comments related to post_init_entity_util_avg() in patch 8/9.

Changes in v5:
 - Don't do code movements in patch 6/9, which complicated code review,
   as suggested by Vincent. Thanks!
 - Fix a build error caused by a typo in patch 7/9.

Changes in v4:
 - Drop detach/attach_entity_cfs_rq() refactor patch in the last version.
 - Move new forked task check to task_change_group_fair().

Changes in v3:
 - One big change is that this series no longer freezes PELT sum/avg values
   to be used as initial values when re-entering fair, since those PELT
   values become much less relevant.
 - Reorder patches and collect tags from Vincent and Dietmar. Thanks!
 - Fix detach on an unattached task which has been woken up by try_to_wake_up()
   but is waiting to actually be woken up by sched_ttwu_pending().
 - Delete the TASK_NEW limitation that prevented forked tasks from changing cgroup.
 - Don't init util_avg and runnable_avg for !fair tasks at fork time.

Changes in v2:
 - split task se depth maintenance into a separate patch 3, as suggested
   by Peter.
 - reorder patches 6-7 before patches 8-9, since we need update_load_avg()
   to do conditional attach/detach to avoid corner cases like the double
   attach problem.

Chengming Zhou (9):
  sched/fair: maintain task se depth in set_task_rq()
  sched/fair: remove redundant cpu_cgrp_subsys->fork()
  sched/fair: reset sched_avg last_update_time before set_task_rq()
  sched/fair: update comments in enqueue/dequeue_entity()
  sched/fair: combine detach into dequeue when migrating task
  sched/fair: fix another detach on unattached task corner case
  sched/fair: allow changing cgroup of new forked task
  sched/fair: move task sched_avg attach to enqueue_task_fair()
  sched/fair: don't init util/runnable_avg for !fair task

 kernel/sched/core.c  |  52 ++++---------------
 kernel/sched/fair.c  | 120 ++++++++++++++++++++-----------------------
 kernel/sched/sched.h |   6 +--
 3 files changed, 66 insertions(+), 112 deletions(-)

-- 
2.37.2



* [PATCH v6 1/9] sched/fair: maintain task se depth in set_task_rq()
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
@ 2022-08-18 12:47 ` Chengming Zhou
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Maintain " tip-bot2 for Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 2/9] sched/fair: remove redundant cpu_cgrp_subsys->fork() Chengming Zhou
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:47 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

Previously we only maintained the task se depth in task_move_group_fair();
if a !fair task changed task group, its se depth would not be updated, so
commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
fixed the problem by updating se depth in switched_to_fair() too.

Then commit daa59407b558 ("sched/fair: Unify switched_{from,to}_fair()
and task_move_group_fair()") unified these two functions and moved the
se.depth setting to attach_task_cfs_rq(), which was further moved into
attach_entity_cfs_rq() by commit df217913e72e ("sched/fair: Factorize
attach/detach entity").

This patch moves task se depth maintenance from attach_entity_cfs_rq()
to set_task_rq(), which is called whenever the CPU or cgroup changes, so
the depth will always be correct.

This patch is preparation for the next patch.
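
For context, one user of se->depth is find_matching_se() on the wakeup
preemption path; a trimmed sketch of that existing helper (simplified
here, comments elided) shows why the depth value has to stay correct:

  static void find_matching_se(struct sched_entity **se, struct sched_entity **pse)
  {
          int se_depth = (*se)->depth;
          int pse_depth = (*pse)->depth;

          /* First walk both entities up to the same depth ... */
          while (se_depth > pse_depth) {
                  se_depth--;
                  *se = parent_entity(*se);
          }
          while (pse_depth > se_depth) {
                  pse_depth--;
                  *pse = parent_entity(*pse);
          }

          /* ... then up to siblings in a common cfs_rq so they can be compared. */
          while (!is_same_group(*se, *pse)) {
                  *se = parent_entity(*se);
                  *pse = parent_entity(*pse);
          }
  }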

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c  | 8 --------
 kernel/sched/sched.h | 1 +
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a71d6686149b..c5ee08b187ec 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11726,14 +11726,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-	/*
-	 * Since the real-depth could have been changed (only FAIR
-	 * class maintain depth value), reset depth properly.
-	 */
-	se->depth = se->parent ? se->parent->depth + 1 : 0;
-#endif
-
 	/* Synchronize entity with its cfs_rq */
 	update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
 	attach_entity_load_avg(cfs_rq, se);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ddcfc7837595..628ffa974123 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1932,6 +1932,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
 	p->se.cfs_rq = tg->cfs_rq[cpu];
 	p->se.parent = tg->se[cpu];
+	p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED
-- 
2.37.2



* [PATCH v6 2/9] sched/fair: remove redundant cpu_cgrp_subsys->fork()
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 1/9] sched/fair: maintain task se depth in set_task_rq() Chengming Zhou
@ 2022-08-18 12:47 ` Chengming Zhou
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Remove " tip-bot2 for Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 3/9] sched/fair: reset sched_avg last_update_time before set_task_rq() Chengming Zhou
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:47 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

We use cpu_cgrp_subsys->fork() to set the task group for the new fair
task in cgroup_post_fork().

Commit b1e8206582f9 ("sched: Fix yet more sched_fork() races") already
calls set_task_rq() for the new fair task in sched_cgroup_fork(), so
cpu_cgrp_subsys->fork() can be removed.

  cgroup_can_fork()	--> pin parent's sched_task_group
  sched_cgroup_fork()
    __set_task_cpu()
      set_task_rq()
  cgroup_post_fork()
    ss->fork() := cpu_cgroup_fork()
      sched_change_group(..., TASK_SET_GROUP)
        task_set_group_fair()
          set_task_rq()  --> can be removed

After this change, task_change_group_fair() only needs to care about
task cgroup migration, which makes the code much simpler.
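
For reference, a trimmed sketch of the sched_cgroup_fork() path added by
commit b1e8206582f9 (simplified here, unrelated details elided), which is
what makes the ->fork() callback redundant:

  void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
  {
          unsigned long flags;

          raw_spin_lock_irqsave(&p->pi_lock, flags);
  #ifdef CONFIG_CGROUP_SCHED
          /* task group comes from the fork-time css_set (kargs->cset) */
          p->sched_task_group = ...;
  #endif
          /* first CPU assignment, not a migration --> set_task_rq() */
          __set_task_cpu(p, smp_processor_id());
          if (p->sched_class->task_fork)
                  p->sched_class->task_fork(p);
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);
  }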

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/core.c  | 27 ++++-----------------------
 kernel/sched/fair.c  | 23 +----------------------
 kernel/sched/sched.h |  5 +----
 3 files changed, 6 insertions(+), 49 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 863b5203e357..8e3f1c3f0b2c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -481,8 +481,7 @@ sched_core_dequeue(struct rq *rq, struct task_struct *p, int flags) { }
  *				p->se.load, p->rt_priority,
  *				p->dl.dl_{runtime, deadline, period, flags, bw, density}
  *  - sched_setnuma():		p->numa_preferred_nid
- *  - sched_move_task()/
- *    cpu_cgroup_fork():	p->sched_task_group
+ *  - sched_move_task():	p->sched_task_group
  *  - uclamp_update_active()	p->uclamp*
  *
  * p->state <- TASK_*:
@@ -10166,7 +10165,7 @@ void sched_release_group(struct task_group *tg)
 	spin_unlock_irqrestore(&task_group_lock, flags);
 }
 
-static void sched_change_group(struct task_struct *tsk, int type)
+static void sched_change_group(struct task_struct *tsk)
 {
 	struct task_group *tg;
 
@@ -10182,7 +10181,7 @@ static void sched_change_group(struct task_struct *tsk, int type)
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	if (tsk->sched_class->task_change_group)
-		tsk->sched_class->task_change_group(tsk, type);
+		tsk->sched_class->task_change_group(tsk);
 	else
 #endif
 		set_task_rq(tsk, task_cpu(tsk));
@@ -10213,7 +10212,7 @@ void sched_move_task(struct task_struct *tsk)
 	if (running)
 		put_prev_task(rq, tsk);
 
-	sched_change_group(tsk, TASK_MOVE_GROUP);
+	sched_change_group(tsk);
 
 	if (queued)
 		enqueue_task(rq, tsk, queue_flags);
@@ -10291,23 +10290,6 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 	sched_unregister_group(tg);
 }
 
-/*
- * This is called before wake_up_new_task(), therefore we really only
- * have to set its group bits, all the other stuff does not apply.
- */
-static void cpu_cgroup_fork(struct task_struct *task)
-{
-	struct rq_flags rf;
-	struct rq *rq;
-
-	rq = task_rq_lock(task, &rf);
-
-	update_rq_clock(rq);
-	sched_change_group(task, TASK_SET_GROUP);
-
-	task_rq_unlock(rq, task, &rf);
-}
-
 static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
@@ -11173,7 +11155,6 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 	.css_released	= cpu_cgroup_css_released,
 	.css_free	= cpu_cgroup_css_free,
 	.css_extra_stat_show = cpu_extra_stat_show,
-	.fork		= cpu_cgroup_fork,
 	.can_attach	= cpu_cgroup_can_attach,
 	.attach		= cpu_cgroup_attach,
 	.legacy_cftypes	= cpu_legacy_files,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c5ee08b187ec..4b95599aa951 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11821,15 +11821,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static void task_set_group_fair(struct task_struct *p)
-{
-	struct sched_entity *se = &p->se;
-
-	set_task_rq(p, task_cpu(p));
-	se->depth = se->parent ? se->parent->depth + 1 : 0;
-}
-
-static void task_move_group_fair(struct task_struct *p)
+static void task_change_group_fair(struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
 	set_task_rq(p, task_cpu(p));
@@ -11841,19 +11833,6 @@ static void task_move_group_fair(struct task_struct *p)
 	attach_task_cfs_rq(p);
 }
 
-static void task_change_group_fair(struct task_struct *p, int type)
-{
-	switch (type) {
-	case TASK_SET_GROUP:
-		task_set_group_fair(p);
-		break;
-
-	case TASK_MOVE_GROUP:
-		task_move_group_fair(p);
-		break;
-	}
-}
-
 void free_fair_sched_group(struct task_group *tg)
 {
 	int i;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 628ffa974123..2db7b0494c19 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2195,11 +2195,8 @@ struct sched_class {
 
 	void (*update_curr)(struct rq *rq);
 
-#define TASK_SET_GROUP		0
-#define TASK_MOVE_GROUP		1
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	void (*task_change_group)(struct task_struct *p, int type);
+	void (*task_change_group)(struct task_struct *p);
 #endif
 };
 
-- 
2.37.2



* [PATCH v6 3/9] sched/fair: reset sched_avg last_update_time before set_task_rq()
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 1/9] sched/fair: maintain task se depth in set_task_rq() Chengming Zhou
  2022-08-18 12:47 ` [PATCH v6 2/9] sched/fair: remove redundant cpu_cgrp_subsys->fork() Chengming Zhou
@ 2022-08-18 12:47 ` Chengming Zhou
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Reset " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 4/9] sched/fair: update comments in enqueue/dequeue_entity() Chengming Zhou
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:47 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

set_task_rq() -> set_task_rq_fair() will try to synchronize the blocked
task's sched_avg when migrating, which is not needed for an already
detached task.

task_change_group_fair() detaches the task's sched_avg from the prev
cfs_rq first, so reset sched_avg last_update_time before set_task_rq()
to avoid that.
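
For reference, the early return in set_task_rq_fair() that makes this
ordering work (trimmed from the existing code, for illustration only):

  void set_task_rq_fair(struct sched_entity *se,
                        struct cfs_rq *prev, struct cfs_rq *next)
  {
          if (!sched_feat(ATTACH_AGE_LOAD))
                  return;

          /* Nothing to sync for an already detached (or brand new) entity. */
          if (!(se->avg.last_update_time && prev))
                  return;

          /* ... otherwise sync last_update_time from prev to next cfs_rq ... */
  }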

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4b95599aa951..5a704109472a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11824,12 +11824,12 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 static void task_change_group_fair(struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
-	set_task_rq(p, task_cpu(p));
 
 #ifdef CONFIG_SMP
 	/* Tell se's cfs_rq has been changed -- migrated */
 	p->se.avg.last_update_time = 0;
 #endif
+	set_task_rq(p, task_cpu(p));
 	attach_task_cfs_rq(p);
 }
 
-- 
2.37.2



* [PATCH v6 4/9] sched/fair: update comments in enqueue/dequeue_entity()
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (2 preceding siblings ...)
  2022-08-18 12:47 ` [PATCH v6 3/9] sched/fair: reset sched_avg last_update_time before set_task_rq() Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Update " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 5/9] sched/fair: combine detach into dequeue when migrating task Chengming Zhou
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

When reading the sched_avg related code, I found that the comments in
enqueue/dequeue_entity() are out of sync with the current code.

We don't add/subtract the entity's runnable_avg to/from cfs_rq->runnable_avg
during enqueue/dequeue_entity(); that is only done on attach/detach.

This patch updates the comments to reflect how the current code works.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5a704109472a..372e5f4a49a3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4598,7 +4598,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	/*
 	 * When enqueuing a sched_entity, we must:
 	 *   - Update loads to have both entity and cfs_rq synced with now.
-	 *   - Add its load to cfs_rq->runnable_avg
+	 *   - For group_entity, update its runnable_weight to reflect the new
+	 *     h_nr_running of its group cfs_rq.
 	 *   - For group_entity, update its weight to reflect the new share of
 	 *     its group cfs_rq
 	 *   - Add its new weight to cfs_rq->load.weight
@@ -4683,7 +4684,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	/*
 	 * When dequeuing a sched_entity, we must:
 	 *   - Update loads to have both entity and cfs_rq synced with now.
-	 *   - Subtract its load from the cfs_rq->runnable_avg.
+	 *   - For group_entity, update its runnable_weight to reflect the new
+	 *     h_nr_running of its group cfs_rq.
 	 *   - Subtract its previous weight from cfs_rq->load.weight.
 	 *   - For group entity, update its weight to reflect the new share
 	 *     of its group cfs_rq.
-- 
2.37.2



* [PATCH v6 5/9] sched/fair: combine detach into dequeue when migrating task
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (3 preceding siblings ...)
  2022-08-18 12:48 ` [PATCH v6 4/9] sched/fair: update comments in enqueue/dequeue_entity() Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Combine " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case Chengming Zhou
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

When we are migrating a task off a CPU, we can combine the detach and
the propagation into dequeue_entity() to save the detach_entity_cfs_rq()
call in migrate_task_rq_fair().

This optimization mirrors the DO_ATTACH handling in enqueue_entity() when
migrating a task to a CPU (a trimmed snippet of that enqueue-side code is
shown below for comparison). This way we don't have to traverse the CFS
tree an extra time to do detach_entity_cfs_rq() ->
propagate_entity_cfs_rq(), which is no longer called after this patch's
change.

detach_task()
  deactivate_task()
    dequeue_task_fair()
      for_each_sched_entity(se)
        dequeue_entity()
          update_load_avg() /* (1) */
            detach_entity_load_avg()

  set_task_cpu()
    migrate_task_rq_fair()
      detach_entity_cfs_rq() /* (2) */
        update_load_avg();
        detach_entity_load_avg();
        propagate_entity_cfs_rq();
          for_each_sched_entity()
            update_load_avg()

This patch saves the detach_entity_cfs_rq() call in (2) by doing
detach_entity_load_avg() for a CPU-migrating task inside (1)
(the task being the first se in the loop).
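
For comparison, the enqueue-side counterpart that this mirrors, trimmed
from the existing update_load_avg() (illustration only, see also the
context lines in the diff below):

  if (!se->avg.last_update_time && (flags & DO_ATTACH)) {
          /*
           * DO_ATTACH means we're here from enqueue_entity().
           * !last_update_time means we've passed through
           * migrate_task_rq_fair() indicating we migrated.
           */
          attach_entity_load_avg(cfs_rq, se);
          update_tg_load_avg(cfs_rq);
  }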

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 372e5f4a49a3..1eb3fb3d95c3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4167,6 +4167,7 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 #define UPDATE_TG	0x1
 #define SKIP_AGE_LOAD	0x2
 #define DO_ATTACH	0x4
+#define DO_DETACH	0x8
 
 /* Update task and its cfs_rq load average */
 static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
@@ -4196,6 +4197,13 @@ static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 		attach_entity_load_avg(cfs_rq, se);
 		update_tg_load_avg(cfs_rq);
 
+	} else if (flags & DO_DETACH) {
+		/*
+		 * DO_DETACH means we're here from dequeue_entity()
+		 * and we are migrating task out of the CPU.
+		 */
+		detach_entity_load_avg(cfs_rq, se);
+		update_tg_load_avg(cfs_rq);
 	} else if (decayed) {
 		cfs_rq_util_change(cfs_rq, 0);
 
@@ -4456,6 +4464,7 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 #define UPDATE_TG	0x0
 #define SKIP_AGE_LOAD	0x0
 #define DO_ATTACH	0x0
+#define DO_DETACH	0x0
 
 static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int not_used1)
 {
@@ -4676,6 +4685,11 @@ static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
 static void
 dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
+	int action = UPDATE_TG;
+
+	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)))
+		action |= DO_DETACH;
+
 	/*
 	 * Update run-time statistics of the 'current'.
 	 */
@@ -4690,7 +4704,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 *   - For group entity, update its weight to reflect the new share
 	 *     of its group cfs_rq.
 	 */
-	update_load_avg(cfs_rq, se, UPDATE_TG);
+	update_load_avg(cfs_rq, se, action);
 	se_update_runnable(se);
 
 	update_stats_dequeue_fair(cfs_rq, se, flags);
@@ -7242,8 +7256,6 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	return new_cpu;
 }
 
-static void detach_entity_cfs_rq(struct sched_entity *se);
-
 /*
  * Called immediately before a task is migrated to a new CPU; task_cpu(p) and
  * cfs_rq_of(p) references at time of call are still valid and identify the
@@ -7265,15 +7277,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
 	}
 
-	if (p->on_rq == TASK_ON_RQ_MIGRATING) {
-		/*
-		 * In case of TASK_ON_RQ_MIGRATING we in fact hold the 'old'
-		 * rq->lock and can modify state directly.
-		 */
-		lockdep_assert_rq_held(task_rq(p));
-		detach_entity_cfs_rq(se);
-
-	} else {
+	if (!task_on_rq_migrating(p)) {
 		remove_entity_load_avg(se);
 
 		/*
-- 
2.37.2



* [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (4 preceding siblings ...)
  2022-08-18 12:48 ` [PATCH v6 5/9] sched/fair: combine detach into dequeue when migrating task Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  7:06   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Fix " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task Chengming Zhou
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new tasks, including the detach on
unattached new task problem.

There is still another detach on unattached task problem left, for a task
which has been woken up by try_to_wake_up() and is waiting to actually
be woken up by sched_ttwu_pending().

try_to_wake_up(p)
  cpu = select_task_rq(p)
  if (task_cpu(p) != cpu)
    set_task_cpu(p, cpu)
      migrate_task_rq_fair()
        remove_entity_load_avg()       --> unattached
        se->avg.last_update_time = 0;
      __set_task_cpu()
  ttwu_queue(p, cpu)
    ttwu_queue_wakelist()
      __ttwu_queue_wakelist()

task_change_group_fair()
  detach_task_cfs_rq()
    detach_entity_cfs_rq()
      detach_entity_load_avg()   --> detach on unattached task
  set_task_rq()
  attach_task_cfs_rq()
    attach_entity_cfs_rq()
      attach_entity_load_avg()

The reason for this problem is similar: detach_entity_cfs_rq() should
check that se->avg.last_update_time != 0 before doing
detach_entity_load_avg(), otherwise it would subtract a contribution
from cfs_rq->avg that was never added.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/fair.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1eb3fb3d95c3..eba8a64f905a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11721,6 +11721,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+#ifdef CONFIG_SMP
+	/*
+	 * In case the task sched_avg hasn't been attached:
+	 * - A forked task which hasn't been woken up by wake_up_new_task().
+	 * - A task which has been woken up by try_to_wake_up() but is
+	 *   waiting for actually being woken up by sched_ttwu_pending().
+	 */
+	if (!se->avg.last_update_time)
+		return;
+#endif
+
 	/* Catch up with the cfs_rq and remove our load when we leave */
 	update_load_avg(cfs_rq, se, 0);
 	detach_entity_load_avg(cfs_rq, se);
-- 
2.37.2



* [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (5 preceding siblings ...)
  2022-08-18 12:48 ` [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  7:54   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Allow " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair() Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task Chengming Zhou
  8 siblings, 2 replies; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
introduced a TASK_NEW state and an unnecessary limitation that makes
changing the cgroup of a newly forked task fail.

That was because, at that time, we couldn't handle task_change_group_fair()
for a newly forked fair task which hadn't been woken up by
wake_up_new_task(), which would cause a detach on an unattached task
sched_avg problem.

This patch deletes this unnecessary limitation by adding a check before
doing detach or attach in task_change_group_fair().

With that, cpu_cgrp_subsys.can_attach() has nothing left to do for fair
tasks, so only define it under #ifdef CONFIG_RT_GROUP_SCHED.
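
For reference, a rough sketch (not taken from the patch) of the fork-time
ordering that makes skipping the detach/attach safe for a TASK_NEW task:

  sched_cgroup_fork()
    __set_task_cpu()
      set_task_rq()            --> group set, sched_avg not attached yet

  sched_move_task()            --> cgroup migration may happen here
    task_change_group_fair()
      return if TASK_NEW       --> nothing attached, nothing to move

  wake_up_new_task()
    __set_task_cpu()
      set_task_rq()            --> picks up the task's final task group
    post_init_entity_util_avg()
    activate_task()
      enqueue_task_fair()      --> sched_avg gets attached here
                                   (in post_init_entity_util_avg() before
                                    patch 8/9, via DO_ATTACH afterwards)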

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/core.c | 25 +++++--------------------
 kernel/sched/fair.c |  7 +++++++
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8e3f1c3f0b2c..14819bd66021 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10290,36 +10290,19 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 	sched_unregister_group(tg);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
 static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *css;
-	int ret = 0;
 
 	cgroup_taskset_for_each(task, css, tset) {
-#ifdef CONFIG_RT_GROUP_SCHED
 		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
-#endif
-		/*
-		 * Serialize against wake_up_new_task() such that if it's
-		 * running, we're sure to observe its full state.
-		 */
-		raw_spin_lock_irq(&task->pi_lock);
-		/*
-		 * Avoid calling sched_move_task() before wake_up_new_task()
-		 * has happened. This would lead to problems with PELT, due to
-		 * move wanting to detach+attach while we're not attached yet.
-		 */
-		if (READ_ONCE(task->__state) == TASK_NEW)
-			ret = -EINVAL;
-		raw_spin_unlock_irq(&task->pi_lock);
-
-		if (ret)
-			break;
 	}
-	return ret;
+	return 0;
 }
+#endif
 
 static void cpu_cgroup_attach(struct cgroup_taskset *tset)
 {
@@ -11155,7 +11138,9 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 	.css_released	= cpu_cgroup_css_released,
 	.css_free	= cpu_cgroup_css_free,
 	.css_extra_stat_show = cpu_extra_stat_show,
+#ifdef CONFIG_RT_GROUP_SCHED
 	.can_attach	= cpu_cgroup_can_attach,
+#endif
 	.attach		= cpu_cgroup_attach,
 	.legacy_cftypes	= cpu_legacy_files,
 	.dfl_cftypes	= cpu_files,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index eba8a64f905a..c319b0bd2bc1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11840,6 +11840,13 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void task_change_group_fair(struct task_struct *p)
 {
+	/*
+	 * We couldn't detach or attach a forked task which
+	 * hasn't been woken up by wake_up_new_task().
+	 */
+	if (READ_ONCE(p->__state) == TASK_NEW)
+		return;
+
 	detach_task_cfs_rq(p);
 
 #ifdef CONFIG_SMP
-- 
2.37.2



* [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair()
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (6 preceding siblings ...)
  2022-08-18 12:48 ` [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  7:48   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Move " tip-bot2 for Chengming Zhou
  2022-08-18 12:48 ` [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task Chengming Zhou
  8 siblings, 2 replies; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

In wake_up_new_task(), we use post_init_entity_util_avg() to init
util_avg/runnable_avg based on the CPU's util_avg at that time, and
attach the task's sched_avg to the cfs_rq.

Since the enqueue_task_fair() -> enqueue_entity() -> update_load_avg()
loop will do the attach, we can move this work into update_load_avg().

wake_up_new_task(p)
  post_init_entity_util_avg(p)
    attach_entity_cfs_rq()  --> (1)
  activate_task(rq, p)
    enqueue_task() := enqueue_task_fair()
      enqueue_entity() loop
        update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH)
          if (!se->avg.last_update_time && (flags & DO_ATTACH))
            attach_entity_load_avg()  --> (2)

This patch moves the attach from (1) to (2) and updates the related
comments too.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/fair.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c319b0bd2bc1..93d7c7b110dd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -799,8 +799,6 @@ void init_entity_runnable_average(struct sched_entity *se)
 	/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
 }
 
-static void attach_entity_cfs_rq(struct sched_entity *se);
-
 /*
  * With new tasks being created, their initial util_avgs are extrapolated
  * based on the cfs_rq's current util_avg:
@@ -863,8 +861,6 @@ void post_init_entity_util_avg(struct task_struct *p)
 		se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
 		return;
 	}
-
-	attach_entity_cfs_rq(se);
 }
 
 #else /* !CONFIG_SMP */
@@ -4002,8 +3998,7 @@ static void migrate_se_pelt_lag(struct sched_entity *se) {}
  * @cfs_rq: cfs_rq to update
  *
  * The cfs_rq avg is the direct sum of all its entities (blocked and runnable)
- * avg. The immediate corollary is that all (fair) tasks must be attached, see
- * post_init_entity_util_avg().
+ * avg. The immediate corollary is that all (fair) tasks must be attached.
  *
  * cfs_rq->avg is used for task_h_load() and update_cfs_share() for example.
  *
@@ -4236,8 +4231,8 @@ static void remove_entity_load_avg(struct sched_entity *se)
 
 	/*
 	 * tasks cannot exit without having gone through wake_up_new_task() ->
-	 * post_init_entity_util_avg() which will have added things to the
-	 * cfs_rq, so we can remove unconditionally.
+	 * enqueue_task_fair() which will have added things to the cfs_rq,
+	 * so we can remove unconditionally.
 	 */
 
 	sync_entity_load_avg(se);
-- 
2.37.2



* [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task
  2022-08-18 12:47 [PATCH v6 0/9] sched/fair: task load tracking optimization and cleanup Chengming Zhou
                   ` (7 preceding siblings ...)
  2022-08-18 12:48 ` [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair() Chengming Zhou
@ 2022-08-18 12:48 ` Chengming Zhou
  2022-08-23  7:49   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Don't " tip-bot2 for Chengming Zhou
  8 siblings, 2 replies; 23+ messages in thread
From: Chengming Zhou @ 2022-08-18 12:48 UTC (permalink / raw)
  To: vincent.guittot, dietmar.eggemann, mingo, peterz, rostedt,
	bsegall, vschneid
  Cc: linux-kernel, tj, Chengming Zhou

post_init_entity_util_avg() inits the task's util_avg according to the
CPU's util_avg at fork time, which will have decayed by the time
switched_to_fair() runs some time later, so we'd better not set them at
all for a !fair task.

Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/fair.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 93d7c7b110dd..621bd19e10ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -833,20 +833,6 @@ void post_init_entity_util_avg(struct task_struct *p)
 	long cpu_scale = arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq)));
 	long cap = (long)(cpu_scale - cfs_rq->avg.util_avg) / 2;
 
-	if (cap > 0) {
-		if (cfs_rq->avg.util_avg != 0) {
-			sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
-			sa->util_avg /= (cfs_rq->avg.load_avg + 1);
-
-			if (sa->util_avg > cap)
-				sa->util_avg = cap;
-		} else {
-			sa->util_avg = cap;
-		}
-	}
-
-	sa->runnable_avg = sa->util_avg;
-
 	if (p->sched_class != &fair_sched_class) {
 		/*
 		 * For !fair tasks do:
@@ -861,6 +847,20 @@ void post_init_entity_util_avg(struct task_struct *p)
 		se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
 		return;
 	}
+
+	if (cap > 0) {
+		if (cfs_rq->avg.util_avg != 0) {
+			sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
+			sa->util_avg /= (cfs_rq->avg.load_avg + 1);
+
+			if (sa->util_avg > cap)
+				sa->util_avg = cap;
+		} else {
+			sa->util_avg = cap;
+		}
+	}
+
+	sa->runnable_avg = sa->util_avg;
 }
 
 #else /* !CONFIG_SMP */
-- 
2.37.2



* Re: [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case
  2022-08-18 12:48 ` [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case Chengming Zhou
@ 2022-08-23  7:06   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Fix " tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Guittot @ 2022-08-23  7:06 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: dietmar.eggemann, mingo, peterz, rostedt, bsegall, vschneid,
	linux-kernel, tj

On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
> fixed two load tracking problems for new tasks, including the detach on
> unattached new task problem.
>
> There is still another detach on unattached task problem left, for a task
> which has been woken up by try_to_wake_up() and is waiting to actually
> be woken up by sched_ttwu_pending().
>
> try_to_wake_up(p)
>   cpu = select_task_rq(p)
>   if (task_cpu(p) != cpu)
>     set_task_cpu(p, cpu)
>       migrate_task_rq_fair()
>         remove_entity_load_avg()       --> unattached
>         se->avg.last_update_time = 0;
>       __set_task_cpu()
>   ttwu_queue(p, cpu)
>     ttwu_queue_wakelist()
>       __ttwu_queue_wakelist()
>
> task_change_group_fair()
>   detach_task_cfs_rq()
>     detach_entity_cfs_rq()
>       detach_entity_load_avg()   --> detach on unattached task
>   set_task_rq()
>   attach_task_cfs_rq()
>     attach_entity_cfs_rq()
>       attach_entity_load_avg()
>
> The reason for this problem is similar: detach_entity_cfs_rq() should
> check that se->avg.last_update_time != 0 before doing
> detach_entity_load_avg().
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1eb3fb3d95c3..eba8a64f905a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11721,6 +11721,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
>  {
>         struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> +#ifdef CONFIG_SMP
> +       /*
> +        * In case the task sched_avg hasn't been attached:
> +        * - A forked task which hasn't been woken up by wake_up_new_task().
> +        * - A task which has been woken up by try_to_wake_up() but is
> +        *   waiting for actually being woken up by sched_ttwu_pending().
> +        */
> +       if (!se->avg.last_update_time)
> +               return;
> +#endif
> +
>         /* Catch up with the cfs_rq and remove our load when we leave */
>         update_load_avg(cfs_rq, se, 0);
>         detach_entity_load_avg(cfs_rq, se);
> --
> 2.37.2
>


* Re: [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair()
  2022-08-18 12:48 ` [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair() Chengming Zhou
@ 2022-08-23  7:48   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Move " tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Guittot @ 2022-08-23  7:48 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: dietmar.eggemann, mingo, peterz, rostedt, bsegall, vschneid,
	linux-kernel, tj

On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> In wake_up_new_task(), we use post_init_entity_util_avg() to init
> util_avg/runnable_avg based on the CPU's util_avg at that time, and
> attach the task's sched_avg to the cfs_rq.
>
> Since the enqueue_task_fair() -> enqueue_entity() -> update_load_avg()
> loop will do the attach, we can move this work into update_load_avg().
>
> wake_up_new_task(p)
>   post_init_entity_util_avg(p)
>     attach_entity_cfs_rq()  --> (1)
>   activate_task(rq, p)
>     enqueue_task() := enqueue_task_fair()
>       enqueue_entity() loop
>         update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH)
>           if (!se->avg.last_update_time && (flags & DO_ATTACH))
>             attach_entity_load_avg()  --> (2)
>
> This patch moves the attach from (1) to (2) and updates the related
> comments too.
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c319b0bd2bc1..93d7c7b110dd 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -799,8 +799,6 @@ void init_entity_runnable_average(struct sched_entity *se)
>         /* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
>  }
>
> -static void attach_entity_cfs_rq(struct sched_entity *se);
> -
>  /*
>   * With new tasks being created, their initial util_avgs are extrapolated
>   * based on the cfs_rq's current util_avg:
> @@ -863,8 +861,6 @@ void post_init_entity_util_avg(struct task_struct *p)
>                 se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
>                 return;
>         }
> -
> -       attach_entity_cfs_rq(se);
>  }
>
>  #else /* !CONFIG_SMP */
> @@ -4002,8 +3998,7 @@ static void migrate_se_pelt_lag(struct sched_entity *se) {}
>   * @cfs_rq: cfs_rq to update
>   *
>   * The cfs_rq avg is the direct sum of all its entities (blocked and runnable)
> - * avg. The immediate corollary is that all (fair) tasks must be attached, see
> - * post_init_entity_util_avg().
> + * avg. The immediate corollary is that all (fair) tasks must be attached.
>   *
>   * cfs_rq->avg is used for task_h_load() and update_cfs_share() for example.
>   *
> @@ -4236,8 +4231,8 @@ static void remove_entity_load_avg(struct sched_entity *se)
>
>         /*
>          * tasks cannot exit without having gone through wake_up_new_task() ->
> -        * post_init_entity_util_avg() which will have added things to the
> -        * cfs_rq, so we can remove unconditionally.
> +        * enqueue_task_fair() which will have added things to the cfs_rq,
> +        * so we can remove unconditionally.
>          */
>
>         sync_entity_load_avg(se);
> --
> 2.37.2
>


* Re: [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task
  2022-08-18 12:48 ` [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task Chengming Zhou
@ 2022-08-23  7:49   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Don't " tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Guittot @ 2022-08-23  7:49 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: dietmar.eggemann, mingo, peterz, rostedt, bsegall, vschneid,
	linux-kernel, tj

On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> post_init_entity_util_avg() inits the task's util_avg according to the
> CPU's util_avg at fork time, which will have decayed by the time
> switched_to_fair() runs some time later, so we'd better not set them at
> all for a !fair task.
>
> Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 93d7c7b110dd..621bd19e10ae 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -833,20 +833,6 @@ void post_init_entity_util_avg(struct task_struct *p)
>         long cpu_scale = arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq)));
>         long cap = (long)(cpu_scale - cfs_rq->avg.util_avg) / 2;
>
> -       if (cap > 0) {
> -               if (cfs_rq->avg.util_avg != 0) {
> -                       sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
> -                       sa->util_avg /= (cfs_rq->avg.load_avg + 1);
> -
> -                       if (sa->util_avg > cap)
> -                               sa->util_avg = cap;
> -               } else {
> -                       sa->util_avg = cap;
> -               }
> -       }
> -
> -       sa->runnable_avg = sa->util_avg;
> -
>         if (p->sched_class != &fair_sched_class) {
>                 /*
>                  * For !fair tasks do:
> @@ -861,6 +847,20 @@ void post_init_entity_util_avg(struct task_struct *p)
>                 se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
>                 return;
>         }
> +
> +       if (cap > 0) {
> +               if (cfs_rq->avg.util_avg != 0) {
> +                       sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
> +                       sa->util_avg /= (cfs_rq->avg.load_avg + 1);
> +
> +                       if (sa->util_avg > cap)
> +                               sa->util_avg = cap;
> +               } else {
> +                       sa->util_avg = cap;
> +               }
> +       }
> +
> +       sa->runnable_avg = sa->util_avg;
>  }
>
>  #else /* !CONFIG_SMP */
> --
> 2.37.2
>


* Re: [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task
  2022-08-18 12:48 ` [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task Chengming Zhou
@ 2022-08-23  7:54   ` Vincent Guittot
  2022-08-23  9:27   ` [tip: sched/core] sched/fair: Allow " tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: Vincent Guittot @ 2022-08-23  7:54 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: dietmar.eggemann, mingo, peterz, rostedt, bsegall, vschneid,
	linux-kernel, tj

On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
> introduced a TASK_NEW state and an unnecessary limitation that makes
> changing the cgroup of a newly forked task fail.
>
> That was because, at that time, we couldn't handle task_change_group_fair()
> for a newly forked fair task which hadn't been woken up by
> wake_up_new_task(), which would cause a detach on an unattached task
> sched_avg problem.
>
> This patch deletes this unnecessary limitation by adding a check before
> doing detach or attach in task_change_group_fair().
>
> With that, cpu_cgrp_subsys.can_attach() has nothing left to do for fair
> tasks, so only define it under #ifdef CONFIG_RT_GROUP_SCHED.
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/core.c | 25 +++++--------------------
>  kernel/sched/fair.c |  7 +++++++
>  2 files changed, 12 insertions(+), 20 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8e3f1c3f0b2c..14819bd66021 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -10290,36 +10290,19 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
>         sched_unregister_group(tg);
>  }
>
> +#ifdef CONFIG_RT_GROUP_SCHED
>  static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
>  {
>         struct task_struct *task;
>         struct cgroup_subsys_state *css;
> -       int ret = 0;
>
>         cgroup_taskset_for_each(task, css, tset) {
> -#ifdef CONFIG_RT_GROUP_SCHED
>                 if (!sched_rt_can_attach(css_tg(css), task))
>                         return -EINVAL;
> -#endif
> -               /*
> -                * Serialize against wake_up_new_task() such that if it's
> -                * running, we're sure to observe its full state.
> -                */
> -               raw_spin_lock_irq(&task->pi_lock);
> -               /*
> -                * Avoid calling sched_move_task() before wake_up_new_task()
> -                * has happened. This would lead to problems with PELT, due to
> -                * move wanting to detach+attach while we're not attached yet.
> -                */
> -               if (READ_ONCE(task->__state) == TASK_NEW)
> -                       ret = -EINVAL;
> -               raw_spin_unlock_irq(&task->pi_lock);
> -
> -               if (ret)
> -                       break;
>         }
> -       return ret;
> +       return 0;
>  }
> +#endif
>
>  static void cpu_cgroup_attach(struct cgroup_taskset *tset)
>  {
> @@ -11155,7 +11138,9 @@ struct cgroup_subsys cpu_cgrp_subsys = {
>         .css_released   = cpu_cgroup_css_released,
>         .css_free       = cpu_cgroup_css_free,
>         .css_extra_stat_show = cpu_extra_stat_show,
> +#ifdef CONFIG_RT_GROUP_SCHED
>         .can_attach     = cpu_cgroup_can_attach,
> +#endif
>         .attach         = cpu_cgroup_attach,
>         .legacy_cftypes = cpu_legacy_files,
>         .dfl_cftypes    = cpu_files,
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index eba8a64f905a..c319b0bd2bc1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11840,6 +11840,13 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  static void task_change_group_fair(struct task_struct *p)
>  {
> +       /*
> +        * We couldn't detach or attach a forked task which
> +        * hasn't been woken up by wake_up_new_task().
> +        */
> +       if (READ_ONCE(p->__state) == TASK_NEW)
> +               return;
> +
>         detach_task_cfs_rq(p);
>
>  #ifdef CONFIG_SMP
> --
> 2.37.2
>


* [tip: sched/core] sched/fair: Don't init util/runnable_avg for !fair task
  2022-08-18 12:48 ` [PATCH v6 9/9] sched/fair: don't init util/runnable_avg for !fair task Chengming Zhou
  2022-08-23  7:49   ` Vincent Guittot
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Vincent Guittot, Chengming Zhou, Peter Zijlstra (Intel),
	x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e4fe074d6c359c19b74564fa1364fe48343cfa5d
Gitweb:        https://git.kernel.org/tip/e4fe074d6c359c19b74564fa1364fe48343cfa5d
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:05 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:20 +02:00

sched/fair: Don't init util/runnable_avg for !fair task

post_init_entity_util_avg() inits the task's util_avg according to the
CPU's util_avg at fork time, which will have decayed by the time
switched_to_fair() runs some time later, so we'd better not set them at
all for a !fair task.

Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-10-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ef325b5..e8c1b88 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -833,20 +833,6 @@ void post_init_entity_util_avg(struct task_struct *p)
 	long cpu_scale = arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq)));
 	long cap = (long)(cpu_scale - cfs_rq->avg.util_avg) / 2;
 
-	if (cap > 0) {
-		if (cfs_rq->avg.util_avg != 0) {
-			sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
-			sa->util_avg /= (cfs_rq->avg.load_avg + 1);
-
-			if (sa->util_avg > cap)
-				sa->util_avg = cap;
-		} else {
-			sa->util_avg = cap;
-		}
-	}
-
-	sa->runnable_avg = sa->util_avg;
-
 	if (p->sched_class != &fair_sched_class) {
 		/*
 		 * For !fair tasks do:
@@ -861,6 +847,20 @@ void post_init_entity_util_avg(struct task_struct *p)
 		se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
 		return;
 	}
+
+	if (cap > 0) {
+		if (cfs_rq->avg.util_avg != 0) {
+			sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
+			sa->util_avg /= (cfs_rq->avg.load_avg + 1);
+
+			if (sa->util_avg > cap)
+				sa->util_avg = cap;
+		} else {
+			sa->util_avg = cap;
+		}
+	}
+
+	sa->runnable_avg = sa->util_avg;
 }
 
 #else /* !CONFIG_SMP */


* [tip: sched/core] sched/fair: Move task sched_avg attach to enqueue_task_fair()
  2022-08-18 12:48 ` [PATCH v6 8/9] sched/fair: move task sched_avg attach to enqueue_task_fair() Chengming Zhou
  2022-08-23  7:48   ` Vincent Guittot
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     d6531ab6e50149ab2a144b0f4787cb9277d0893f
Gitweb:        https://git.kernel.org/tip/d6531ab6e50149ab2a144b0f4787cb9277d0893f
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:04 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:19 +02:00

sched/fair: Move task sched_avg attach to enqueue_task_fair()

In wake_up_new_task(), we use post_init_entity_util_avg() to init
util_avg/runnable_avg based on the CPU's util_avg at that time, and
attach the task's sched_avg to the cfs_rq.

Since the enqueue_task_fair() -> enqueue_entity() -> update_load_avg()
loop will do the attach, we can move this work into update_load_avg().

wake_up_new_task(p)
  post_init_entity_util_avg(p)
    attach_entity_cfs_rq()  --> (1)
  activate_task(rq, p)
    enqueue_task() := enqueue_task_fair()
      enqueue_entity() loop
        update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH)
          if (!se->avg.last_update_time && (flags & DO_ATTACH))
            attach_entity_load_avg()  --> (2)

This patch moves the attach from (1) to (2) and updates the related
comments too.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-9-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fd1aa4c..ef325b5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -799,8 +799,6 @@ void init_entity_runnable_average(struct sched_entity *se)
 	/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
 }
 
-static void attach_entity_cfs_rq(struct sched_entity *se);
-
 /*
  * With new tasks being created, their initial util_avgs are extrapolated
  * based on the cfs_rq's current util_avg:
@@ -863,8 +861,6 @@ void post_init_entity_util_avg(struct task_struct *p)
 		se->avg.last_update_time = cfs_rq_clock_pelt(cfs_rq);
 		return;
 	}
-
-	attach_entity_cfs_rq(se);
 }
 
 #else /* !CONFIG_SMP */
@@ -3838,8 +3834,7 @@ static void migrate_se_pelt_lag(struct sched_entity *se) {}
  * @cfs_rq: cfs_rq to update
  *
  * The cfs_rq avg is the direct sum of all its entities (blocked and runnable)
- * avg. The immediate corollary is that all (fair) tasks must be attached, see
- * post_init_entity_util_avg().
+ * avg. The immediate corollary is that all (fair) tasks must be attached.
  *
  * cfs_rq->avg is used for task_h_load() and update_cfs_share() for example.
  *
@@ -4072,8 +4067,8 @@ static void remove_entity_load_avg(struct sched_entity *se)
 
 	/*
 	 * tasks cannot exit without having gone through wake_up_new_task() ->
-	 * post_init_entity_util_avg() which will have added things to the
-	 * cfs_rq, so we can remove unconditionally.
+	 * enqueue_task_fair() which will have added things to the cfs_rq,
+	 * so we can remove unconditionally.
 	 */
 
 	sync_entity_load_avg(se);


* [tip: sched/core] sched/fair: Allow changing cgroup of new forked task
  2022-08-18 12:48 ` [PATCH v6 7/9] sched/fair: allow changing cgroup of new forked task Chengming Zhou
  2022-08-23  7:54   ` Vincent Guittot
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     df16b71c686cb096774e30153c9ce6756450796c
Gitweb:        https://git.kernel.org/tip/df16b71c686cb096774e30153c9ce6756450796c
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:03 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:19 +02:00

sched/fair: Allow changing cgroup of new forked task

commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
introduced a TASK_NEW state and an unnecessary limitation that makes
changing the cgroup of a newly forked task fail.

That was because, at that time, we couldn't handle task_change_group_fair()
for a newly forked fair task which hadn't been woken up by
wake_up_new_task(), which would cause a detach on an unattached task
sched_avg problem.

This patch deletes this unnecessary limitation by adding a check before
doing detach or attach in task_change_group_fair().

With that, cpu_cgrp_subsys.can_attach() has nothing left to do for fair
tasks, so only define it under #ifdef CONFIG_RT_GROUP_SCHED.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-8-zhouchengming@bytedance.com
---
 kernel/sched/core.c | 25 +++++--------------------
 kernel/sched/fair.c |  7 +++++++
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e74e79f..603a80e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10238,36 +10238,19 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 	sched_unregister_group(tg);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
 static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *css;
-	int ret = 0;
 
 	cgroup_taskset_for_each(task, css, tset) {
-#ifdef CONFIG_RT_GROUP_SCHED
 		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
-#endif
-		/*
-		 * Serialize against wake_up_new_task() such that if it's
-		 * running, we're sure to observe its full state.
-		 */
-		raw_spin_lock_irq(&task->pi_lock);
-		/*
-		 * Avoid calling sched_move_task() before wake_up_new_task()
-		 * has happened. This would lead to problems with PELT, due to
-		 * move wanting to detach+attach while we're not attached yet.
-		 */
-		if (READ_ONCE(task->__state) == TASK_NEW)
-			ret = -EINVAL;
-		raw_spin_unlock_irq(&task->pi_lock);
-
-		if (ret)
-			break;
 	}
-	return ret;
+	return 0;
 }
+#endif
 
 static void cpu_cgroup_attach(struct cgroup_taskset *tset)
 {
@@ -11103,7 +11086,9 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 	.css_released	= cpu_cgroup_css_released,
 	.css_free	= cpu_cgroup_css_free,
 	.css_extra_stat_show = cpu_extra_stat_show,
+#ifdef CONFIG_RT_GROUP_SCHED
 	.can_attach	= cpu_cgroup_can_attach,
+#endif
 	.attach		= cpu_cgroup_attach,
 	.legacy_cftypes	= cpu_legacy_files,
 	.dfl_cftypes	= cpu_files,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e92bc05..fd1aa4c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11676,6 +11676,13 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void task_change_group_fair(struct task_struct *p)
 {
+	/*
+	 * We couldn't detach or attach a forked task which
+	 * hasn't been woken up by wake_up_new_task().
+	 */
+	if (READ_ONCE(p->__state) == TASK_NEW)
+		return;
+
 	detach_task_cfs_rq(p);
 
 #ifdef CONFIG_SMP


* [tip: sched/core] sched/fair: Fix another detach on unattached task corner case
  2022-08-18 12:48 ` [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case Chengming Zhou
  2022-08-23  7:06   ` Vincent Guittot
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     7e2edaf61814fb6aa363989d718950c023b882d4
Gitweb:        https://git.kernel.org/tip/7e2edaf61814fb6aa363989d718950c023b882d4
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:02 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:19 +02:00

sched/fair: Fix another detach on unattached task corner case

commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new tasks, including the detach on
unattached new task problem.

There is still another detach on unattached task problem left, for a
task which has been woken up by try_to_wake_up() and is waiting to
actually be woken up by sched_ttwu_pending().

try_to_wake_up(p)
  cpu = select_task_rq(p)
  if (task_cpu(p) != cpu)
    set_task_cpu(p, cpu)
      migrate_task_rq_fair()
        remove_entity_load_avg()       --> unattached
        se->avg.last_update_time = 0;
      __set_task_cpu()
  ttwu_queue(p, cpu)
    ttwu_queue_wakelist()
      __ttwu_queue_wakelist()

task_change_group_fair()
  detach_task_cfs_rq()
    detach_entity_cfs_rq()
      detach_entity_load_avg()   --> detach on unattached task
  set_task_rq()
  attach_task_cfs_rq()
    attach_entity_cfs_rq()
      attach_entity_load_avg()

The reason for this problem is similar: we should check in
detach_entity_cfs_rq() that se->avg.last_update_time != 0 before doing
detach_entity_load_avg().
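
As a rough, self-contained illustration of the guard only (toy names and
a toy load model, not the kernel code), treating a zero last_update_time
as "sched_avg never attached" turns a stray detach into a no-op instead
of corrupting the cfs_rq sums:

/* toy model of the "detach only if attached" guard; build: gcc -o guard guard.c */
#include <assert.h>
#include <stdio.h>

struct toy_se {
	unsigned long long last_update_time;	/* 0 means "sched_avg not attached" */
	long load;
};

struct toy_cfs_rq {
	long load_sum;
};

static void toy_attach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	cfs_rq->load_sum += se->load;
	se->last_update_time = 1;		/* any non-zero value: attached */
}

static void toy_detach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	if (!se->last_update_time)		/* the guard: never attached, nothing to undo */
		return;
	cfs_rq->load_sum -= se->load;
	se->last_update_time = 0;
}

int main(void)
{
	struct toy_cfs_rq cfs_rq = { 0 };
	struct toy_se se = { .last_update_time = 0, .load = 100 };

	toy_detach(&cfs_rq, &se);		/* detach on an unattached task: no-op */
	assert(cfs_rq.load_sum == 0);		/* without the guard this would go negative */

	toy_attach(&cfs_rq, &se);
	toy_detach(&cfs_rq, &se);
	printf("load_sum = %ld\n", cfs_rq.load_sum);	/* 0 */
	return 0;
}

The actual fix below does the equivalent check at the top of
detach_entity_cfs_rq().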

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-7-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f52e7dc..e92bc05 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11557,6 +11557,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+#ifdef CONFIG_SMP
+	/*
+	 * In case the task sched_avg hasn't been attached:
+	 * - A forked task which hasn't been woken up by wake_up_new_task().
+	 * - A task which has been woken up by try_to_wake_up() but is
+	 *   waiting for actually being woken up by sched_ttwu_pending().
+	 */
+	if (!se->avg.last_update_time)
+		return;
+#endif
+
 	/* Catch up with the cfs_rq and remove our load when we leave */
 	update_load_avg(cfs_rq, se, 0);
 	detach_entity_load_avg(cfs_rq, se);


* [tip: sched/core] sched/fair: Combine detach into dequeue when migrating task
  2022-08-18 12:48 ` [PATCH v6 5/9] sched/fair: combine detach into dequeue when migrating task Chengming Zhou
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e1f078f50478a51849453341e7356cb298df00cf
Gitweb:        https://git.kernel.org/tip/e1f078f50478a51849453341e7356cb298df00cf
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:01 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:18 +02:00

sched/fair: Combine detach into dequeue when migrating task

When we are migrating a task off the CPU, we can combine the detach and
propagation into dequeue_entity() to save the detach_entity_cfs_rq()
call in migrate_task_rq_fair().

This optimization is like combining DO_ATTACH into enqueue_entity() when
migrating a task to the CPU. So we don't have to traverse the CFS tree
an extra time to do the detach_entity_cfs_rq() ->
propagate_entity_cfs_rq() chain, which won't be called anymore with this
patch's change.

detach_task()
  deactivate_task()
    dequeue_task_fair()
      for_each_sched_entity(se)
        dequeue_entity()
          update_load_avg() /* (1) */
            detach_entity_load_avg()

  set_task_cpu()
    migrate_task_rq_fair()
      detach_entity_cfs_rq() /* (2) */
        update_load_avg();
        detach_entity_load_avg();
        propagate_entity_cfs_rq();
          for_each_sched_entity()
            update_load_avg()

This patch saves the detach_entity_cfs_rq() call in (2) by doing the
detach_entity_load_avg() for a CPU-migrating task inside (1) (the task
being the first se in the loop).
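
Purely as a sketch of the flag-folding pattern (toy names and flag
values chosen for illustration, not the actual scheduler code), the
dequeue path can request the detach in the same update call it already
makes, so no second pass is needed:

/* toy model of folding the detach into the dequeue update; build: gcc -o fold fold.c */
#include <stdbool.h>
#include <stdio.h>

#define UPDATE_TG	0x1
#define DO_DETACH	0x8

static int passes;	/* how many times we touch the entity's load tracking */

/* stands in for update_load_avg(): one pass, optionally detaching as well */
static void toy_update_load_avg(int flags)
{
	passes++;
	if (flags & DO_DETACH)
		printf("detached load during the dequeue pass\n");
}

/* stands in for dequeue_entity() of the task's se */
static void toy_dequeue_entity(bool task_migrating)
{
	int action = UPDATE_TG;

	if (task_migrating)
		action |= DO_DETACH;	/* no separate detach_entity_cfs_rq() pass later */

	toy_update_load_avg(action);
}

int main(void)
{
	toy_dequeue_entity(true);	/* task being migrated off this CPU */
	printf("load-tracking passes: %d\n", passes);	/* prints 1, not 2 */
	return 0;
}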

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-6-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 52de830..f52e7dc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4003,6 +4003,7 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 #define UPDATE_TG	0x1
 #define SKIP_AGE_LOAD	0x2
 #define DO_ATTACH	0x4
+#define DO_DETACH	0x8
 
 /* Update task and its cfs_rq load average */
 static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
@@ -4032,6 +4033,13 @@ static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 		attach_entity_load_avg(cfs_rq, se);
 		update_tg_load_avg(cfs_rq);
 
+	} else if (flags & DO_DETACH) {
+		/*
+		 * DO_DETACH means we're here from dequeue_entity()
+		 * and we are migrating task out of the CPU.
+		 */
+		detach_entity_load_avg(cfs_rq, se);
+		update_tg_load_avg(cfs_rq);
 	} else if (decayed) {
 		cfs_rq_util_change(cfs_rq, 0);
 
@@ -4292,6 +4300,7 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 #define UPDATE_TG	0x0
 #define SKIP_AGE_LOAD	0x0
 #define DO_ATTACH	0x0
+#define DO_DETACH	0x0
 
 static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int not_used1)
 {
@@ -4512,6 +4521,11 @@ static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
 static void
 dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
+	int action = UPDATE_TG;
+
+	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)))
+		action |= DO_DETACH;
+
 	/*
 	 * Update run-time statistics of the 'current'.
 	 */
@@ -4526,7 +4540,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 *   - For group entity, update its weight to reflect the new share
 	 *     of its group cfs_rq.
 	 */
-	update_load_avg(cfs_rq, se, UPDATE_TG);
+	update_load_avg(cfs_rq, se, action);
 	se_update_runnable(se);
 
 	update_stats_dequeue_fair(cfs_rq, se, flags);
@@ -7078,8 +7092,6 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	return new_cpu;
 }
 
-static void detach_entity_cfs_rq(struct sched_entity *se);
-
 /*
  * Called immediately before a task is migrated to a new CPU; task_cpu(p) and
  * cfs_rq_of(p) references at time of call are still valid and identify the
@@ -7101,15 +7113,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
 	}
 
-	if (p->on_rq == TASK_ON_RQ_MIGRATING) {
-		/*
-		 * In case of TASK_ON_RQ_MIGRATING we in fact hold the 'old'
-		 * rq->lock and can modify state directly.
-		 */
-		lockdep_assert_rq_held(task_rq(p));
-		detach_entity_cfs_rq(se);
-
-	} else {
+	if (!task_on_rq_migrating(p)) {
 		remove_entity_load_avg(se);
 
 		/*


* [tip: sched/core] sched/fair: Update comments in enqueue/dequeue_entity()
  2022-08-18 12:48 ` [PATCH v6 4/9] sched/fair: update comments in enqueue/dequeue_entity() Chengming Zhou
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     859f206290f345c151a6005de639ba9677bf3e18
Gitweb:        https://git.kernel.org/tip/859f206290f345c151a6005de639ba9677bf3e18
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:00 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:18 +02:00

sched/fair: Update comments in enqueue/dequeue_entity()

When reading the sched_avg related code, I found the comments in
enqueue/dequeue_entity() have not been updated to match the current code.

We don't add/subtract an entity's runnable_avg to/from
cfs_rq->runnable_avg during enqueue/dequeue_entity(); that is done only
on attach/detach.

This patch updates the comments to reflect how the current code works.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-5-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e4c0929..52de830 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4434,7 +4434,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	/*
 	 * When enqueuing a sched_entity, we must:
 	 *   - Update loads to have both entity and cfs_rq synced with now.
-	 *   - Add its load to cfs_rq->runnable_avg
+	 *   - For group_entity, update its runnable_weight to reflect the new
+	 *     h_nr_running of its group cfs_rq.
 	 *   - For group_entity, update its weight to reflect the new share of
 	 *     its group cfs_rq
 	 *   - Add its new weight to cfs_rq->load.weight
@@ -4519,7 +4520,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	/*
 	 * When dequeuing a sched_entity, we must:
 	 *   - Update loads to have both entity and cfs_rq synced with now.
-	 *   - Subtract its load from the cfs_rq->runnable_avg.
+	 *   - For group_entity, update its runnable_weight to reflect the new
+	 *     h_nr_running of its group cfs_rq.
 	 *   - Subtract its previous weight from cfs_rq->load.weight.
 	 *   - For group entity, update its weight to reflect the new share
 	 *     of its group cfs_rq.


* [tip: sched/core] sched/fair: Reset sched_avg last_update_time before set_task_rq()
  2022-08-18 12:47 ` [PATCH v6 3/9] sched/fair: reset sched_avg last_update_time before set_task_rq() Chengming Zhou
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Dietmar Eggemann, Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     5d6da83c44af70ede7bfd0fd6d1ef8a3b3e0402c
Gitweb:        https://git.kernel.org/tip/5d6da83c44af70ede7bfd0fd6d1ef8a3b3e0402c
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:47:59 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:18 +02:00

sched/fair: Reset sched_avg last_update_time before set_task_rq()

set_task_rq() -> set_task_rq_fair() will try to synchronize the blocked
task's sched_avg on migration, which is not needed for an already
detached task.

task_change_group_fair() detaches the task's sched_avg from the previous
cfs_rq first, so reset sched_avg last_update_time before set_task_rq()
to avoid that.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-4-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2c0eb2a..e4c0929 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11660,12 +11660,12 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 static void task_change_group_fair(struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
-	set_task_rq(p, task_cpu(p));
 
 #ifdef CONFIG_SMP
 	/* Tell se's cfs_rq has been changed -- migrated */
 	p->se.avg.last_update_time = 0;
 #endif
+	set_task_rq(p, task_cpu(p));
 	attach_task_cfs_rq(p);
 }
 


* [tip: sched/core] sched/fair: Remove redundant cpu_cgrp_subsys->fork()
  2022-08-18 12:47 ` [PATCH v6 2/9] sched/fair: remove redundant cpu_cgrp_subsys->fork() Chengming Zhou
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Vincent Guittot, Dietmar Eggemann, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     39c4261191bf05e7eb310f852980a6d0afe5582a
Gitweb:        https://git.kernel.org/tip/39c4261191bf05e7eb310f852980a6d0afe5582a
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:47:58 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:17 +02:00

sched/fair: Remove redundant cpu_cgrp_subsys->fork()

We use cpu_cgrp_subsys->fork() to set the task group for a new fair task
in cgroup_post_fork().

Since commit b1e8206582f9 ("sched: Fix yet more sched_fork() races")
already does set_task_rq() for the new fair task in sched_cgroup_fork(),
cpu_cgrp_subsys->fork() can be removed.

  cgroup_can_fork()	--> pin parent's sched_task_group
  sched_cgroup_fork()
    __set_task_cpu()
      set_task_rq()
  cgroup_post_fork()
    ss->fork() := cpu_cgroup_fork()
      sched_change_group(..., TASK_SET_GROUP)
        task_set_group_fair()
          set_task_rq()  --> can be removed

After this patch's change, task_change_group_fair() only needs to care
about task cgroup migration, making the code much simpler.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20220818124805.601-3-zhouchengming@bytedance.com
---
 kernel/sched/core.c  | 27 ++++-----------------------
 kernel/sched/fair.c  | 23 +----------------------
 kernel/sched/sched.h |  5 +----
 3 files changed, 6 insertions(+), 49 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 64c0899..e74e79f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -481,8 +481,7 @@ sched_core_dequeue(struct rq *rq, struct task_struct *p, int flags) { }
  *				p->se.load, p->rt_priority,
  *				p->dl.dl_{runtime, deadline, period, flags, bw, density}
  *  - sched_setnuma():		p->numa_preferred_nid
- *  - sched_move_task()/
- *    cpu_cgroup_fork():	p->sched_task_group
+ *  - sched_move_task():	p->sched_task_group
  *  - uclamp_update_active()	p->uclamp*
  *
  * p->state <- TASK_*:
@@ -10114,7 +10113,7 @@ void sched_release_group(struct task_group *tg)
 	spin_unlock_irqrestore(&task_group_lock, flags);
 }
 
-static void sched_change_group(struct task_struct *tsk, int type)
+static void sched_change_group(struct task_struct *tsk)
 {
 	struct task_group *tg;
 
@@ -10130,7 +10129,7 @@ static void sched_change_group(struct task_struct *tsk, int type)
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	if (tsk->sched_class->task_change_group)
-		tsk->sched_class->task_change_group(tsk, type);
+		tsk->sched_class->task_change_group(tsk);
 	else
 #endif
 		set_task_rq(tsk, task_cpu(tsk));
@@ -10161,7 +10160,7 @@ void sched_move_task(struct task_struct *tsk)
 	if (running)
 		put_prev_task(rq, tsk);
 
-	sched_change_group(tsk, TASK_MOVE_GROUP);
+	sched_change_group(tsk);
 
 	if (queued)
 		enqueue_task(rq, tsk, queue_flags);
@@ -10239,23 +10238,6 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 	sched_unregister_group(tg);
 }
 
-/*
- * This is called before wake_up_new_task(), therefore we really only
- * have to set its group bits, all the other stuff does not apply.
- */
-static void cpu_cgroup_fork(struct task_struct *task)
-{
-	struct rq_flags rf;
-	struct rq *rq;
-
-	rq = task_rq_lock(task, &rf);
-
-	update_rq_clock(rq);
-	sched_change_group(task, TASK_SET_GROUP);
-
-	task_rq_unlock(rq, task, &rf);
-}
-
 static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
@@ -11121,7 +11103,6 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 	.css_released	= cpu_cgroup_css_released,
 	.css_free	= cpu_cgroup_css_free,
 	.css_extra_stat_show = cpu_extra_stat_show,
-	.fork		= cpu_cgroup_fork,
 	.can_attach	= cpu_cgroup_can_attach,
 	.attach		= cpu_cgroup_attach,
 	.legacy_cftypes	= cpu_legacy_files,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a3b0f8b..2c0eb2a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11657,15 +11657,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static void task_set_group_fair(struct task_struct *p)
-{
-	struct sched_entity *se = &p->se;
-
-	set_task_rq(p, task_cpu(p));
-	se->depth = se->parent ? se->parent->depth + 1 : 0;
-}
-
-static void task_move_group_fair(struct task_struct *p)
+static void task_change_group_fair(struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
 	set_task_rq(p, task_cpu(p));
@@ -11677,19 +11669,6 @@ static void task_move_group_fair(struct task_struct *p)
 	attach_task_cfs_rq(p);
 }
 
-static void task_change_group_fair(struct task_struct *p, int type)
-{
-	switch (type) {
-	case TASK_SET_GROUP:
-		task_set_group_fair(p);
-		break;
-
-	case TASK_MOVE_GROUP:
-		task_move_group_fair(p);
-		break;
-	}
-}
-
 void free_fair_sched_group(struct task_group *tg)
 {
 	int i;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4c48221..74130a6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2193,11 +2193,8 @@ struct sched_class {
 
 	void (*update_curr)(struct rq *rq);
 
-#define TASK_SET_GROUP		0
-#define TASK_MOVE_GROUP		1
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	void (*task_change_group)(struct task_struct *p, int type);
+	void (*task_change_group)(struct task_struct *p);
 #endif
 };
 


* [tip: sched/core] sched/fair: Maintain task se depth in set_task_rq()
  2022-08-18 12:47 ` [PATCH v6 1/9] sched/fair: maintain task se depth in set_task_rq() Chengming Zhou
@ 2022-08-23  9:27   ` tip-bot2 for Chengming Zhou
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Chengming Zhou @ 2022-08-23  9:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chengming Zhou, Peter Zijlstra (Intel),
	Dietmar Eggemann, Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     78b6b15770618efb60d84e2d605f6b93dc94051b
Gitweb:        https://git.kernel.org/tip/78b6b15770618efb60d84e2d605f6b93dc94051b
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:47:57 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:17 +02:00

sched/fair: Maintain task se depth in set_task_rq()

Previously we only maintained task se depth in task_move_group_fair();
if a !fair task changed task group, its se depth would not be updated,
so commit eb7a59b2c888 ("sched/fair: Reset se-depth when task switched to FAIR")
fixed the problem by updating se depth in switched_to_fair() too.

Then commit daa59407b558 ("sched/fair: Unify switched_{from,to}_fair()
and task_move_group_fair()") unified these two functions and moved the
se.depth setting to attach_task_cfs_rq(), which moved further into
attach_entity_cfs_rq() with commit df217913e72e ("sched/fair: Factorize
attach/detach entity").

This patch moves task se depth maintenance from attach_entity_cfs_rq()
to set_task_rq(), which is called whenever the CPU or cgroup changes, so
the depth will always be correct.

This patch is preparation for the next patch.
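
For illustration only (toy types, not the kernel's sched_entity), the
invariant that set_task_rq() now maintains on every CPU/cgroup change is
simply depth = parent's depth + 1, or 0 when there is no parent entity:

/* toy model of the se->depth invariant; build: gcc -o depth depth.c */
#include <assert.h>
#include <stddef.h>

struct toy_se {
	struct toy_se *parent;
	int depth;
};

/* loosely what set_task_rq() now does for the task's se on every change */
static void toy_set_parent(struct toy_se *se, struct toy_se *new_parent)
{
	se->parent = new_parent;
	se->depth = new_parent ? new_parent->depth + 1 : 0;
}

int main(void)
{
	struct toy_se group_a = { .parent = NULL, .depth = 0 };
	struct toy_se group_b = { .parent = &group_a, .depth = 1 };
	struct toy_se task_se;

	toy_set_parent(&task_se, &group_b);	/* task placed in the nested group */
	assert(task_se.depth == 2);

	toy_set_parent(&task_se, NULL);		/* task moved under no parent entity */
	assert(task_se.depth == 0);
	return 0;
}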

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-2-zhouchengming@bytedance.com
---
 kernel/sched/fair.c  | 8 --------
 kernel/sched/sched.h | 1 +
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index da38865..a3b0f8b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11562,14 +11562,6 @@ static void attach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-	/*
-	 * Since the real-depth could have been changed (only FAIR
-	 * class maintain depth value), reset depth properly.
-	 */
-	se->depth = se->parent ? se->parent->depth + 1 : 0;
-#endif
-
 	/* Synchronize entity with its cfs_rq */
 	update_load_avg(cfs_rq, se, sched_feat(ATTACH_AGE_LOAD) ? 0 : SKIP_AGE_LOAD);
 	attach_entity_load_avg(cfs_rq, se);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3ccd35c..4c48221 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1930,6 +1930,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	set_task_rq_fair(&p->se, p->se.cfs_rq, tg->cfs_rq[cpu]);
 	p->se.cfs_rq = tg->cfs_rq[cpu];
 	p->se.parent = tg->se[cpu];
+	p->se.depth = tg->se[cpu] ? tg->se[cpu]->depth + 1 : 0;
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED

