* [PATCH 0/2] schd/fair: fix stalled cfs_rq->tg_load_avg_contrib
@ 2021-05-27 12:29 ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2021-05-27 12:29 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel, odin, cgroups
  Cc: Vincent Guittot

Odin reported a fairness problem between cgroups caused by a stale
value in cfs_rq->tg_load_avg_contrib:

https://lkml.org/lkml/2021/5/18/566


Two problems generated this situation:
-1st: After propagating load in the hierarchy, load_sum can be null
 whereas load_avg isn't, so the cfs_rq is removed from the list even
 though it still contributes to the tg's load (see the sketch of the
 removal check below).
-2nd: cfs_rq->tg_load_avg_contrib was not always updated after
 significant changes, like becoming null, because the cfs_rq had
 already been updated when propagating a child's load.
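
For context, the check that removes a cfs_rq from the leaf list only
looks at the *_sum values. A minimal sketch of the v5.13-era
cfs_rq_is_decayed() helper (trimmed here for illustration):

    static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
    {
            if (cfs_rq->load.weight)
                    return false;

            if (cfs_rq->avg.load_sum)
                    return false;

            if (cfs_rq->avg.util_sum)
                    return false;

            if (cfs_rq->avg.runnable_sum)
                    return false;

            /* only the *_sum fields are checked, never the *_avg ones */
            return true;
    }

So if load_sum decays to 0 while load_avg is still non-zero, the cfs_rq
is deleted from the list and its stale tg_load_avg_contrib is never
pulled out of tg->load_avg.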


Vincent Guittot (2):
  sched/fair: keep load_avg and load_sum synced
  sched/fair: make sure to update tg contrib for blocked load

 kernel/sched/fair.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2] sched/fair: keep load_avg and load_sum synced
@ 2021-05-27 12:29   ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2021-05-27 12:29 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel, odin, cgroups
  Cc: Vincent Guittot

When removing a cfs_rq from the list, we only check the _sum value, so
we must ensure that _avg and _sum stay synced; otherwise load_sum can
be null while load_avg is not after propagating load in the cgroup
hierarchy.

Use load_avg to compute load_sum similarly to what is done for util_sum
and runnable_sum.
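
For reference, the utilization side already keeps the pair synced this
way. A trimmed sketch of the v5.13-era update_tg_cfs_util(), with
comments added here for illustration:

    static inline void
    update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
    {
            long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
            u32 divider;

            if (!delta)
                    return;

            divider = get_pelt_divider(&cfs_rq->avg);

            /* Set new sched_entity's utilization */
            se->avg.util_avg = gcfs_rq->avg.util_avg;
            se->avg.util_sum = se->avg.util_avg * divider;

            /*
             * Update parent cfs_rq utilization: _sum is recomputed
             * from _avg, so util_sum is zero exactly when util_avg is.
             */
            add_positive(&cfs_rq->avg.util_avg, delta);
            cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
    }

Recomputing _sum as _avg * divider (the divider is always positive by
construction) guarantees load_sum == 0 iff load_avg == 0, which is the
invariant the list-removal check relies on.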

Fixes: 0e2d2aaaae52 ("sched/fair: Rewrite PELT migration propagation")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 161b92aa1c79..2859545d95fb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3509,7 +3509,8 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
 	se->avg.load_sum = runnable_sum;
 	se->avg.load_avg = load_avg;
 	add_positive(&cfs_rq->avg.load_avg, delta_avg);
-	add_positive(&cfs_rq->avg.load_sum, delta_sum);
+	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
+
 }
 
 static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] sched/fair: make sure to update tg contrib for blocked load
@ 2021-05-27 12:29   ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2021-05-27 12:29 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel, odin, cgroups
  Cc: Vincent Guittot

During the update of fair blocked load (__update_blocked_fair()), we
update the contribution of the cfs_rq to tg->load_avg if the cfs_rq's
pelt has decayed. Nevertheless, the pelt values of a cfs_rq could have
been recently updated while propagating the change of a child. In this
case, the cfs_rq's pelt will not have decayed, because it has already
been updated, and we don't update tg->load_avg.

__update_blocked_fair
  ...
  for_each_leaf_cfs_rq_safe: child cfs_rq
    update cfs_rq_load_avg() for child cfs_rq
    ...
    update_load_avg(cfs_rq_of(se), se, 0)
      ...
      update cfs_rq_load_avg() for parent cfs_rq
      - propagation of child's load makes parent cfs_rq->load_sum
        become null
      - UPDATE_TG is not set so it doesn't update parent
        cfs_rq->tg_load_avg_contrib
  ...
  for_each_leaf_cfs_rq_safe: parent cfs_rq
    update cfs_rq_load_avg() for parent cfs_rq
      - nothing to do because parent cfs_rq has already been updated
        recently, so cfs_rq->tg_load_avg_contrib is not updated
    ...
    parent cfs_rq is decayed
      list_del_leaf_cfs_rq parent cfs_rq
      - but it still contributes to tg->load_avg

We must set the UPDATE_TG flag when propagating pending load to the parent.
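
For context, update_tg_load_avg() is only reached when the caller
passes UPDATE_TG. A trimmed sketch of the v5.13-era update_load_avg(),
with elisions marked "..." and a comment added here for illustration:

    static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
    {
            int decayed;
            ...
            decayed  = update_cfs_rq_load_avg(now, cfs_rq);
            decayed |= propagate_entity_load_avg(se);

            if (!se->avg.last_update_time && (flags & DO_ATTACH)) {
                    ...
            } else if (decayed) {
                    cfs_rq_util_change(cfs_rq, 0);

                    /* cfs_rq->tg_load_avg_contrib is only refreshed here */
                    if (flags & UPDATE_TG)
                            update_tg_load_avg(cfs_rq);
            }
    }

Passing 0 instead of UPDATE_TG therefore skips the contrib update even
when the propagation above has just changed cfs_rq->avg.load_avg.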

Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2859545d95fb..dcb3b1a6813c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8048,7 +8048,7 @@ static bool __update_blocked_fair(struct rq *rq, bool *done)
 		/* Propagate pending load changes to the parent, if any: */
 		se = cfs_rq->tg->se[cpu];
 		if (se && !skip_blocked_update(se))
-			update_load_avg(cfs_rq_of(se), se, 0);
+			update_load_avg(cfs_rq_of(se), se, UPDATE_TG);
 
 		/*
 		 * There can be a lot of idle CPU cgroups.  Don't let fully
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] sched/fair: keep load_avg and load_sum synced
@ 2021-05-27 13:05     ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2021-05-27 13:05 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, linux-kernel, odin, cgroups

On Thu, May 27, 2021 at 02:29:15PM +0200, Vincent Guittot wrote:

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3509,7 +3509,8 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
>  	se->avg.load_sum = runnable_sum;
>  	se->avg.load_avg = load_avg;
>  	add_positive(&cfs_rq->avg.load_avg, delta_avg);
> -	add_positive(&cfs_rq->avg.load_sum, delta_sum);
> +	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;

If I'm not mistaken, this makes delta_sum unused, so we can remove it
entirely, see below.

> +

This extra blank space, we really need that? :-)

>  }

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 161b92aa1c79..2b99e687fe7a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3453,10 +3453,9 @@ update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cf
 static inline void
 update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
+	long running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
 	unsigned long load_avg;
 	u64 load_sum = 0;
-	s64 delta_sum;
 	u32 divider;
 
 	if (!runnable_sum)
@@ -3503,13 +3502,11 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
 	load_sum = (s64)se_weight(se) * runnable_sum;
 	load_avg = div_s64(load_sum, divider);
 
-	delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
-	delta_avg = load_avg - se->avg.load_avg;
-
 	se->avg.load_sum = runnable_sum;
 	se->avg.load_avg = load_avg;
-	add_positive(&cfs_rq->avg.load_avg, delta_avg);
-	add_positive(&cfs_rq->avg.load_sum, delta_sum);
+
+	add_positive(&cfs_rq->avg.load_avg, (long)(load_avg - se->avg.load_avg));
+	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
 }
 
 static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] sched/fair: keep load_avg and load_sum synced
@ 2021-05-27 13:18       ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2021-05-27 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Odin Ugedal, open list:CONTROL GROUP (CGROUP)

On Thu, 27 May 2021 at 15:06, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, May 27, 2021 at 02:29:15PM +0200, Vincent Guittot wrote:
>
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3509,7 +3509,8 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
> >       se->avg.load_sum = runnable_sum;
> >       se->avg.load_avg = load_avg;
> >       add_positive(&cfs_rq->avg.load_avg, delta_avg);
> > -     add_positive(&cfs_rq->avg.load_sum, delta_sum);
> > +     cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
>
> If I'm not mistaken, this makes delta_sum unused, so we can remove it
> entirely, see below.

yes, you're right

>
> > +
>
> This extra blank space, we really need that? :-)
>
> >  }
>
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 161b92aa1c79..2b99e687fe7a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3453,10 +3453,9 @@ update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cf
>  static inline void
>  update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
>  {
> -       long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
> +       long running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
>         unsigned long load_avg;
>         u64 load_sum = 0;
> -       s64 delta_sum;
>         u32 divider;
>
>         if (!runnable_sum)
> @@ -3503,13 +3502,11 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
>         load_sum = (s64)se_weight(se) * runnable_sum;
>         load_avg = div_s64(load_sum, divider);
>
> -       delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
> -       delta_avg = load_avg - se->avg.load_avg;
> -
>         se->avg.load_sum = runnable_sum;
>         se->avg.load_avg = load_avg;
> -       add_positive(&cfs_rq->avg.load_avg, delta_avg);
> -       add_positive(&cfs_rq->avg.load_sum, delta_sum);
> +
> +       add_positive(&cfs_rq->avg.load_avg, (long)(load_avg - se->avg.load_avg));

You have to keep:
delta_avg = load_avg - se->avg.load_avg
or move se->avg.load_avg = load_avg after
add_positive(&cfs_rq->avg.load_avg, ..);
because otherwise (load_avg - se->avg.load_avg) == 0 (see the corrected
ordering sketched after the quote below).


> +       cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
>  }
>
>  static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum)
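
Concretely, computing the delta before overwriting se->avg.load_avg
avoids the zero delta. A sketch of the corrected ordering, matching the
shape of what was eventually merged (see the tip commit at the end of
the thread):

	delta = load_avg - se->avg.load_avg;	/* read the old value first */

	se->avg.load_sum = runnable_sum;
	se->avg.load_avg = load_avg;		/* ...then overwrite it */

	add_positive(&cfs_rq->avg.load_avg, delta);
	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;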

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] sched/fair: keep load_avg and load_sum synced
@ 2021-05-27 13:37         ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2021-05-27 13:37 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Odin Ugedal, open list:CONTROL GROUP (CGROUP)

On Thu, May 27, 2021 at 03:18:35PM +0200, Vincent Guittot wrote:
> > -       delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
> > -       delta_avg = load_avg - se->avg.load_avg;
> > -
> >         se->avg.load_sum = runnable_sum;
> >         se->avg.load_avg = load_avg;
> > -       add_positive(&cfs_rq->avg.load_avg, delta_avg);
> > -       add_positive(&cfs_rq->avg.load_sum, delta_sum);
> > +
> > +       add_positive(&cfs_rq->avg.load_avg, (long)(load_avg - se->avg.load_avg));
> 
> you have to keep:
> delta_avg = load_avg - se->avg.load_avg
> or move se->avg.load_avg = load_avg after
> add_positive(&cfs_rq->avg.load_avg, ..);
> because otherwise (load_avg - se->avg.load_avg) == 0

Duh. /me goes fix.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] sched/fair: keep load_avg and load_sum synced
  2021-05-27 12:29   ` Vincent Guittot
@ 2021-05-27 13:40     ` kernel test robot
  -1 siblings, 0 replies; 20+ messages in thread
From: kernel test robot @ 2021-05-27 13:40 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel, odin
  Cc: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 6740 bytes --]

Hi Vincent,

I love your patch! Perhaps something to improve:

[auto build test WARNING on tip/sched/core]
[also build test WARNING on tip/master linux/master linus/master v5.13-rc3 next-20210527]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Vincent-Guittot/schd-fair-fix-stalled-cfs_rq-tg_load_avg_contrib/20210527-203115
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git a8ea6fc9b089156d9230bfeef964dd9be101a4a9
config: s390-randconfig-c024-20210527 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/725167b3ef5a796c98add27cb21e543c7b72cf0e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Vincent-Guittot/schd-fair-fix-stalled-cfs_rq-tg_load_avg_contrib/20210527-203115
        git checkout 725167b3ef5a796c98add27cb21e543c7b72cf0e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=s390 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   kernel/sched/fair.c: In function 'update_tg_cfs_load':
>> kernel/sched/fair.c:3459:6: warning: variable 'delta_sum' set but not used [-Wunused-but-set-variable]
    3459 |  s64 delta_sum;
         |      ^~~~~~~~~


vim +/delta_sum +3459 kernel/sched/fair.c

9f68395333ad7f Vincent Guittot 2020-02-24  3452  
09a43ace1f986b Vincent Guittot 2016-11-08  3453  static inline void
0dacee1bfa70e1 Vincent Guittot 2020-02-24  3454  update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
09a43ace1f986b Vincent Guittot 2016-11-08  3455  {
a4c3c04974d648 Vincent Guittot 2017-11-16  3456  	long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
0dacee1bfa70e1 Vincent Guittot 2020-02-24  3457  	unsigned long load_avg;
0dacee1bfa70e1 Vincent Guittot 2020-02-24  3458  	u64 load_sum = 0;
a4c3c04974d648 Vincent Guittot 2017-11-16 @3459  	s64 delta_sum;
95d685935a2edf Vincent Guittot 2020-05-06  3460  	u32 divider;
09a43ace1f986b Vincent Guittot 2016-11-08  3461  
0e2d2aaaae52c2 Peter Zijlstra  2017-05-08  3462  	if (!runnable_sum)
0e2d2aaaae52c2 Peter Zijlstra  2017-05-08  3463  		return;
09a43ace1f986b Vincent Guittot 2016-11-08  3464  
0e2d2aaaae52c2 Peter Zijlstra  2017-05-08  3465  	gcfs_rq->prop_runnable_sum = 0;
09a43ace1f986b Vincent Guittot 2016-11-08  3466  
95d685935a2edf Vincent Guittot 2020-05-06  3467  	/*
95d685935a2edf Vincent Guittot 2020-05-06  3468  	 * cfs_rq->avg.period_contrib can be used for both cfs_rq and se.
95d685935a2edf Vincent Guittot 2020-05-06  3469  	 * See ___update_load_avg() for details.
95d685935a2edf Vincent Guittot 2020-05-06  3470  	 */
87e867b4269f29 Vincent Guittot 2020-06-12  3471  	divider = get_pelt_divider(&cfs_rq->avg);
95d685935a2edf Vincent Guittot 2020-05-06  3472  
a4c3c04974d648 Vincent Guittot 2017-11-16  3473  	if (runnable_sum >= 0) {
a4c3c04974d648 Vincent Guittot 2017-11-16  3474  		/*
a4c3c04974d648 Vincent Guittot 2017-11-16  3475  		 * Add runnable; clip at LOAD_AVG_MAX. Reflects that until
a4c3c04974d648 Vincent Guittot 2017-11-16  3476  		 * the CPU is saturated running == runnable.
a4c3c04974d648 Vincent Guittot 2017-11-16  3477  		 */
a4c3c04974d648 Vincent Guittot 2017-11-16  3478  		runnable_sum += se->avg.load_sum;
95d685935a2edf Vincent Guittot 2020-05-06  3479  		runnable_sum = min_t(long, runnable_sum, divider);
a4c3c04974d648 Vincent Guittot 2017-11-16  3480  	} else {
a4c3c04974d648 Vincent Guittot 2017-11-16  3481  		/*
a4c3c04974d648 Vincent Guittot 2017-11-16  3482  		 * Estimate the new unweighted runnable_sum of the gcfs_rq by
a4c3c04974d648 Vincent Guittot 2017-11-16  3483  		 * assuming all tasks are equally runnable.
a4c3c04974d648 Vincent Guittot 2017-11-16  3484  		 */
a4c3c04974d648 Vincent Guittot 2017-11-16  3485  		if (scale_load_down(gcfs_rq->load.weight)) {
a4c3c04974d648 Vincent Guittot 2017-11-16  3486  			load_sum = div_s64(gcfs_rq->avg.load_sum,
a4c3c04974d648 Vincent Guittot 2017-11-16  3487  				scale_load_down(gcfs_rq->load.weight));
a4c3c04974d648 Vincent Guittot 2017-11-16  3488  		}
a4c3c04974d648 Vincent Guittot 2017-11-16  3489  
a4c3c04974d648 Vincent Guittot 2017-11-16  3490  		/* But make sure to not inflate se's runnable */
a4c3c04974d648 Vincent Guittot 2017-11-16  3491  		runnable_sum = min(se->avg.load_sum, load_sum);
a4c3c04974d648 Vincent Guittot 2017-11-16  3492  	}
a4c3c04974d648 Vincent Guittot 2017-11-16  3493  
a4c3c04974d648 Vincent Guittot 2017-11-16  3494  	/*
a4c3c04974d648 Vincent Guittot 2017-11-16  3495  	 * runnable_sum can't be lower than running_sum
23127296889fe8 Vincent Guittot 2019-01-23  3496  	 * Rescale running sum to be in the same range as runnable sum
23127296889fe8 Vincent Guittot 2019-01-23  3497  	 * running_sum is in [0 : LOAD_AVG_MAX <<  SCHED_CAPACITY_SHIFT]
23127296889fe8 Vincent Guittot 2019-01-23  3498  	 * runnable_sum is in [0 : LOAD_AVG_MAX]
a4c3c04974d648 Vincent Guittot 2017-11-16  3499  	 */
23127296889fe8 Vincent Guittot 2019-01-23  3500  	running_sum = se->avg.util_sum >> SCHED_CAPACITY_SHIFT;
a4c3c04974d648 Vincent Guittot 2017-11-16  3501  	runnable_sum = max(runnable_sum, running_sum);
a4c3c04974d648 Vincent Guittot 2017-11-16  3502  
0e2d2aaaae52c2 Peter Zijlstra  2017-05-08  3503  	load_sum = (s64)se_weight(se) * runnable_sum;
95d685935a2edf Vincent Guittot 2020-05-06  3504  	load_avg = div_s64(load_sum, divider);
09a43ace1f986b Vincent Guittot 2016-11-08  3505  
a4c3c04974d648 Vincent Guittot 2017-11-16  3506  	delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
a4c3c04974d648 Vincent Guittot 2017-11-16  3507  	delta_avg = load_avg - se->avg.load_avg;
09a43ace1f986b Vincent Guittot 2016-11-08  3508  
a4c3c04974d648 Vincent Guittot 2017-11-16  3509  	se->avg.load_sum = runnable_sum;
a4c3c04974d648 Vincent Guittot 2017-11-16  3510  	se->avg.load_avg = load_avg;
a4c3c04974d648 Vincent Guittot 2017-11-16  3511  	add_positive(&cfs_rq->avg.load_avg, delta_avg);
725167b3ef5a79 Vincent Guittot 2021-05-27  3512  	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
725167b3ef5a79 Vincent Guittot 2021-05-27  3513  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 16483 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2] schd/fair: fix stalled cfs_rq->tg_load_avg_contrib
@ 2021-05-27 14:19   ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2021-05-27 14:19 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Odin Ugedal,
	open list:CONTROL GROUP (CGROUP)

Hi Odin,

On Thu, 27 May 2021 at 14:29, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> Odin reported a fairness problem between cgroups caused by a stale
> value in cfs_rq->tg_load_avg_contrib:
>
> https://lkml.org/lkml/2021/5/18/566
>
>
> Two problems generated this situation:
> -1st: After propagating load in the hierarchy, load_sum can be null
>  whereas load_avg isn't, so the cfs_rq is removed from the list even
>  though it still contributes to the tg's load
> -2nd: cfs_rq->tg_load_avg_contrib was not always updated after
>  significant changes, like becoming null, because the cfs_rq had
>  already been updated when propagating a child's load.
>

This series fixes the problem triggered by your 1st script on my test
machine. But could you confirm that this patchset also fixes the
problem on yours?

Regards,
Vincent

>
> Vincent Guittot (2):
>   sched/fair: keep load_avg and load_sum synced
>   sched/fair: make sure to update tg contrib for blocked load
>
>  kernel/sched/fair.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2] schd/fair: fix stalled cfs_rq->tg_load_avg_contrib
@ 2021-05-27 14:24     ` Odin Ugedal
  0 siblings, 0 replies; 20+ messages in thread
From: Odin Ugedal @ 2021-05-27 14:24 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Odin Ugedal,
	open list:CONTROL GROUP (CGROUP)

Hi,

Yes, this does the trick!

I have tested this locally and it works as expected, as discussed in
the previous thread (more info there). Together with the patch "[PATCH
2/3] sched/fair: Correctly insert cfs_rq's to list on unthrottle", I
am unable to reproduce the issue locally; so this is good to go.

Feel free to add this to both patches when the warnings on the first
patch are fixed:

Reviewed-by: Odin Ugedal <odin@uged.al>

Thanks
Odin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [tip: sched/urgent] sched/fair: Make sure to update tg contrib for blocked load
  2021-05-27 12:29   ` Vincent Guittot
@ 2021-05-31 10:40   ` tip-bot2 for Vincent Guittot
  -1 siblings, 0 replies; 20+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2021-05-31 10:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Odin Ugedal, Vincent Guittot, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     02da26ad5ed6ea8680e5d01f20661439611ed776
Gitweb:        https://git.kernel.org/tip/02da26ad5ed6ea8680e5d01f20661439611ed776
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Thu, 27 May 2021 14:29:16 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 31 May 2021 10:14:48 +02:00

sched/fair: Make sure to update tg contrib for blocked load

During the update of fair blocked load (__update_blocked_fair()), we
update the contribution of the cfs_rq to tg->load_avg if the cfs_rq's
pelt has decayed. Nevertheless, the pelt values of a cfs_rq could have
been recently updated while propagating the change of a child. In this
case, the cfs_rq's pelt will not have decayed, because it has already
been updated, and we don't update tg->load_avg.

__update_blocked_fair
  ...
  for_each_leaf_cfs_rq_safe: child cfs_rq
    update cfs_rq_load_avg() for child cfs_rq
    ...
    update_load_avg(cfs_rq_of(se), se, 0)
      ...
      update cfs_rq_load_avg() for parent cfs_rq
      - propagation of child's load makes parent cfs_rq->load_sum
        become null
      - UPDATE_TG is not set so it doesn't update parent
        cfs_rq->tg_load_avg_contrib
  ...
  for_each_leaf_cfs_rq_safe: parent cfs_rq
    update cfs_rq_load_avg() for parent cfs_rq
      - nothing to do because parent cfs_rq has already been updated
        recently, so cfs_rq->tg_load_avg_contrib is not updated
    ...
    parent cfs_rq is decayed
      list_del_leaf_cfs_rq parent cfs_rq
      - but it still contributes to tg->load_avg

We must set the UPDATE_TG flag when propagating pending load to the parent.

Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-3-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f4795b8..e7c8277 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8029,7 +8029,7 @@ static bool __update_blocked_fair(struct rq *rq, bool *done)
 		/* Propagate pending load changes to the parent, if any: */
 		se = cfs_rq->tg->se[cpu];
 		if (se && !skip_blocked_update(se))
-			update_load_avg(cfs_rq_of(se), se, 0);
+			update_load_avg(cfs_rq_of(se), se, UPDATE_TG);
 
 		/*
 		 * There can be a lot of idle CPU cgroups.  Don't let fully

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip: sched/urgent] sched/fair: Keep load_avg and load_sum synced
  2021-05-27 12:29   ` Vincent Guittot
@ 2021-05-31 10:40   ` tip-bot2 for Vincent Guittot
  -1 siblings, 0 replies; 20+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2021-05-31 10:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Odin Ugedal, Vincent Guittot, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     7c7ad626d9a0ff0a36c1e2a3cfbbc6a13828d5eb
Gitweb:        https://git.kernel.org/tip/7c7ad626d9a0ff0a36c1e2a3cfbbc6a13828d5eb
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Thu, 27 May 2021 14:29:15 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 31 May 2021 10:14:48 +02:00

sched/fair: Keep load_avg and load_sum synced

When removing a cfs_rq from the list, we only check the _sum value, so
we must ensure that _avg and _sum stay synced; otherwise load_sum can
be null while load_avg is not after propagating load in the cgroup
hierarchy.

Use load_avg to compute load_sum similarly to what is done for util_sum
and runnable_sum.

Fixes: 0e2d2aaaae52 ("sched/fair: Rewrite PELT migration propagation")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-2-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3248e24..f4795b8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3499,10 +3499,9 @@ update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cf
 static inline void
 update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
-	long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
+	long delta, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum;
 	unsigned long load_avg;
 	u64 load_sum = 0;
-	s64 delta_sum;
 	u32 divider;
 
 	if (!runnable_sum)
@@ -3549,13 +3548,13 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq
 	load_sum = (s64)se_weight(se) * runnable_sum;
 	load_avg = div_s64(load_sum, divider);
 
-	delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum;
-	delta_avg = load_avg - se->avg.load_avg;
+	delta = load_avg - se->avg.load_avg;
 
 	se->avg.load_sum = runnable_sum;
 	se->avg.load_avg = load_avg;
-	add_positive(&cfs_rq->avg.load_avg, delta_avg);
-	add_positive(&cfs_rq->avg.load_sum, delta_sum);
+
+	add_positive(&cfs_rq->avg.load_avg, delta);
+	cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider;
 }
 
 static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-05-31 10:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-27 12:29 [PATCH 0/2] schd/fair: fix stalled cfs_rq->tg_load_avg_contrib Vincent Guittot
2021-05-27 12:29 ` [PATCH 1/2] sched/fair: keep load_avg and load_sum synced Vincent Guittot
2021-05-27 13:05   ` Peter Zijlstra
2021-05-27 13:18     ` Vincent Guittot
2021-05-27 13:37       ` Peter Zijlstra
2021-05-27 13:40   ` kernel test robot
2021-05-31 10:40   ` [tip: sched/urgent] sched/fair: Keep " tip-bot2 for Vincent Guittot
2021-05-27 12:29 ` [PATCH 2/2] sched/fair: make sure to update tg contrib for blocked load Vincent Guittot
2021-05-31 10:40   ` [tip: sched/urgent] sched/fair: Make " tip-bot2 for Vincent Guittot
2021-05-27 14:19 ` [PATCH 0/2] schd/fair: fix stalled cfs_rq->tg_load_avg_contrib Vincent Guittot
2021-05-27 14:24   ` Odin Ugedal
