linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
@ 2020-04-22 15:14 Vincent Guittot
  2020-04-23 14:30 ` Dietmar Eggemann
  2020-04-23 19:29 ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Vincent Guittot @ 2020-04-22 15:14 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel
  Cc: Vincent Guittot

update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
an entity down into the cfs_rq hierarchy which must keep the sync with
the current pelt window.

Even if we can't sync child rq and its group se, we can sync the group se
and parent cfs_rq with current PELT window. In fact, we must keep them sync
in order to stay also synced with others se and group se that are already
attached to the cfs_rq.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 26 ++++++--------------------
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 02f323b85b6d..ca6aa89c88f2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3441,52 +3441,38 @@ static inline void
 update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
 	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
+	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
 
 	/* Nothing to update */
 	if (!delta)
 		return;
 
-	/*
-	 * The relation between sum and avg is:
-	 *
-	 *   LOAD_AVG_MAX - 1024 + sa->period_contrib
-	 *
-	 * however, the PELT windows are not aligned between grq and gse.
-	 */
-
 	/* Set new sched_entity's utilization */
 	se->avg.util_avg = gcfs_rq->avg.util_avg;
-	se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
+	se->avg.util_sum = se->avg.util_avg * divider;
 
 	/* Update parent cfs_rq utilization */
 	add_positive(&cfs_rq->avg.util_avg, delta);
-	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
+	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
 }
 
 static inline void
 update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
 {
 	long delta = gcfs_rq->avg.runnable_avg - se->avg.runnable_avg;
+	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
 
 	/* Nothing to update */
 	if (!delta)
 		return;
 
-	/*
-	 * The relation between sum and avg is:
-	 *
-	 *   LOAD_AVG_MAX - 1024 + sa->period_contrib
-	 *
-	 * however, the PELT windows are not aligned between grq and gse.
-	 */
-
 	/* Set new sched_entity's runnable */
 	se->avg.runnable_avg = gcfs_rq->avg.runnable_avg;
-	se->avg.runnable_sum = se->avg.runnable_avg * LOAD_AVG_MAX;
+	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
 	/* Update parent cfs_rq runnable */
 	add_positive(&cfs_rq->avg.runnable_avg, delta);
-	cfs_rq->avg.runnable_sum = cfs_rq->avg.runnable_avg * LOAD_AVG_MAX;
+	cfs_rq->avg.runnable_sum = cfs_rq->avg.runnable_avg * divider;
 }
 
 static inline void
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-22 15:14 [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating Vincent Guittot
@ 2020-04-23 14:30 ` Dietmar Eggemann
  2020-04-23 16:17   ` Vincent Guittot
  2020-04-23 19:29 ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2020-04-23 14:30 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, rostedt, bsegall,
	mgorman, linux-kernel



On 22/04/2020 17:14, Vincent Guittot wrote:
> update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> an entity down into the cfs_rq hierarchy which must keep the sync with
> the current pelt window.
> 
> Even if we can't sync child rq and its group se, we can sync the group se

So we have

    gcfs --> tg --> gse
     ________________|
     |
     V

    cfs ---> tg (root)

     |
     V

    rq


here. What is 'child rq' for 'group se' here? I guess 'parent cfs_rq' is
cfs_rq.

> and parent cfs_rq with current PELT window. In fact, we must keep them sync
> in order to stay also synced with others se and group se that are already
> attached to the cfs_rq.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 26 ++++++--------------------
>  1 file changed, 6 insertions(+), 20 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 02f323b85b6d..ca6aa89c88f2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3441,52 +3441,38 @@ static inline void
>  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
>  {
>  	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> +	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>  
>  	/* Nothing to update */
>  	if (!delta)
>  		return;
>  
> -	/*
> -	 * The relation between sum and avg is:
> -	 *
> -	 *   LOAD_AVG_MAX - 1024 + sa->period_contrib
> -	 *
> -	 * however, the PELT windows are not aligned between grq and gse.
> -	 */
> -
>  	/* Set new sched_entity's utilization */
>  	se->avg.util_avg = gcfs_rq->avg.util_avg;
> -	se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> +	se->avg.util_sum = se->avg.util_avg * divider;

divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.

But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
should work.

>  	/* Update parent cfs_rq utilization */
>  	add_positive(&cfs_rq->avg.util_avg, delta);
> -	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> +	cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;

Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
and cfs_rq is always the same in update_tg_cfs_[util\|runnable].

So that means the PELT windows are aligned for cfs_rqs and group se's?

And if we want to enforce this for cfs_rq and task, we have
sync_entity_load_avg().

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-23 14:30 ` Dietmar Eggemann
@ 2020-04-23 16:17   ` Vincent Guittot
  2020-04-24 12:07     ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Guittot @ 2020-04-23 16:17 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel

On Thu, 23 Apr 2020 at 16:30, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
>
>
> On 22/04/2020 17:14, Vincent Guittot wrote:
> > update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> > an entity down into the cfs_rq hierarchy which must keep the sync with
> > the current pelt window.
> >
> > Even if we can't sync child rq and its group se, we can sync the group se
>
> So we have
>
>     gcfs --> tg --> gse
>      ________________|
>      |
>      V
>
>     cfs ---> tg (root)
>
>      |
>      V
>
>     rq
>

child cfs_rq aka gcfs_rq
  |
gse: group entity that represents child cfs_rq in parent cfs_rq
  |
  v
parent cfs_rq aka cfs_rq

>
> here. What is 'child rq' for 'group se' here? I guess 'parent cfs_rq' is
> cfs_rq.
>
> > and parent cfs_rq with current PELT window. In fact, we must keep them sync
> > in order to stay also synced with others se and group se that are already
> > attached to the cfs_rq.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 26 ++++++--------------------
> >  1 file changed, 6 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..ca6aa89c88f2 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3441,52 +3441,38 @@ static inline void
> >  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> >  {
> >       long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> > +     u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
> >
> >       /* Nothing to update */
> >       if (!delta)
> >               return;
> >
> > -     /*
> > -      * The relation between sum and avg is:
> > -      *
> > -      *   LOAD_AVG_MAX - 1024 + sa->period_contrib
> > -      *
> > -      * however, the PELT windows are not aligned between grq and gse.
> > -      */
> > -
> >       /* Set new sched_entity's utilization */
> >       se->avg.util_avg = gcfs_rq->avg.util_avg;
> > -     se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> > +     se->avg.util_sum = se->avg.util_avg * divider;
>
> divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.

we sync the util_avg of gse with the new util_avg of gcfs_rq but gse
is attached to cfs_rq and as a result we have to use cfs_rq's
period_contrib

>
> But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
> should work.
>
> >       /* Update parent cfs_rq utilization */
> >       add_positive(&cfs_rq->avg.util_avg, delta);
> > -     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> > +     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
>
> Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
> and cfs_rq is always the same in update_tg_cfs_[util\|runnable].
>
> So that means the PELT windows are aligned for cfs_rqs and group se's?

We want to align util_avg with util_sum and period_contrib otherwise
we might have some unalignment. It's quite similarly to what is done
in attach_entity_load_avg()

>
> And if we want to enforce this for cfs_rq and task, we have
> sync_entity_load_avg().

It's not a matter of syncing the last_update_time

>
> [...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-22 15:14 [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating Vincent Guittot
  2020-04-23 14:30 ` Dietmar Eggemann
@ 2020-04-23 19:29 ` Peter Zijlstra
  2020-04-24  7:37   ` Vincent Guittot
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2020-04-23 19:29 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	linux-kernel

On Wed, Apr 22, 2020 at 05:14:01PM +0200, Vincent Guittot wrote:
> update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> an entity down into the cfs_rq hierarchy which must keep the sync with
> the current pelt window.
> 
> Even if we can't sync child rq and its group se, we can sync the group se
> and parent cfs_rq with current PELT window. In fact, we must keep them sync
> in order to stay also synced with others se and group se that are already
> attached to the cfs_rq.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 26 ++++++--------------------
>  1 file changed, 6 insertions(+), 20 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 02f323b85b6d..ca6aa89c88f2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3441,52 +3441,38 @@ static inline void
>  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
>  {
>  	long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> +	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>  
>  	/* Nothing to update */
>  	if (!delta)
>  		return;
>  
> -	/*
> -	 * The relation between sum and avg is:
> -	 *
> -	 *   LOAD_AVG_MAX - 1024 + sa->period_contrib
> -	 *
> -	 * however, the PELT windows are not aligned between grq and gse.
> -	 */

Instead of deleting this, could we perhaps extend it?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-23 19:29 ` Peter Zijlstra
@ 2020-04-24  7:37   ` Vincent Guittot
  2020-04-24  8:41     ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Guittot @ 2020-04-24  7:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel

On Thu, 23 Apr 2020 at 21:29, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Apr 22, 2020 at 05:14:01PM +0200, Vincent Guittot wrote:
> > update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> > an entity down into the cfs_rq hierarchy which must keep the sync with
> > the current pelt window.
> >
> > Even if we can't sync child rq and its group se, we can sync the group se
> > and parent cfs_rq with current PELT window. In fact, we must keep them sync
> > in order to stay also synced with others se and group se that are already
> > attached to the cfs_rq.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 26 ++++++--------------------
> >  1 file changed, 6 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..ca6aa89c88f2 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3441,52 +3441,38 @@ static inline void
> >  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> >  {
> >       long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> > +     u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
> >
> >       /* Nothing to update */
> >       if (!delta)
> >               return;
> >
> > -     /*
> > -      * The relation between sum and avg is:
> > -      *
> > -      *   LOAD_AVG_MAX - 1024 + sa->period_contrib
> > -      *
> > -      * however, the PELT windows are not aligned between grq and gse.
> > -      */
>
> Instead of deleting this, could we perhaps extend it?

In fact, this is not the only place in fair.c that uses this rule to
align _avg and _sum but others don't have any special comment.

I can add a more detailed description of this relation for
___update_load_avg() in pelt.c and make a ref to this in all places in
fair.c that use this rule which are :
- update_tg_cfs_util
- update_tg_cfs_runnable
- update_cfs_rq_load_avg
- attach_entity_load_avg
- reweight_entity

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-24  7:37   ` Vincent Guittot
@ 2020-04-24  8:41     ` Dietmar Eggemann
  2020-04-24  8:54       ` Vincent Guittot
  0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2020-04-24  8:41 UTC (permalink / raw)
  To: Vincent Guittot, Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Steven Rostedt, Ben Segall, Mel Gorman,
	linux-kernel

On 24/04/2020 09:37, Vincent Guittot wrote:
> On Thu, 23 Apr 2020 at 21:29, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Wed, Apr 22, 2020 at 05:14:01PM +0200, Vincent Guittot wrote:
>>> update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
>>> an entity down into the cfs_rq hierarchy which must keep the sync with
>>> the current pelt window.
>>>
>>> Even if we can't sync child rq and its group se, we can sync the group se
>>> and parent cfs_rq with current PELT window. In fact, we must keep them sync
>>> in order to stay also synced with others se and group se that are already
>>> attached to the cfs_rq.
>>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>>  kernel/sched/fair.c | 26 ++++++--------------------
>>>  1 file changed, 6 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 02f323b85b6d..ca6aa89c88f2 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -3441,52 +3441,38 @@ static inline void
>>>  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
>>>  {
>>>       long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
>>> +     u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>>>
>>>       /* Nothing to update */
>>>       if (!delta)
>>>               return;
>>>
>>> -     /*
>>> -      * The relation between sum and avg is:
>>> -      *
>>> -      *   LOAD_AVG_MAX - 1024 + sa->period_contrib
>>> -      *
>>> -      * however, the PELT windows are not aligned between grq and gse.
>>> -      */
>>
>> Instead of deleting this, could we perhaps extend it?
> 
> In fact, this is not the only place in fair.c that uses this rule to
> align _avg and _sum but others don't have any special comment.
> 
> I can add a more detailed description of this relation for
> ___update_load_avg() in pelt.c and make a ref to this in all places in
> fair.c that use this rule which are :
> - update_tg_cfs_util
> - update_tg_cfs_runnable
> - update_cfs_rq_load_avg
> - attach_entity_load_avg
> - reweight_entity

But IMHO the

"* however, the PELT windows are not aligned between grq and gse."

should only apply to update_tg_cfs_util() and update_tg_cfs_runnable().
And attach_entity_load_avg() (for cfs_rq and se).

They seem to be special since we derive divider from a cfs_rq PELT value
and use it for a se PELT value.

I assume this fact is specifically worth highlighting with a comment. I
mean the fact we can do this because the decay windows are actually aligned.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-24  8:41     ` Dietmar Eggemann
@ 2020-04-24  8:54       ` Vincent Guittot
  2020-04-24  9:06         ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Guittot @ 2020-04-24  8:54 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel

On Fri, 24 Apr 2020 at 10:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 24/04/2020 09:37, Vincent Guittot wrote:
> > On Thu, 23 Apr 2020 at 21:29, Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Wed, Apr 22, 2020 at 05:14:01PM +0200, Vincent Guittot wrote:
> >>> update_tg_cfs_util/runnable() propagate the impact of the attach/detach of
> >>> an entity down into the cfs_rq hierarchy which must keep the sync with
> >>> the current pelt window.
> >>>
> >>> Even if we can't sync child rq and its group se, we can sync the group se
> >>> and parent cfs_rq with current PELT window. In fact, we must keep them sync
> >>> in order to stay also synced with others se and group se that are already
> >>> attached to the cfs_rq.
> >>>
> >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> >>> ---
> >>>  kernel/sched/fair.c | 26 ++++++--------------------
> >>>  1 file changed, 6 insertions(+), 20 deletions(-)
> >>>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index 02f323b85b6d..ca6aa89c88f2 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -3441,52 +3441,38 @@ static inline void
> >>>  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
> >>>  {
> >>>       long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
> >>> +     u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
> >>>
> >>>       /* Nothing to update */
> >>>       if (!delta)
> >>>               return;
> >>>
> >>> -     /*
> >>> -      * The relation between sum and avg is:
> >>> -      *
> >>> -      *   LOAD_AVG_MAX - 1024 + sa->period_contrib
> >>> -      *
> >>> -      * however, the PELT windows are not aligned between grq and gse.
> >>> -      */
> >>
> >> Instead of deleting this, could we perhaps extend it?
> >
> > In fact, this is not the only place in fair.c that uses this rule to
> > align _avg and _sum but others don't have any special comment.
> >
> > I can add a more detailed description of this relation for
> > ___update_load_avg() in pelt.c and make a ref to this in all places in
> > fair.c that use this rule which are :
> > - update_tg_cfs_util
> > - update_tg_cfs_runnable
> > - update_cfs_rq_load_avg
> > - attach_entity_load_avg
> > - reweight_entity
>
> But IMHO the
>
> "* however, the PELT windows are not aligned between grq and gse."
>
> should only apply to update_tg_cfs_util() and update_tg_cfs_runnable().
> And attach_entity_load_avg() (for cfs_rq and se).
>
> They seem to be special since we derive divider from a cfs_rq PELT value
> and use it for a se PELT value.

hmmm... There is nothing special here.

When se is attached to cfs_rq, they both have the same divider because
they use the same clock.

>
> I assume this fact is specifically worth highlighting with a comment. I
> mean the fact we can do this because the decay windows are actually aligned.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-24  8:54       ` Vincent Guittot
@ 2020-04-24  9:06         ` Dietmar Eggemann
  0 siblings, 0 replies; 10+ messages in thread
From: Dietmar Eggemann @ 2020-04-24  9:06 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel



On 24/04/2020 10:54, Vincent Guittot wrote:
> On Fri, 24 Apr 2020 at 10:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>
>> On 24/04/2020 09:37, Vincent Guittot wrote:
>>> On Thu, 23 Apr 2020 at 21:29, Peter Zijlstra <peterz@infradead.org> wrote:
>>>>
>>>> On Wed, Apr 22, 2020 at 05:14:01PM +0200, Vincent Guittot wrote:

[...]

>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 02f323b85b6d..ca6aa89c88f2 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -3441,52 +3441,38 @@ static inline void
>>>>>  update_tg_cfs_util(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq)
>>>>>  {
>>>>>       long delta = gcfs_rq->avg.util_avg - se->avg.util_avg;
>>>>> +     u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
>>>>>
>>>>>       /* Nothing to update */
>>>>>       if (!delta)
>>>>>               return;
>>>>>
>>>>> -     /*
>>>>> -      * The relation between sum and avg is:
>>>>> -      *
>>>>> -      *   LOAD_AVG_MAX - 1024 + sa->period_contrib
>>>>> -      *
>>>>> -      * however, the PELT windows are not aligned between grq and gse.
>>>>> -      */
>>>>
>>>> Instead of deleting this, could we perhaps extend it?
>>>
>>> In fact, this is not the only place in fair.c that uses this rule to
>>> align _avg and _sum but others don't have any special comment.
>>>
>>> I can add a more detailed description of this relation for
>>> ___update_load_avg() in pelt.c and make a ref to this in all places in
>>> fair.c that use this rule which are :
>>> - update_tg_cfs_util
>>> - update_tg_cfs_runnable
>>> - update_cfs_rq_load_avg
>>> - attach_entity_load_avg
>>> - reweight_entity
>>
>> But IMHO the
>>
>> "* however, the PELT windows are not aligned between grq and gse."
>>
>> should only apply to update_tg_cfs_util() and update_tg_cfs_runnable().
>> And attach_entity_load_avg() (for cfs_rq and se).
>>
>> They seem to be special since we derive divider from a cfs_rq PELT value
>> and use it for a se PELT value.
> 
> hmmm... There is nothing special here.
> 
> When se is attached to cfs_rq, they both have the same divider because
> they use the same clock.

That's true.

But exactly this might deserve this comment. Otherwise people might
wonder why you can do a

    u32 divider = LOAD_AVG_MAX - 1024 + *cfs_rq*->avg.period_contrib;

and use it for instance in:

    *se*->avg.util_sum = se->avg.util_avg * divider;

In update_cfs_rq_load_avg() and reweight_entity() we derive 'divider'
from the same 'sched_avg' we use it on later.

    u32 divider = LOAD_AVG_MAX - 1024 + sa->period_contrib;

    sub_positive(&sa->load_sum, r * divider);

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-23 16:17   ` Vincent Guittot
@ 2020-04-24 12:07     ` Dietmar Eggemann
  2020-04-24 12:47       ` Vincent Guittot
  0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2020-04-24 12:07 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel

On 23/04/2020 18:17, Vincent Guittot wrote:
> On Thu, 23 Apr 2020 at 16:30, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>
>> On 22/04/2020 17:14, Vincent Guittot wrote:

[..]

>>     gcfs --> tg --> gse
>>      ________________|
>>      |
>>      V
>>
>>     cfs ---> tg (root)
>>
>>      |
>>      V
>>
>>     rq
>>
> 
> child cfs_rq aka gcfs_rq
>   |
> gse: group entity that represents child cfs_rq in parent cfs_rq
>   |
>   v
> parent cfs_rq aka cfs_rq

OK, I see. Maybe it's clearer to refer to child cfs_rq as gcfs_rq in
this context.

[...]

>>>       /* Set new sched_entity's utilization */
>>>       se->avg.util_avg = gcfs_rq->avg.util_avg;
>>> -     se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
>>> +     se->avg.util_sum = se->avg.util_avg * divider;
>>
>> divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.
> 
> we sync the util_avg of gse with the new util_avg of gcfs_rq but gse
> is attached to cfs_rq and as a result we have to use cfs_rq's
> period_contrib

I agree.

But the decay windows (avg.last_update_time, avg.period_contrib) of
cfs_rq and gcfs_rq are not always aligned, I guess?

I see they are not after the online_fair_sched_group() ->
attach_entity_cfs_rq() but later the are in sync as well.

I ran a couple of different rt-app taskgroup tests and try to

BUG_ON(se->avg.period_contrib != gcfs_rq->avg.period_contrib);
BUG_ON(se->avg.last_update_time != gcfs_rq->avg.last_update_time)

in update_tg_cfs_util() but they didn't trigger so far.

Both, cfs_rq and gcfs_rq are in sync in update_tg_cfs_util() before and
during a task runs in gcfs_rq.

Are there cases where this wouldn't necessary happen in
update_tg_cfs_util(), maybe a more complicated testcase?

>> But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
>> should work.
>>
>>>       /* Update parent cfs_rq utilization */
>>>       add_positive(&cfs_rq->avg.util_avg, delta);
>>> -     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
>>> +     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
>>
>> Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
>> and cfs_rq is always the same in update_tg_cfs_[util\|runnable].
>>
>> So that means the PELT windows are aligned for cfs_rqs and group se's?
> 
> We want to align util_avg with util_sum and period_contrib otherwise
> we might have some unalignment. It's quite similarly to what is done
> in attach_entity_load_avg()

I agree.

>> And if we want to enforce this for cfs_rq and task, we have
>> sync_entity_load_avg().
> 
> It's not a matter of syncing the last_update_time

I agree, this is not what you want to achieve here.
But syncing 'last_update_time' and 'period_contrib' is what I understand
as aligning the decay window (like in attach_entity_load_avg()).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating
  2020-04-24 12:07     ` Dietmar Eggemann
@ 2020-04-24 12:47       ` Vincent Guittot
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Guittot @ 2020-04-24 12:47 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel

On Fri, 24 Apr 2020 at 14:07, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 23/04/2020 18:17, Vincent Guittot wrote:
> > On Thu, 23 Apr 2020 at 16:30, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 22/04/2020 17:14, Vincent Guittot wrote:
>
> [..]
>
> >>     gcfs --> tg --> gse
> >>      ________________|
> >>      |
> >>      V
> >>
> >>     cfs ---> tg (root)
> >>
> >>      |
> >>      V
> >>
> >>     rq
> >>
> >
> > child cfs_rq aka gcfs_rq
> >   |
> > gse: group entity that represents child cfs_rq in parent cfs_rq
> >   |
> >   v
> > parent cfs_rq aka cfs_rq
>
> OK, I see. Maybe it's clearer to refer to child cfs_rq as gcfs_rq in
> this context.
>
> [...]
>
> >>>       /* Set new sched_entity's utilization */
> >>>       se->avg.util_avg = gcfs_rq->avg.util_avg;
> >>> -     se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
> >>> +     se->avg.util_sum = se->avg.util_avg * divider;
> >>
> >> divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.
> >
> > we sync the util_avg of gse with the new util_avg of gcfs_rq but gse
> > is attached to cfs_rq and as a result we have to use cfs_rq's
> > period_contrib
>
> I agree.
>
> But the decay windows (avg.last_update_time, avg.period_contrib) of
> cfs_rq and gcfs_rq are not always aligned, I guess?
>
> I see they are not after the online_fair_sched_group() ->
> attach_entity_cfs_rq() but later the are in sync as well.

cfs_rq and gcfs_rq use 2 different clocks:
- cfs_rq_clock_pelt(cfs_rq)
- cfs_rq_clock_pelt(gcfs_rq)

And they can be different in some cases like cfs bandwidth

>
> I ran a couple of different rt-app taskgroup tests and try to
>
> BUG_ON(se->avg.period_contrib != gcfs_rq->avg.period_contrib);
> BUG_ON(se->avg.last_update_time != gcfs_rq->avg.last_update_time)
>
> in update_tg_cfs_util() but they didn't trigger so far.
>
> Both, cfs_rq and gcfs_rq are in sync in update_tg_cfs_util() before and
> during a task runs in gcfs_rq.
>
> Are there cases where this wouldn't necessary happen in
> update_tg_cfs_util(), maybe a more complicated testcase?
>
> >> But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
> >> should work.
> >>
> >>>       /* Update parent cfs_rq utilization */
> >>>       add_positive(&cfs_rq->avg.util_avg, delta);
> >>> -     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
> >>> +     cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
> >>
> >> Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
> >> and cfs_rq is always the same in update_tg_cfs_[util\|runnable].
> >>
> >> So that means the PELT windows are aligned for cfs_rqs and group se's?
> >
> > We want to align util_avg with util_sum and period_contrib otherwise
> > we might have some unalignment. It's quite similarly to what is done
> > in attach_entity_load_avg()
>
> I agree.
>
> >> And if we want to enforce this for cfs_rq and task, we have
> >> sync_entity_load_avg().
> >
> > It's not a matter of syncing the last_update_time
>
> I agree, this is not what you want to achieve here.
> But syncing 'last_update_time' and 'period_contrib' is what I understand
> as aligning the decay window (like in attach_entity_load_avg()).

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-04-24 12:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-22 15:14 [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating Vincent Guittot
2020-04-23 14:30 ` Dietmar Eggemann
2020-04-23 16:17   ` Vincent Guittot
2020-04-24 12:07     ` Dietmar Eggemann
2020-04-24 12:47       ` Vincent Guittot
2020-04-23 19:29 ` Peter Zijlstra
2020-04-24  7:37   ` Vincent Guittot
2020-04-24  8:41     ` Dietmar Eggemann
2020-04-24  8:54       ` Vincent Guittot
2020-04-24  9:06         ` Dietmar Eggemann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).