* [PATCH] sched/fair: fix runnable_avg for throttled cfs
@ 2020-02-26 18:16 Vincent Guittot
2020-02-26 19:04 ` bsegall
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2020-02-26 18:16 UTC (permalink / raw)
To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, linux-kernel
Cc: pauld, parth, valentin.schneider, hdanton, zhout, Vincent Guittot
When a cfs_rq is throttled, its group entity is dequeued and its running
tasks are removed. We must update runnable_avg with current h_nr_running
and update group_se->runnable_weight with new h_nr_running at each level
of the hierarchy.
Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
This patch applies on top of tip/sched/core
kernel/sched/fair.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fcc968669aea..6d46974e9be7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
if (dequeue)
dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
+ else {
+ update_load_avg(qcfs_rq, se, 0);
+ se_update_runnable(se);
+ }
+
qcfs_rq->h_nr_running -= task_delta;
qcfs_rq->idle_h_nr_running -= idle_task_delta;
@@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
cfs_rq = cfs_rq_of(se);
if (enqueue)
enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
+ else {
+ update_load_avg(cfs_rq, se, 0);
+ se_update_runnable(se);
+ }
+
cfs_rq->h_nr_running += task_delta;
cfs_rq->idle_h_nr_running += idle_task_delta;
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-26 18:16 [PATCH] sched/fair: fix runnable_avg for throttled cfs Vincent Guittot
@ 2020-02-26 19:04 ` bsegall
2020-02-26 21:01 ` Vincent Guittot
0 siblings, 1 reply; 12+ messages in thread
From: bsegall @ 2020-02-26 19:04 UTC (permalink / raw)
To: Vincent Guittot
Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, linux-kernel, pauld, parth, valentin.schneider, hdanton,
zhout
Vincent Guittot <vincent.guittot@linaro.org> writes:
> When a cfs_rq is throttled, its group entity is dequeued and its running
> tasks are removed. We must update runnable_avg with current h_nr_running
> and update group_se->runnable_weight with new h_nr_running at each level
> of the hierarchy.
You'll also need to do this for task enqueue/dequeue inside of a
throttled hierarchy, I'm pretty sure.
>
> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> This patch applies on top of tip/sched/core
>
> kernel/sched/fair.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fcc968669aea..6d46974e9be7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
>
> if (dequeue)
> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> + else {
> + update_load_avg(qcfs_rq, se, 0);
> + se_update_runnable(se);
> + }
> +
> qcfs_rq->h_nr_running -= task_delta;
> qcfs_rq->idle_h_nr_running -= idle_task_delta;
>
> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> cfs_rq = cfs_rq_of(se);
> if (enqueue)
> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> + else {
> + update_load_avg(cfs_rq, se, 0);
> + se_update_runnable(se);
> + }
> +
> cfs_rq->h_nr_running += task_delta;
> cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-26 19:04 ` bsegall
@ 2020-02-26 21:01 ` Vincent Guittot
2020-02-27 11:20 ` Dietmar Eggemann
2020-02-27 19:13 ` bsegall
0 siblings, 2 replies; 12+ messages in thread
From: Vincent Guittot @ 2020-02-26 21:01 UTC (permalink / raw)
To: Ben Segall
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
>
> Vincent Guittot <vincent.guittot@linaro.org> writes:
>
> > When a cfs_rq is throttled, its group entity is dequeued and its running
> > tasks are removed. We must update runnable_avg with current h_nr_running
> > and update group_se->runnable_weight with new h_nr_running at each level
> > of the hierarchy.
>
> You'll also need to do this for task enqueue/dequeue inside of a
> throttled hierarchy, I'm pretty sure.
AFAICT, this is already done by the patch "sched/pelt: Add a new
runnable average signal" when a task is enqueued/dequeued inside a
throttled hierarchy.
>
> >
> > Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > This patch applies on top of tip/sched/core
> >
> > kernel/sched/fair.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index fcc968669aea..6d46974e9be7 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> >
> > if (dequeue)
> > dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> > + else {
> > + update_load_avg(qcfs_rq, se, 0);
> > + se_update_runnable(se);
> > + }
> > +
> > qcfs_rq->h_nr_running -= task_delta;
> > qcfs_rq->idle_h_nr_running -= idle_task_delta;
> >
> > @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > cfs_rq = cfs_rq_of(se);
> > if (enqueue)
> > enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> > + else {
> > + update_load_avg(cfs_rq, se, 0);
>
>
> > + se_update_runnable(se);
> > + }
> > +
> > cfs_rq->h_nr_running += task_delta;
> > cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-26 21:01 ` Vincent Guittot
@ 2020-02-27 11:20 ` Dietmar Eggemann
2020-02-27 13:10 ` Vincent Guittot
[not found] ` <20200227131228.GA5872@geo.homenetwork>
2020-02-27 19:13 ` bsegall
1 sibling, 2 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2020-02-27 11:20 UTC (permalink / raw)
To: Vincent Guittot, Ben Segall
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On 26.02.20 21:01, Vincent Guittot wrote:
> On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
>>
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>
>>> When a cfs_rq is throttled, its group entity is dequeued and its running
>>> tasks are removed. We must update runnable_avg with current h_nr_running
>>> and update group_se->runnable_weight with new h_nr_running at each level
^^^
Shouldn't this be 'current' rather than 'new' h_nr_running for
group_se->runnable_weight? IMHO, you want to cache the current value
before you add/subtract task_delta.
>>> of the hierarchy.
>>
>> You'll also need to do this for task enqueue/dequeue inside of a
>> throttled hierarchy, I'm pretty sure.
>
> AFAICT, this is already done with patch "sched/pelt: Add a new
> runnable average signal" when task is enqueued/dequeued inside a
> throttled hierarchy
>
>>
>>>
>>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>> This patch applies on top of tip/sched/core
>>>
>>> kernel/sched/fair.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index fcc968669aea..6d46974e9be7 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
>>>
>>> if (dequeue)
>>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
>>> + else {
>>> + update_load_avg(qcfs_rq, se, 0);
>>> + se_update_runnable(se);
>>> + }
>>> +
>>> qcfs_rq->h_nr_running -= task_delta;
>>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
>>>
>>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>>> cfs_rq = cfs_rq_of(se);
>>> if (enqueue)
>>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
>>> + else {
>>> + update_load_avg(cfs_rq, se, 0);
>>
>>
>>> + se_update_runnable(se);
>>> + }
>>> +
>>> cfs_rq->h_nr_running += task_delta;
>>> cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 11:20 ` Dietmar Eggemann
@ 2020-02-27 13:10 ` Vincent Guittot
2020-02-27 14:58 ` Vincent Guittot
[not found] ` <20200227131228.GA5872@geo.homenetwork>
1 sibling, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2020-02-27 13:10 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Ben Segall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 26.02.20 21:01, Vincent Guittot wrote:
> > On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
> >>
> >> Vincent Guittot <vincent.guittot@linaro.org> writes:
> >>
> >>> When a cfs_rq is throttled, its group entity is dequeued and its running
> >>> tasks are removed. We must update runnable_avg with current h_nr_running
> >>> and update group_se->runnable_weight with new h_nr_running at each level
>
> ^^^
>
> Shouldn't this be 'current' rather 'new' h_nr_running for
> group_se->runnable_weight? IMHO, you want to cache the current value
> before you add/subtract task_delta.
hmm... it can't be current in both places. In my explanation,
"current" means the current situation when we started to throttle the
cfs_rq and "new" means the new situation after we finished throttling
it. I should probably use old and new to prevent any
misunderstanding.
That being said, we need to update runnable_avg with the old
h_nr_running: the one before we started to throttle the cfs_rq, which
is the value saved in group_se->runnable_weight. Once we have updated
runnable_avg, we save the new h_nr_running in
group_se->runnable_weight, which will be used for the next updates.
>
> >>> of the hierarchy.
> >>
> >> You'll also need to do this for task enqueue/dequeue inside of a
> >> throttled hierarchy, I'm pretty sure.
> >
> > AFAICT, this is already done with patch "sched/pelt: Add a new
> > runnable average signal" when task is enqueued/dequeued inside a
> > throttled hierarchy
> >
> >>
> >>>
> >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> >>> ---
> >>> This patch applies on top of tip/sched/core
> >>>
> >>> kernel/sched/fair.c | 10 ++++++++++
> >>> 1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index fcc968669aea..6d46974e9be7 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> >>>
> >>> if (dequeue)
> >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> >>> + else {
> >>> + update_load_avg(qcfs_rq, se, 0);
> >>> + se_update_runnable(se);
> >>> + }
> >>> +
> >>> qcfs_rq->h_nr_running -= task_delta;
> >>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
> >>>
> >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> >>> cfs_rq = cfs_rq_of(se);
> >>> if (enqueue)
> >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> >>> + else {
> >>> + update_load_avg(cfs_rq, se, 0);
> >>
> >>
> >>> + se_update_runnable(se);
> >>> + }
> >>> +
> >>> cfs_rq->h_nr_running += task_delta;
> >>> cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
[not found] ` <20200227131228.GA5872@geo.homenetwork>
@ 2020-02-27 13:12 ` Vincent Guittot
2020-02-27 15:15 ` Dietmar Eggemann
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2020-02-27 13:12 UTC (permalink / raw)
To: Tao Zhou
Cc: Dietmar Eggemann, Ben Segall, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld,
Parth Shah, Valentin Schneider, Hillf Danton
On Thu, 27 Feb 2020 at 14:10, Tao Zhou <zhout@vivaldi.net> wrote:
>
> Hi Dietmar,
>
> On Thu, Feb 27, 2020 at 11:20:05AM +0000, Dietmar Eggemann wrote:
> > On 26.02.20 21:01, Vincent Guittot wrote:
> > > On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
> > >>
> > >> Vincent Guittot <vincent.guittot@linaro.org> writes:
> > >>
> > >>> When a cfs_rq is throttled, its group entity is dequeued and its running
> > >>> tasks are removed. We must update runnable_avg with current h_nr_running
> > >>> and update group_se->runnable_weight with new h_nr_running at each level
> >
> > ^^^
> >
> > Shouldn't this be 'current' rather 'new' h_nr_running for
> > group_se->runnable_weight? IMHO, you want to cache the current value
> > before you add/subtract task_delta.
>
> /me thinks Vincent is right. h_nr_running is updated at the previous
> level or outside. The next level will use the current h_nr_running to
> update runnable_avg, and the new group cfs_rq's h_nr_running (updated
> at the previous level or outside) to update the se runnable_weight.
Yes that's what I want to say
>
> update_h_nr_running (E.G. in enqueue_task_fair)
>
> throttle_cfs_rq()
> for() {
> update_load_avg()
> se_runnable() --> current h_nr_running
> update_se_runnable() --> use group cfs_rq h_nr_running
> updated in the prev level or out
> update_h_nr_running
> }
>
> Thanks,
> Tao
>
> > >>> of the hierarchy.
> > >>
> > >> You'll also need to do this for task enqueue/dequeue inside of a
> > >> throttled hierarchy, I'm pretty sure.
> > >
> > > AFAICT, this is already done with patch "sched/pelt: Add a new
> > > runnable average signal" when task is enqueued/dequeued inside a
> > > throttled hierarchy
> > >
> > >>
> > >>>
> > >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> > >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > >>> ---
> > >>> This patch applies on top of tip/sched/core
> > >>>
> > >>> kernel/sched/fair.c | 10 ++++++++++
> > >>> 1 file changed, 10 insertions(+)
> > >>>
> > >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > >>> index fcc968669aea..6d46974e9be7 100644
> > >>> --- a/kernel/sched/fair.c
> > >>> +++ b/kernel/sched/fair.c
> > >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> > >>>
> > >>> if (dequeue)
> > >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> > >>> + else {
> > >>> + update_load_avg(qcfs_rq, se, 0);
> > >>> + se_update_runnable(se);
> > >>> + }
> > >>> +
> > >>> qcfs_rq->h_nr_running -= task_delta;
> > >>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
> > >>>
> > >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > >>> cfs_rq = cfs_rq_of(se);
> > >>> if (enqueue)
> > >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> > >>> + else {
> > >>> + update_load_avg(cfs_rq, se, 0);
> > >>
> > >>
> > >>> + se_update_runnable(se);
> > >>> + }
> > >>> +
> > >>> cfs_rq->h_nr_running += task_delta;
> > >>> cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 13:10 ` Vincent Guittot
@ 2020-02-27 14:58 ` Vincent Guittot
2020-02-27 15:17 ` Dietmar Eggemann
2020-02-27 15:34 ` Phil Auld
0 siblings, 2 replies; 12+ messages in thread
From: Vincent Guittot @ 2020-02-27 14:58 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Ben Segall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On Thu, 27 Feb 2020 at 14:10, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> >
> > On 26.02.20 21:01, Vincent Guittot wrote:
> > > On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
> > >>
> > >> Vincent Guittot <vincent.guittot@linaro.org> writes:
> > >>
> > >>> When a cfs_rq is throttled, its group entity is dequeued and its running
> > >>> tasks are removed. We must update runnable_avg with current h_nr_running
> > >>> and update group_se->runnable_weight with new h_nr_running at each level
> >
> > ^^^
> >
> > Shouldn't this be 'current' rather 'new' h_nr_running for
> > group_se->runnable_weight? IMHO, you want to cache the current value
> > before you add/subtract task_delta.
>
> hmm... it can't be current in both places. In my explanation,
> "current" means the current situation when we started to throttle cfs
> and "new" means the new situation after we finished to throttle the
> cfs. I should probably use old and new to prevent any
> misunderstanding.
I'm about to send a new version with some minor fixes: the if
statement should have braces { } since the else branch has them.
Would it be better for you if I use old and new instead of current and
new in the commit message?
>
> That being said, we need to update runnable_avg with the old
> h_nr_running: the one before we started to throttle the cfs which is
> the value saved in group_se->runnable_weight. Once we have updated
> runnable_avg, we save the new h_nr_running in
> group_se->runnable_weight that will be used for next updates.
>
> >
> > >>> of the hierarchy.
> > >>
> > >> You'll also need to do this for task enqueue/dequeue inside of a
> > >> throttled hierarchy, I'm pretty sure.
> > >
> > > AFAICT, this is already done with patch "sched/pelt: Add a new
> > > runnable average signal" when task is enqueued/dequeued inside a
> > > throttled hierarchy
> > >
> > >>
> > >>>
> > >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> > >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > >>> ---
> > >>> This patch applies on top of tip/sched/core
> > >>>
> > >>> kernel/sched/fair.c | 10 ++++++++++
> > >>> 1 file changed, 10 insertions(+)
> > >>>
> > >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > >>> index fcc968669aea..6d46974e9be7 100644
> > >>> --- a/kernel/sched/fair.c
> > >>> +++ b/kernel/sched/fair.c
> > >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> > >>>
> > >>> if (dequeue)
> > >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> > >>> + else {
> > >>> + update_load_avg(qcfs_rq, se, 0);
> > >>> + se_update_runnable(se);
> > >>> + }
> > >>> +
> > >>> qcfs_rq->h_nr_running -= task_delta;
> > >>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
> > >>>
> > >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > >>> cfs_rq = cfs_rq_of(se);
> > >>> if (enqueue)
> > >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> > >>> + else {
> > >>> + update_load_avg(cfs_rq, se, 0);
> > >>
> > >>
> > >>> + se_update_runnable(se);
> > >>> + }
> > >>> +
> > >>> cfs_rq->h_nr_running += task_delta;
> > >>> cfs_rq->idle_h_nr_running += idle_task_delta;
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 13:12 ` Vincent Guittot
@ 2020-02-27 15:15 ` Dietmar Eggemann
0 siblings, 0 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2020-02-27 15:15 UTC (permalink / raw)
To: Vincent Guittot, Tao Zhou
Cc: Ben Segall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton
On 27.02.20 13:12, Vincent Guittot wrote:
> On Thu, 27 Feb 2020 at 14:10, Tao Zhou <zhout@vivaldi.net> wrote:
>>
>> Hi Dietmar,
>>
>> On Thu, Feb 27, 2020 at 11:20:05AM +0000, Dietmar Eggemann wrote:
>>> On 26.02.20 21:01, Vincent Guittot wrote:
>>>> On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
>>>>>
>>>>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>>>>
>>>>>> When a cfs_rq is throttled, its group entity is dequeued and its running
>>>>>> tasks are removed. We must update runnable_avg with current h_nr_running
>>>>>> and update group_se->runnable_weight with new h_nr_running at each level
>>>
>>> ^^^
>>>
>>> Shouldn't this be 'current' rather 'new' h_nr_running for
>>> group_se->runnable_weight? IMHO, you want to cache the current value
>>> before you add/subtract task_delta.
>>
>> /me think Vincent is right. h_nr_running is updated in the previous
>> level or out. The next level will use current h_nr_running to update
>> runnable_avg and use the new group cfs_rq's h_nr_running which was
>> updated in the previous level or out to update se runnable_weight.
Ah OK, 'old' as in the 'old' cached value se->runnable_weight, and 'new' as
the 'new' se->runnable_weight, which gets updated *after* update_load_avg
and before +/- task_delta.
So when we throttle e.g. /tg1/tg11
previous level is: /tg1/tg11
next level: /tg1
loop for /tg1:
for_each_sched_entity(se)
cfs_rq = cfs_rq_of(se);
update_load_avg(cfs_rq, se ...) <-- uses 'old' se->runnable_weight
se->runnable_weight = se->my_q->h_nr_running <-- 'new' value
(updated in previous
level, group cfs_rq)
[...]
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 14:58 ` Vincent Guittot
@ 2020-02-27 15:17 ` Dietmar Eggemann
2020-02-27 15:34 ` Phil Auld
1 sibling, 0 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2020-02-27 15:17 UTC (permalink / raw)
To: Vincent Guittot
Cc: Ben Segall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Steven Rostedt, Mel Gorman, linux-kernel, Phil Auld, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On 27.02.20 14:58, Vincent Guittot wrote:
> On Thu, 27 Feb 2020 at 14:10, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
>>
>> On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>>
>>> On 26.02.20 21:01, Vincent Guittot wrote:
>>>> On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
>>>>>
>>>>> Vincent Guittot <vincent.guittot@linaro.org> writes:
[...]
>>> Shouldn't this be 'current' rather 'new' h_nr_running for
>>> group_se->runnable_weight? IMHO, you want to cache the current value
>>> before you add/subtract task_delta.
>>
>> hmm... it can't be current in both places. In my explanation,
>> "current" means the current situation when we started to throttle cfs
>> and "new" means the new situation after we finished to throttle the
>> cfs. I should probably use old and new to prevent any
>> misunderstanding.
>
> I'm about to send a new version to fix some minor changes: The if
> statement should have some { } as there are some on the else part
>
> Would it be better for you if i use old and new instead of current and
> new in the commit message ?
Personally yes, but now I understand the other wording as well. Thanks.
[...]
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 14:58 ` Vincent Guittot
2020-02-27 15:17 ` Dietmar Eggemann
@ 2020-02-27 15:34 ` Phil Auld
2020-02-27 15:39 ` Vincent Guittot
1 sibling, 1 reply; 12+ messages in thread
From: Phil Auld @ 2020-02-27 15:34 UTC (permalink / raw)
To: Vincent Guittot
Cc: Dietmar Eggemann, Ben Segall, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Steven Rostedt, Mel Gorman, linux-kernel, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On Thu, Feb 27, 2020 at 03:58:02PM +0100 Vincent Guittot wrote:
> On Thu, 27 Feb 2020 at 14:10, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> >
> > On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> > >
> > > On 26.02.20 21:01, Vincent Guittot wrote:
> > > > On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
> > > >>
> > > >> Vincent Guittot <vincent.guittot@linaro.org> writes:
> > > >>
> > > >>> When a cfs_rq is throttled, its group entity is dequeued and its running
> > > >>> tasks are removed. We must update runnable_avg with current h_nr_running
> > > >>> and update group_se->runnable_weight with new h_nr_running at each level
> > >
> > > ^^^
> > >
> > > Shouldn't this be 'current' rather 'new' h_nr_running for
> > > group_se->runnable_weight? IMHO, you want to cache the current value
> > > before you add/subtract task_delta.
> >
> > hmm... it can't be current in both places. In my explanation,
> > "current" means the current situation when we started to throttle cfs
> > and "new" means the new situation after we finished to throttle the
> > cfs. I should probably use old and new to prevent any
> > misunderstanding.
>
> I'm about to send a new version to fix some minor changes: The if
> statement should have some { } as there are some on the else part
>
> Would it be better for you if i use old and new instead of current and
> new in the commit message ?
>
Seems better to me. You could also consider "the old" and "the new".
Cheers,
Phil
> >
> > That being said, we need to update runnable_avg with the old
> > h_nr_running: the one before we started to throttle the cfs which is
> > the value saved in group_se->runnable_weight. Once we have updated
> > runnable_avg, we save the new h_nr_running in
> > group_se->runnable_weight that will be used for next updates.
> >
> > >
> > > >>> of the hierarchy.
> > > >>
> > > >> You'll also need to do this for task enqueue/dequeue inside of a
> > > >> throttled hierarchy, I'm pretty sure.
> > > >
> > > > AFAICT, this is already done with patch "sched/pelt: Add a new
> > > > runnable average signal" when task is enqueued/dequeued inside a
> > > > throttled hierarchy
> > > >
> > > >>
> > > >>>
> > > >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> > > >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > > >>> ---
> > > >>> This patch applies on top of tip/sched/core
> > > >>>
> > > >>> kernel/sched/fair.c | 10 ++++++++++
> > > >>> 1 file changed, 10 insertions(+)
> > > >>>
> > > >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > >>> index fcc968669aea..6d46974e9be7 100644
> > > >>> --- a/kernel/sched/fair.c
> > > >>> +++ b/kernel/sched/fair.c
> > > >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> > > >>>
> > > >>> if (dequeue)
> > > >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> > > >>> + else {
> > > >>> + update_load_avg(qcfs_rq, se, 0);
> > > >>> + se_update_runnable(se);
> > > >>> + }
> > > >>> +
> > > >>> qcfs_rq->h_nr_running -= task_delta;
> > > >>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
> > > >>>
> > > >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > > >>> cfs_rq = cfs_rq_of(se);
> > > >>> if (enqueue)
> > > >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> > > >>> + else {
> > > >>> + update_load_avg(cfs_rq, se, 0);
> > > >>
> > > >>
> > > >>> + se_update_runnable(se);
> > > >>> + }
> > > >>> +
> > > >>> cfs_rq->h_nr_running += task_delta;
> > > >>> cfs_rq->idle_h_nr_running += idle_task_delta;
>
--
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-27 15:34 ` Phil Auld
@ 2020-02-27 15:39 ` Vincent Guittot
0 siblings, 0 replies; 12+ messages in thread
From: Vincent Guittot @ 2020-02-27 15:39 UTC (permalink / raw)
To: Phil Auld
Cc: Dietmar Eggemann, Ben Segall, Ingo Molnar, Peter Zijlstra,
Juri Lelli, Steven Rostedt, Mel Gorman, linux-kernel, Parth Shah,
Valentin Schneider, Hillf Danton, zhout
On Thu, 27 Feb 2020 at 16:34, Phil Auld <pauld@redhat.com> wrote:
>
> On Thu, Feb 27, 2020 at 03:58:02PM +0100 Vincent Guittot wrote:
> > On Thu, 27 Feb 2020 at 14:10, Vincent Guittot
> > <vincent.guittot@linaro.org> wrote:
> > >
> > > On Thu, 27 Feb 2020 at 12:20, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> > > >
> > > > On 26.02.20 21:01, Vincent Guittot wrote:
> > > > > On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
> > > > >>
> > > > >> Vincent Guittot <vincent.guittot@linaro.org> writes:
> > > > >>
> > > > >>> When a cfs_rq is throttled, its group entity is dequeued and its running
> > > > >>> tasks are removed. We must update runnable_avg with current h_nr_running
> > > > >>> and update group_se->runnable_weight with new h_nr_running at each level
> > > >
> > > > ^^^
> > > >
> > > > Shouldn't this be 'current' rather 'new' h_nr_running for
> > > > group_se->runnable_weight? IMHO, you want to cache the current value
> > > > before you add/subtract task_delta.
> > >
> > > hmm... it can't be current in both places. In my explanation,
> > > "current" means the current situation when we started to throttle cfs
> > > and "new" means the new situation after we finished to throttle the
> > > cfs. I should probably use old and new to prevent any
> > > misunderstanding.
> >
> > I'm about to send a new version to fix some minor changes: The if
> > statement should have some { } as there are some on the else part
> >
> > Would it be better for you if i use old and new instead of current and
> > new in the commit message ?
> >
>
> Seems better to me. You could also consider "the old" and "the new".
ok, will do
>
> Cheers,
> Phil
>
> > >
> > > That being said, we need to update runnable_avg with the old
> > > h_nr_running: the one before we started to throttle the cfs which is
> > > the value saved in group_se->runnable_weight. Once we have updated
> > > runnable_avg, we save the new h_nr_running in
> > > group_se->runnable_weight that will be used for next updates.
> > >
> > > >
> > > > >>> of the hierarchy.
> > > > >>
> > > > >> You'll also need to do this for task enqueue/dequeue inside of a
> > > > >> throttled hierarchy, I'm pretty sure.
> > > > >
> > > > > AFAICT, this is already done with patch "sched/pelt: Add a new
> > > > > runnable average signal" when task is enqueued/dequeued inside a
> > > > > throttled hierarchy
> > > > >
> > > > >>
> > > > >>>
> > > > >>> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
> > > > >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > > > >>> ---
> > > > >>> This patch applies on top of tip/sched/core
> > > > >>>
> > > > >>> kernel/sched/fair.c | 10 ++++++++++
> > > > >>> 1 file changed, 10 insertions(+)
> > > > >>>
> > > > >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > >>> index fcc968669aea..6d46974e9be7 100644
> > > > >>> --- a/kernel/sched/fair.c
> > > > >>> +++ b/kernel/sched/fair.c
> > > > >>> @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
> > > > >>>
> > > > >>> if (dequeue)
> > > > >>> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
> > > > >>> + else {
> > > > >>> + update_load_avg(qcfs_rq, se, 0);
> > > > >>> + se_update_runnable(se);
> > > > >>> + }
> > > > >>> +
> > > > >>> qcfs_rq->h_nr_running -= task_delta;
> > > > >>> qcfs_rq->idle_h_nr_running -= idle_task_delta;
> > > > >>>
> > > > >>> @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> > > > >>> cfs_rq = cfs_rq_of(se);
> > > > >>> if (enqueue)
> > > > >>> enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> > > > >>> + else {
> > > > >>> + update_load_avg(cfs_rq, se, 0);
> > > > >>
> > > > >>
> > > > >>> + se_update_runnable(se);
> > > > >>> + }
> > > > >>> +
> > > > >>> cfs_rq->h_nr_running += task_delta;
> > > > >>> cfs_rq->idle_h_nr_running += idle_task_delta;
> >
>
> --
>
* Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
2020-02-26 21:01 ` Vincent Guittot
2020-02-27 11:20 ` Dietmar Eggemann
@ 2020-02-27 19:13 ` bsegall
1 sibling, 0 replies; 12+ messages in thread
From: bsegall @ 2020-02-27 19:13 UTC (permalink / raw)
To: Vincent Guittot
Cc: Ben Segall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Dietmar Eggemann, Steven Rostedt, Mel Gorman, linux-kernel,
Phil Auld, Parth Shah, Valentin Schneider, Hillf Danton, zhout
Vincent Guittot <vincent.guittot@linaro.org> writes:
> On Wed, 26 Feb 2020 at 20:04, <bsegall@google.com> wrote:
>>
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>
>> > When a cfs_rq is throttled, its group entity is dequeued and its running
>> > tasks are removed. We must update runnable_avg with current h_nr_running
>> > and update group_se->runnable_weight with new h_nr_running at each level
>> > of the hierarchy.
>>
>> You'll also need to do this for task enqueue/dequeue inside of a
>> throttled hierarchy, I'm pretty sure.
>
> AFAICT, this is already done with patch "sched/pelt: Add a new
> runnable average signal" when task is enqueued/dequeued inside a
> throttled hierarchy
Right, I misread what exactly was going on.
>
>>
>> >
>> > Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal")
>> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> > ---
>> > This patch applies on top of tip/sched/core
>> >
>> > kernel/sched/fair.c | 10 ++++++++++
>> > 1 file changed, 10 insertions(+)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index fcc968669aea..6d46974e9be7 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -4703,6 +4703,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
>> >
>> > if (dequeue)
>> > dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
>> > + else {
>> > + update_load_avg(qcfs_rq, se, 0);
>> > + se_update_runnable(se);
>> > + }
>> > +
>> > qcfs_rq->h_nr_running -= task_delta;
>> > qcfs_rq->idle_h_nr_running -= idle_task_delta;
>> >
>> > @@ -4772,6 +4777,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>> > cfs_rq = cfs_rq_of(se);
>> > if (enqueue)
>> > enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
>> > + else {
>> > + update_load_avg(cfs_rq, se, 0);
>>
>>
>> > + se_update_runnable(se);
>> > + }
>> > +
>> > cfs_rq->h_nr_running += task_delta;
>> > cfs_rq->idle_h_nr_running += idle_task_delta;
end of thread, other threads:[~2020-02-27 19:14 UTC | newest]
Thread overview: 12+ messages
-- links below jump to the message on this page --
2020-02-26 18:16 [PATCH] sched/fair: fix runnable_avg for throttled cfs Vincent Guittot
2020-02-26 19:04 ` bsegall
2020-02-26 21:01 ` Vincent Guittot
2020-02-27 11:20 ` Dietmar Eggemann
2020-02-27 13:10 ` Vincent Guittot
2020-02-27 14:58 ` Vincent Guittot
2020-02-27 15:17 ` Dietmar Eggemann
2020-02-27 15:34 ` Phil Auld
2020-02-27 15:39 ` Vincent Guittot
[not found] ` <20200227131228.GA5872@geo.homenetwork>
2020-02-27 13:12 ` Vincent Guittot
2020-02-27 15:15 ` Dietmar Eggemann
2020-02-27 19:13 ` bsegall