* [PATCH] sched/pelt: fix false running accounting in PELT
@ 2017-06-30 13:58 Vincent Guittot
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Vincent Guittot @ 2017-06-30 13:58 UTC (permalink / raw)
To: peterz, mingo, linux-kernel
Cc: dietmar.eggemann, Morten.Rasmussen, Vincent Guittot
The running state is a subset of runnable state which means that running
can't be set if runnable (weight) is cleared. There are corner cases
where the current sched_entity has been already dequeued but cfs_rq->curr
has not been updated yet and still points to the dequeued sched_entity.
If ___update_load_avg is called at that time, weight will be 0 and running
will be set which is not possible.
This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
The current sched_entity has been dequeued so se->on_rq is cleared and
cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
cleared when picking the idle thread). Because the cfs_rq becomes idle,
idle_balance() is called and ends up to call update_blocked_averages()
with these wrong running and runnable states.
Add a test in ___update_load_avg to correct the running state in this case.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/fair.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 008c514..5fdcb42 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2968,6 +2968,24 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
sa->last_update_time += delta << 10;
/*
+ * running is a subset of runnable (weight) so running can't be set if
+ * runnable is clear. But there are some corner cases where the current
+ * se has been already dequeued but cfs_rq->curr still points to it.
+ * This means that weight will be 0 but not running for a sched_entity
+ * but also for a cfs_rq if the latter becomes idle. As an example,
+ * this happens during idle_balance() which calls
+ * update_blocked_averages()
+ */
+ if (weight)
+ running = 1;
+
+ /*
+ * Scale time to reflect the amount a computation effectively done
+ * during the time slot at current capacity
+ */
+ delta = scale_time(delta, cpu, sa, weight, running);
+
+ /*
* Now we know we crossed measurement unit boundaries. The *_avg
* accrues by two steps:
*
--
2.7.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2] sched/pelt: fix false running accounting
2017-06-30 13:58 [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
@ 2017-07-01 5:06 ` Vincent Guittot
2017-07-04 7:27 ` Peter Zijlstra
2017-08-10 12:07 ` [tip:sched/core] sched/pelt: Fix " tip-bot for Vincent Guittot
2017-07-01 5:09 ` [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
2017-07-01 16:57 ` kbuild test robot
2 siblings, 2 replies; 12+ messages in thread
From: Vincent Guittot @ 2017-07-01 5:06 UTC (permalink / raw)
To: peterz, mingo, linux-kernel
Cc: dietmar.eggemann, Morten.Rasmussen, Vincent Guittot
The running state is a subset of runnable state which means that running
can't be set if runnable (weight) is cleared. There are corner cases
where the current sched_entity has been already dequeued but cfs_rq->curr
has not been updated yet and still points to the dequeued sched_entity.
If ___update_load_avg is called at that time, weight will be 0 and running
will be set which is not possible.
This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
The current sched_entity has been dequeued so se->on_rq is cleared and
cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
cleared when picking the idle thread). Because the cfs_rq becomes idle,
idle_balance() is called and ends up to call update_blocked_averages()
with these wrong running and runnable states.
Add a test in ___update_load_avg to correct the running state in this case.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/fair.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 008c514..bc36a75 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2968,6 +2968,18 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
sa->last_update_time += delta << 10;
/*
+ * running is a subset of runnable (weight) so running can't be set if
+ * runnable is clear. But there are some corner cases where the current
+ * se has been already dequeued but cfs_rq->curr still points to it.
+ * This means that weight will be 0 but not running for a sched_entity
+ * but also for a cfs_rq if the latter becomes idle. As an example,
+ * this happens during idle_balance() which calls
+ * update_blocked_averages()
+ */
+ if (!weight)
+ running = 0;
+
+ /*
* Now we know we crossed measurement unit boundaries. The *_avg
* accrues by two steps:
*
--
2.7.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] sched/pelt: fix false running accounting in PELT
2017-06-30 13:58 [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
@ 2017-07-01 5:09 ` Vincent Guittot
2017-07-01 16:57 ` kbuild test robot
2 siblings, 0 replies; 12+ messages in thread
From: Vincent Guittot @ 2017-07-01 5:09 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, linux-kernel
Cc: Dietmar Eggemann, Morten Rasmussen, Vincent Guittot
On 30 June 2017 at 15:58, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> The running state is a subset of runnable state which means that running
> can't be set if runnable (weight) is cleared. There are corner cases
> where the current sched_entity has been already dequeued but cfs_rq->curr
> has not been updated yet and still points to the dequeued sched_entity.
> If ___update_load_avg is called at that time, weight will be 0 and running
> will be set which is not possible.
>
> This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
> The current sched_entity has been dequeued so se->on_rq is cleared and
> cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> cleared when picking the idle thread). Because the cfs_rq becomes idle,
> idle_balance() is called and ends up to call update_blocked_averages()
> with these wrong running and runnable states.
>
> Add a test in ___update_load_avg to correct the running state in this case.*
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
The v1 is just wrong. I have sent a v2 with correct patch
sorry for the disturb
Vincent
> ---
> kernel/sched/fair.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 008c514..5fdcb42 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2968,6 +2968,24 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> sa->last_update_time += delta << 10;
>
> /*
> + * running is a subset of runnable (weight) so running can't be set if
> + * runnable is clear. But there are some corner cases where the current
> + * se has been already dequeued but cfs_rq->curr still points to it.
> + * This means that weight will be 0 but not running for a sched_entity
> + * but also for a cfs_rq if the latter becomes idle. As an example,
> + * this happens during idle_balance() which calls
> + * update_blocked_averages()
> + */
> + if (weight)
> + running = 1;
> +
> + /*
> + * Scale time to reflect the amount a computation effectively done
> + * during the time slot at current capacity
> + */
> + delta = scale_time(delta, cpu, sa, weight, running);
> +
> + /*
> * Now we know we crossed measurement unit boundaries. The *_avg
> * accrues by two steps:
> *
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] sched/pelt: fix false running accounting in PELT
2017-06-30 13:58 [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
2017-07-01 5:09 ` [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
@ 2017-07-01 16:57 ` kbuild test robot
2 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2017-07-01 16:57 UTC (permalink / raw)
To: Vincent Guittot
Cc: kbuild-all, peterz, mingo, linux-kernel, dietmar.eggemann,
Morten.Rasmussen, Vincent Guittot
[-- Attachment #1: Type: text/plain, Size: 1424 bytes --]
Hi Vincent,
[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.12-rc7 next-20170630]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Vincent-Guittot/sched-pelt-fix-false-running-accounting-in-PELT/20170702-002907
config: i386-randconfig-x078-07012120 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
kernel/sched/fair.c: In function '___update_load_avg':
>> kernel/sched/fair.c:2986:10: error: implicit declaration of function 'scale_time' [-Werror=implicit-function-declaration]
delta = scale_time(delta, cpu, sa, weight, running);
^~~~~~~~~~
cc1: some warnings being treated as errors
vim +/scale_time +2986 kernel/sched/fair.c
2980 running = 1;
2981
2982 /*
2983 * Scale time to reflect the amount a computation effectively done
2984 * during the time slot at current capacity
2985 */
> 2986 delta = scale_time(delta, cpu, sa, weight, running);
2987
2988 /*
2989 * Now we know we crossed measurement unit boundaries. The *_avg
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 22764 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
@ 2017-07-04 7:27 ` Peter Zijlstra
2017-07-04 7:37 ` Vincent Guittot
2017-07-04 8:34 ` Peter Zijlstra
2017-08-10 12:07 ` [tip:sched/core] sched/pelt: Fix " tip-bot for Vincent Guittot
1 sibling, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2017-07-04 7:27 UTC (permalink / raw)
To: Vincent Guittot; +Cc: mingo, linux-kernel, dietmar.eggemann, Morten.Rasmussen
On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
> The running state is a subset of runnable state which means that running
> can't be set if runnable (weight) is cleared. There are corner cases
> where the current sched_entity has been already dequeued but cfs_rq->curr
> has not been updated yet and still points to the dequeued sched_entity.
> If ___update_load_avg is called at that time, weight will be 0 and running
> will be set which is not possible.
>
> This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
> The current sched_entity has been dequeued so se->on_rq is cleared and
> cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> cleared when picking the idle thread). Because the cfs_rq becomes idle,
> idle_balance() is called and ends up to call update_blocked_averages()
> with these wrong running and runnable states.
>
> Add a test in ___update_load_avg to correct the running state in this case.
Cute, however did you find that ?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 7:27 ` Peter Zijlstra
@ 2017-07-04 7:37 ` Vincent Guittot
2017-07-04 8:34 ` Peter Zijlstra
1 sibling, 0 replies; 12+ messages in thread
From: Vincent Guittot @ 2017-07-04 7:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, Dietmar Eggemann, Morten Rasmussen
On 4 July 2017 at 09:27, Peter Zijlstra <peterz@infradead.org> wrote:
> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
>> The running state is a subset of runnable state which means that running
>> can't be set if runnable (weight) is cleared. There are corner cases
>> where the current sched_entity has been already dequeued but cfs_rq->curr
>> has not been updated yet and still points to the dequeued sched_entity.
>> If ___update_load_avg is called at that time, weight will be 0 and running
>> will be set which is not possible.
>>
>> This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
>> The current sched_entity has been dequeued so se->on_rq is cleared and
>> cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
>> cleared when picking the idle thread). Because the cfs_rq becomes idle,
>> idle_balance() is called and ends up to call update_blocked_averages()
>> with these wrong running and runnable states.
>>
>> Add a test in ___update_load_avg to correct the running state in this case.
>
> Cute, however did you find that ?
In fact, while rebasing and running more tests on my patch "update
scale invariance of PELT" that changes how to scale the load and
utilization, I have seen that sometimes the utilization was increasing
but not the load when CPU was going into idle state because the
stolen_idle time was applied as idle time for load but running time
for utilization. This patch has highlighted the problem.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 7:27 ` Peter Zijlstra
2017-07-04 7:37 ` Vincent Guittot
@ 2017-07-04 8:34 ` Peter Zijlstra
2017-07-04 9:12 ` Vincent Guittot
1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2017-07-04 8:34 UTC (permalink / raw)
To: Vincent Guittot; +Cc: mingo, linux-kernel, dietmar.eggemann, Morten.Rasmussen
On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
> > The running state is a subset of runnable state which means that running
> > can't be set if runnable (weight) is cleared. There are corner cases
> > where the current sched_entity has been already dequeued but cfs_rq->curr
> > has not been updated yet and still points to the dequeued sched_entity.
> > If ___update_load_avg is called at that time, weight will be 0 and running
> > will be set which is not possible.
> >
> > This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
> > The current sched_entity has been dequeued so se->on_rq is cleared and
> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
> > idle_balance() is called and ends up to call update_blocked_averages()
> > with these wrong running and runnable states.
> >
> > Add a test in ___update_load_avg to correct the running state in this case.
>
> Cute, however did you find that ?
Hmm,.. could you give a little more detail?
Because if ->on_rq=0, we'll have done dequeue_task() which will have
done update_curr() with ->on_rq, weight and ->running consistently.
Then the above, inconsistent update should not happen, because delta=0.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 8:34 ` Peter Zijlstra
@ 2017-07-04 9:12 ` Vincent Guittot
2017-07-04 9:44 ` Peter Zijlstra
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2017-07-04 9:12 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, Dietmar Eggemann, Morten Rasmussen
On 4 July 2017 at 10:34, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
>> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
>> > The running state is a subset of runnable state which means that running
>> > can't be set if runnable (weight) is cleared. There are corner cases
>> > where the current sched_entity has been already dequeued but cfs_rq->curr
>> > has not been updated yet and still points to the dequeued sched_entity.
>> > If ___update_load_avg is called at that time, weight will be 0 and running
>> > will be set which is not possible.
>> >
>> > This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
>> > The current sched_entity has been dequeued so se->on_rq is cleared and
>> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
>> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
>> > idle_balance() is called and ends up to call update_blocked_averages()
>> > with these wrong running and runnable states.
>> >
>> > Add a test in ___update_load_avg to correct the running state in this case.
>>
>> Cute, however did you find that ?
>
> Hmm,.. could you give a little more detail?
>
> Because if ->on_rq=0, we'll have done dequeue_task() which will have
> done update_curr() with ->on_rq, weight and ->running consistently.
>
> Then the above, inconsistent update should not happen, because delta=0.
In fact, the delta between dequeue_entity_load_avg() and
update_blocked_averages() is not 0 on my platform (hikey) but can be
longer than 60us (at lowest frequency with only 1 task group level)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 9:12 ` Vincent Guittot
@ 2017-07-04 9:44 ` Peter Zijlstra
2017-07-04 9:57 ` Vincent Guittot
0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2017-07-04 9:44 UTC (permalink / raw)
To: Vincent Guittot
Cc: Ingo Molnar, linux-kernel, Dietmar Eggemann, Morten Rasmussen
On Tue, Jul 04, 2017 at 11:12:34AM +0200, Vincent Guittot wrote:
> On 4 July 2017 at 10:34, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
> >> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
> >> > The running state is a subset of runnable state which means that running
> >> > can't be set if runnable (weight) is cleared. There are corner cases
> >> > where the current sched_entity has been already dequeued but cfs_rq->curr
> >> > has not been updated yet and still points to the dequeued sched_entity.
> >> > If ___update_load_avg is called at that time, weight will be 0 and running
> >> > will be set which is not possible.
> >> >
> >> > This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
> >> > The current sched_entity has been dequeued so se->on_rq is cleared and
> >> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
> >> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
> >> > idle_balance() is called and ends up to call update_blocked_averages()
> >> > with these wrong running and runnable states.
> >> >
> >> > Add a test in ___update_load_avg to correct the running state in this case.
> >>
> >> Cute, however did you find that ?
> >
> > Hmm,.. could you give a little more detail?
> >
> > Because if ->on_rq=0, we'll have done dequeue_task() which will have
> > done update_curr() with ->on_rq, weight and ->running consistently.
> >
> > Then the above, inconsistent update should not happen, because delta=0.
>
> In fact, the delta between dequeue_entity_load_avg() and
> update_blocked_averages() is not 0 on my platform (hikey) but can be
> longer than 60us (at lowest frequency with only 1 task group level)
But but but, how can that happen? Should it not all be under the same
rq->lock and thus have only a single update_rq_clock() and thus be at
the same 'instant' ?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 9:44 ` Peter Zijlstra
@ 2017-07-04 9:57 ` Vincent Guittot
2017-07-04 11:07 ` Peter Zijlstra
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2017-07-04 9:57 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, Dietmar Eggemann, Morten Rasmussen
On 4 July 2017 at 11:44, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jul 04, 2017 at 11:12:34AM +0200, Vincent Guittot wrote:
>> On 4 July 2017 at 10:34, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Tue, Jul 04, 2017 at 09:27:07AM +0200, Peter Zijlstra wrote:
>> >> On Sat, Jul 01, 2017 at 07:06:13AM +0200, Vincent Guittot wrote:
>> >> > The running state is a subset of runnable state which means that running
>> >> > can't be set if runnable (weight) is cleared. There are corner cases
>> >> > where the current sched_entity has been already dequeued but cfs_rq->curr
>> >> > has not been updated yet and still points to the dequeued sched_entity.
>> >> > If ___update_load_avg is called at that time, weight will be 0 and running
>> >> > will be set which is not possible.
>> >> >
>> >> > This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
>> >> > The current sched_entity has been dequeued so se->on_rq is cleared and
>> >> > cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
>> >> > cleared when picking the idle thread). Because the cfs_rq becomes idle,
>> >> > idle_balance() is called and ends up to call update_blocked_averages()
>> >> > with these wrong running and runnable states.
>> >> >
>> >> > Add a test in ___update_load_avg to correct the running state in this case.
>> >>
>> >> Cute, however did you find that ?
>> >
>> > Hmm,.. could you give a little more detail?
>> >
>> > Because if ->on_rq=0, we'll have done dequeue_task() which will have
>> > done update_curr() with ->on_rq, weight and ->running consistently.
>> >
>> > Then the above, inconsistent update should not happen, because delta=0.
>>
>> In fact, the delta between dequeue_entity_load_avg() and
>> update_blocked_averages() is not 0 on my platform (hikey) but can be
>> longer than 60us (at lowest frequency with only 1 task group level)
>
> But but but, how can that happen? Should it not all be under the same
> rq->lock and thus have only a single update_rq_clock() and thus be at
> the same 'instant' ?
idle_balance() unlock rq->lock before calling update_blocked_averages
And update_blocked_averages() starts by calling update_rq_clock()
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] sched/pelt: fix false running accounting
2017-07-04 9:57 ` Vincent Guittot
@ 2017-07-04 11:07 ` Peter Zijlstra
0 siblings, 0 replies; 12+ messages in thread
From: Peter Zijlstra @ 2017-07-04 11:07 UTC (permalink / raw)
To: Vincent Guittot
Cc: Ingo Molnar, linux-kernel, Dietmar Eggemann, Morten Rasmussen
On Tue, Jul 04, 2017 at 11:57:12AM +0200, Vincent Guittot wrote:
> On 4 July 2017 at 11:44, Peter Zijlstra <peterz@infradead.org> wrote:
> > But but but, how can that happen? Should it not all be under the same
> > rq->lock and thus have only a single update_rq_clock() and thus be at
> > the same 'instant' ?
>
> idle_balance() unlock rq->lock before calling update_blocked_averages
> And update_blocked_averages() starts by calling update_rq_clock()
Ah indeed. Might want to clarify that point.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [tip:sched/core] sched/pelt: Fix false running accounting
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
2017-07-04 7:27 ` Peter Zijlstra
@ 2017-08-10 12:07 ` tip-bot for Vincent Guittot
1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Vincent Guittot @ 2017-08-10 12:07 UTC (permalink / raw)
To: linux-tip-commits
Cc: peterz, tglx, mingo, torvalds, hpa, vincent.guittot, linux-kernel
Commit-ID: f235a54f00449c611f85173fe8a66c4d189c5ce1
Gitweb: http://git.kernel.org/tip/f235a54f00449c611f85173fe8a66c4d189c5ce1
Author: Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Sat, 1 Jul 2017 07:06:13 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 10 Aug 2017 12:18:15 +0200
sched/pelt: Fix false running accounting
The running state is a subset of runnable state which means that running
can't be set if runnable (weight) is cleared. There are corner cases
where the current sched_entity has been already dequeued but cfs_rq->curr
has not been updated yet and still points to the dequeued sched_entity.
If ___update_load_avg() is called at that time, weight will be 0 and running
will be set which is not possible.
This case happens during pick_next_task_fair() when a cfs_rq becomes idles.
The current sched_entity has been dequeued so se->on_rq is cleared and
cfs_rq->weight is null. But cfs_rq->curr still points to se (it will be
cleared when picking the idle thread). Because the cfs_rq becomes idle,
idle_balance() is called and ends up to call update_blocked_averages()
with these wrong running and runnable states.
Add a test in ___update_load_avg() to correct the running state in this case.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Link: http://lkml.kernel.org/r/1498885573-18984-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/fair.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 75c58c7..ef5b66b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2991,6 +2991,18 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
sa->last_update_time += delta << 10;
/*
+ * running is a subset of runnable (weight) so running can't be set if
+ * runnable is clear. But there are some corner cases where the current
+ * se has been already dequeued but cfs_rq->curr still points to it.
+ * This means that weight will be 0 but not running for a sched_entity
+ * but also for a cfs_rq if the latter becomes idle. As an example,
+ * this happens during idle_balance() which calls
+ * update_blocked_averages()
+ */
+ if (!weight)
+ running = 0;
+
+ /*
* Now we know we crossed measurement unit boundaries. The *_avg
* accrues by two steps:
*
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-08-10 12:10 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-30 13:58 [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
2017-07-01 5:06 ` [PATCH v2] sched/pelt: fix false running accounting Vincent Guittot
2017-07-04 7:27 ` Peter Zijlstra
2017-07-04 7:37 ` Vincent Guittot
2017-07-04 8:34 ` Peter Zijlstra
2017-07-04 9:12 ` Vincent Guittot
2017-07-04 9:44 ` Peter Zijlstra
2017-07-04 9:57 ` Vincent Guittot
2017-07-04 11:07 ` Peter Zijlstra
2017-08-10 12:07 ` [tip:sched/core] sched/pelt: Fix " tip-bot for Vincent Guittot
2017-07-01 5:09 ` [PATCH] sched/pelt: fix false running accounting in PELT Vincent Guittot
2017-07-01 16:57 ` kbuild test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).