LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3] sched/pelt: fix update_blocked_averages() for dl and rt
@ 2018-08-31 15:22 Vincent Guittot
  2018-09-03  4:36 ` Dietmar Eggemann
  2018-09-10 10:06 ` [tip:sched/core] sched/pelt: Fix update_blocked_averages() for RT and DL classes tip-bot for Vincent Guittot
  0 siblings, 2 replies; 3+ messages in thread
From: Vincent Guittot @ 2018-08-31 15:22 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel; +Cc: Vincent Guittot

update_blocked_averages() is called to periodiccally decay the stalled load
of idle CPUs and to sync all loads before running load balance.

When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
in order to potentially pull tasks and to use this newly idle CPU. This
load balance happens whereas prev task from another class has not been put
and its utilization updated yet. This may lead to wrongly account running
time as idle time for rt or dl classes.

Test that no rt or dl task is running when updating their utilization in
update_blocked_averages().

We still update rt and dl utilization instead of simply skipping them to
make sure that all metrics are synced when used during load balance.

Fixes: 371bf4273269 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e16340 ("sched/dl: Add dl_rq utilization tracking")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---

-V3
  - move rq->curr dereference under the rq->lock

-V2 
  - Add missing fixes tags
  - apply fix to other version of update_blocked_averages

 kernel/sched/fair.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 309c93f..53bbcd4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7262,6 +7262,7 @@ static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq, *pos;
+	const struct sched_class *curr_class;
 	struct rq_flags rf;
 	bool done = true;
 
@@ -7298,8 +7299,10 @@ static void update_blocked_averages(int cpu)
 		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
-	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
-	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
+
+	curr_class = rq->curr->sched_class;
+	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
+	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
 	update_irq_load_avg(rq, 0);
 	/* Don't need periodic decay once load/util_avg are null */
 	if (others_have_blocked(rq))
@@ -7364,13 +7367,16 @@ static inline void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq = &rq->cfs;
+	const struct sched_class *curr_class;
 	struct rq_flags rf;
 
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
-	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
-	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
+
+	curr_class = rq->curr->sched_class;
+	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
+	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
 	update_irq_load_avg(rq, 0);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3] sched/pelt: fix update_blocked_averages() for dl and rt
  2018-08-31 15:22 [PATCH v3] sched/pelt: fix update_blocked_averages() for dl and rt Vincent Guittot
@ 2018-09-03  4:36 ` Dietmar Eggemann
  2018-09-10 10:06 ` [tip:sched/core] sched/pelt: Fix update_blocked_averages() for RT and DL classes tip-bot for Vincent Guittot
  1 sibling, 0 replies; 3+ messages in thread
From: Dietmar Eggemann @ 2018-09-03  4:36 UTC (permalink / raw)
  To: Vincent Guittot, peterz, mingo, linux-kernel

On 08/31/2018 08:22 AM, Vincent Guittot wrote:
> update_blocked_averages() is called to periodiccally decay the stalled load
> of idle CPUs and to sync all loads before running load balance.
> 
> When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
> in order to potentially pull tasks and to use this newly idle CPU. This
> load balance happens whereas prev task from another class has not been put
> and its utilization updated yet. This may lead to wrongly account running
> time as idle time for rt or dl classes.
> 
> Test that no rt or dl task is running when updating their utilization in
> update_blocked_averages().

Shouldn't this be 's/that no/if an' ?

You still update the utilization of the rt or dl task if they are 
running (accrue + decay) instead of only decay. Similar to the 
'cfs_rq->curr != NULL' in __update_load_avg_cfs_rq().

> We still update rt and dl utilization instead of simply skipping them to
> make sure that all metrics are synced when used during load balance.

I assume this sentence is indirectly saying this.

> 
> Fixes: 371bf4273269 ("sched/rt: Add rt_rq utilization tracking")
> Fixes: 3727e0e16340 ("sched/dl: Add dl_rq utilization tracking")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> 
> -V3
>    - move rq->curr dereference under the rq->lock
> 
> -V2
>    - Add missing fixes tags
>    - apply fix to other version of update_blocked_averages
> 
>   kernel/sched/fair.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 309c93f..53bbcd4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7262,6 +7262,7 @@ static void update_blocked_averages(int cpu)
>   {
>   	struct rq *rq = cpu_rq(cpu);
>   	struct cfs_rq *cfs_rq, *pos;
> +	const struct sched_class *curr_class;
>   	struct rq_flags rf;
>   	bool done = true;
>   
> @@ -7298,8 +7299,10 @@ static void update_blocked_averages(int cpu)
>   		if (cfs_rq_has_blocked(cfs_rq))
>   			done = false;
>   	}
> -	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
> -	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
> +
> +	curr_class = rq->curr->sched_class;
> +	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
> +	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
>   	update_irq_load_avg(rq, 0);
>   	/* Don't need periodic decay once load/util_avg are null */
>   	if (others_have_blocked(rq))
> @@ -7364,13 +7367,16 @@ static inline void update_blocked_averages(int cpu)
>   {
>   	struct rq *rq = cpu_rq(cpu);
>   	struct cfs_rq *cfs_rq = &rq->cfs;
> +	const struct sched_class *curr_class;
>   	struct rq_flags rf;
>   
>   	rq_lock_irqsave(rq, &rf);
>   	update_rq_clock(rq);
>   	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
> -	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
> -	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
> +
> +	curr_class = rq->curr->sched_class;
> +	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
> +	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
>   	update_irq_load_avg(rq, 0);
>   #ifdef CONFIG_NO_HZ_COMMON
>   	rq->last_blocked_load_update_tick = jiffies;
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [tip:sched/core] sched/pelt: Fix update_blocked_averages() for RT and DL classes
  2018-08-31 15:22 [PATCH v3] sched/pelt: fix update_blocked_averages() for dl and rt Vincent Guittot
  2018-09-03  4:36 ` Dietmar Eggemann
@ 2018-09-10 10:06 ` tip-bot for Vincent Guittot
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot for Vincent Guittot @ 2018-09-10 10:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: vincent.guittot, hpa, torvalds, linux-kernel, peterz, mingo, tglx

Commit-ID:  12b04875d666e83d27511df25580de84505bc758
Gitweb:     https://git.kernel.org/tip/12b04875d666e83d27511df25580de84505bc758
Author:     Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate: Fri, 31 Aug 2018 17:22:55 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 10 Sep 2018 10:13:46 +0200

sched/pelt: Fix update_blocked_averages() for RT and DL classes

update_blocked_averages() is called to periodiccally decay the stalled load
of idle CPUs and to sync all loads before running load balance.

When cfs rq is idle, it trigs a load balance during pick_next_task_fair()
in order to potentially pull tasks and to use this newly idle CPU. This
load balance happens whereas prev task from another class has not been put
and its utilization updated yet. This may lead to wrongly account running
time as idle time for RT or DL classes.

Test that no RT or DL task is running when updating their utilization in
update_blocked_averages().

We still update RT and DL utilization instead of simply skipping them to
make sure that all metrics are synced when used during load balance.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 371bf4273269 ("sched/rt: Add rt_rq utilization tracking")
Fixes: 3727e0e16340 ("sched/dl: Add dl_rq utilization tracking")
Link: http://lkml.kernel.org/r/1535728975-22799-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b39fb596f6c1..8cff8d55ee95 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7263,6 +7263,7 @@ static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq, *pos;
+	const struct sched_class *curr_class;
 	struct rq_flags rf;
 	bool done = true;
 
@@ -7299,8 +7300,10 @@ static void update_blocked_averages(int cpu)
 		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
-	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
-	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
+
+	curr_class = rq->curr->sched_class;
+	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
+	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
 	update_irq_load_avg(rq, 0);
 	/* Don't need periodic decay once load/util_avg are null */
 	if (others_have_blocked(rq))
@@ -7365,13 +7368,16 @@ static inline void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq = &rq->cfs;
+	const struct sched_class *curr_class;
 	struct rq_flags rf;
 
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
-	update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
-	update_dl_rq_load_avg(rq_clock_task(rq), rq, 0);
+
+	curr_class = rq->curr->sched_class;
+	update_rt_rq_load_avg(rq_clock_task(rq), rq, curr_class == &rt_sched_class);
+	update_dl_rq_load_avg(rq_clock_task(rq), rq, curr_class == &dl_sched_class);
 	update_irq_load_avg(rq, 0);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, back to index

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-31 15:22 [PATCH v3] sched/pelt: fix update_blocked_averages() for dl and rt Vincent Guittot
2018-09-03  4:36 ` Dietmar Eggemann
2018-09-10 10:06 ` [tip:sched/core] sched/pelt: Fix update_blocked_averages() for RT and DL classes tip-bot for Vincent Guittot

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git