All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-14  9:02 ` Kuyo Chang
  0 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-14  9:02 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger
  Cc: wsd_upstream, kuyo chang, Ingo Molnar, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: kuyo chang <kuyo.chang@mediatek.com>

I meet the warning message at cfs_rq_is_decayed at below code.

SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		    cfs_rq->avg.util_avg ||
		    cfs_rq->avg.runnable_avg)

Following is the calltrace.

Call trace:
__update_blocked_fair
update_blocked_averages
newidle_balance
pick_next_task_fair
__schedule
schedule
pipe_read
vfs_read
ksys_read

After code analyzing and some debug messages, I found it exits a corner
case at attach_entity_load_avg which will cause load_sum is null but
load_avg is not.
Consider se_weight is 88761 according by sched_prio_to_weight table.
And assume the get_pelt_divider() is 47742, se->avg.load_avg is 1.
By the calculating for se->avg.load_sum as following will become zero
as following.
se->avg.load_sum =
	div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
se->avg.load_sum = 1*47742/88761 = 0.

After enqueue_load_avg code as below.
cfs_rq->avg.load_avg += se->avg.load_avg;
cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Then the load_sum for cfs_rq will be 1 while the load_sum for cfs_rq is 0.
So it will hit the warning message.

In order to fix the corner case, make sure the se->load_avg|sum is correct
before enqueue_load_avg.

After long time testing, the kernel warning was gone and the system runs
as well as before.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
---

 v1->v2:
 (1)Thanks for suggestion from Peter Zijlstra & Vincent Guittot.
 (2)By suggestion from Vincent Guittot,
 rework the se->load_sum calculation method for fix the corner case,
 make sure the se->load_avg|sum is correct before enqueue_load_avg.
 (3)Rework changlog.

 v2->v3:
 (1)Rename Subject. 
 (1)Add fix tag.


 kernel/sched/fair.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299d67ab..159274482c4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum) {
 		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
+			div_u64(se->avg.load_sum, se_weight(se));
+	} else {
+		se->avg.load_sum = 1;
 	}
 
 	enqueue_load_avg(cfs_rq, se);
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-14  9:02 ` Kuyo Chang
  0 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-14  9:02 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger
  Cc: wsd_upstream, kuyo chang, Ingo Molnar, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: kuyo chang <kuyo.chang@mediatek.com>

I meet the warning message at cfs_rq_is_decayed at below code.

SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		    cfs_rq->avg.util_avg ||
		    cfs_rq->avg.runnable_avg)

Following is the calltrace.

Call trace:
__update_blocked_fair
update_blocked_averages
newidle_balance
pick_next_task_fair
__schedule
schedule
pipe_read
vfs_read
ksys_read

After code analyzing and some debug messages, I found it exits a corner
case at attach_entity_load_avg which will cause load_sum is null but
load_avg is not.
Consider se_weight is 88761 according by sched_prio_to_weight table.
And assume the get_pelt_divider() is 47742, se->avg.load_avg is 1.
By the calculating for se->avg.load_sum as following will become zero
as following.
se->avg.load_sum =
	div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
se->avg.load_sum = 1*47742/88761 = 0.

After enqueue_load_avg code as below.
cfs_rq->avg.load_avg += se->avg.load_avg;
cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Then the load_sum for cfs_rq will be 1 while the load_sum for cfs_rq is 0.
So it will hit the warning message.

In order to fix the corner case, make sure the se->load_avg|sum is correct
before enqueue_load_avg.

After long time testing, the kernel warning was gone and the system runs
as well as before.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
---

 v1->v2:
 (1)Thanks for suggestion from Peter Zijlstra & Vincent Guittot.
 (2)By suggestion from Vincent Guittot,
 rework the se->load_sum calculation method for fix the corner case,
 make sure the se->load_avg|sum is correct before enqueue_load_avg.
 (3)Rework changlog.

 v2->v3:
 (1)Rename Subject. 
 (1)Add fix tag.


 kernel/sched/fair.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299d67ab..159274482c4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum) {
 		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
+			div_u64(se->avg.load_sum, se_weight(se));
+	} else {
+		se->avg.load_sum = 1;
 	}
 
 	enqueue_load_avg(cfs_rq, se);
-- 
2.18.0


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-14  9:02 ` Kuyo Chang
  0 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-14  9:02 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger
  Cc: wsd_upstream, kuyo chang, Ingo Molnar, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: kuyo chang <kuyo.chang@mediatek.com>

I meet the warning message at cfs_rq_is_decayed at below code.

SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		    cfs_rq->avg.util_avg ||
		    cfs_rq->avg.runnable_avg)

Following is the calltrace.

Call trace:
__update_blocked_fair
update_blocked_averages
newidle_balance
pick_next_task_fair
__schedule
schedule
pipe_read
vfs_read
ksys_read

After code analyzing and some debug messages, I found it exits a corner
case at attach_entity_load_avg which will cause load_sum is null but
load_avg is not.
Consider se_weight is 88761 according by sched_prio_to_weight table.
And assume the get_pelt_divider() is 47742, se->avg.load_avg is 1.
By the calculating for se->avg.load_sum as following will become zero
as following.
se->avg.load_sum =
	div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
se->avg.load_sum = 1*47742/88761 = 0.

After enqueue_load_avg code as below.
cfs_rq->avg.load_avg += se->avg.load_avg;
cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Then the load_sum for cfs_rq will be 1 while the load_sum for cfs_rq is 0.
So it will hit the warning message.

In order to fix the corner case, make sure the se->load_avg|sum is correct
before enqueue_load_avg.

After long time testing, the kernel warning was gone and the system runs
as well as before.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
---

 v1->v2:
 (1)Thanks for suggestion from Peter Zijlstra & Vincent Guittot.
 (2)By suggestion from Vincent Guittot,
 rework the se->load_sum calculation method for fix the corner case,
 make sure the se->load_avg|sum is correct before enqueue_load_avg.
 (3)Rework changlog.

 v2->v3:
 (1)Rename Subject. 
 (1)Add fix tag.


 kernel/sched/fair.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299d67ab..159274482c4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum) {
 		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
+			div_u64(se->avg.load_sum, se_weight(se));
+	} else {
+		se->avg.load_sum = 1;
 	}
 
 	enqueue_load_avg(cfs_rq, se);
-- 
2.18.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
  2022-04-14  9:02 ` Kuyo Chang
  (?)
@ 2022-04-14 14:44   ` Peter Zijlstra
  -1 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-04-14 14:44 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek


I've taken the liberty of carrying over the tags from v2 and reworked
the Changelog a little.

---
Subject: sched/pelt: Fix attach_entity_load_avg() corner case
From: kuyo chang <kuyo.chang@mediatek.com>
Date: Thu, 14 Apr 2022 17:02:20 +0800

From: kuyo chang <kuyo.chang@mediatek.com>

The warning in cfs_rq_is_decayed() triggered:

    SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		  cfs_rq->avg.util_avg ||
		  cfs_rq->avg.runnable_avg)

There exists a corner case in attach_entity_load_avg() which will
cause load_sum to be zero while load_avg will not be.

Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
Further assume the get_pelt_divider() is 47742, this gives:
se->avg.load_avg is 1.

However, calculating load_sum results in 0:

  se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
  se->avg.load_sum = 1*47742/88761 = 0.

Then enqueue_load_avg() adds this to the cfs_rq totals:

  cfs_rq->avg.load_avg += se->avg.load_avg;
  cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Resulting in load_avg being 1 with load_sum is 0, which will trigger
the WARN.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
[peterz: massage changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
---
 kernel/sched/fair.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
-		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
-	}
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum)
+		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
+	else
+		se->avg.load_sum = 1;
 
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-14 14:44   ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-04-14 14:44 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek


I've taken the liberty of carrying over the tags from v2 and reworked
the Changelog a little.

---
Subject: sched/pelt: Fix attach_entity_load_avg() corner case
From: kuyo chang <kuyo.chang@mediatek.com>
Date: Thu, 14 Apr 2022 17:02:20 +0800

From: kuyo chang <kuyo.chang@mediatek.com>

The warning in cfs_rq_is_decayed() triggered:

    SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		  cfs_rq->avg.util_avg ||
		  cfs_rq->avg.runnable_avg)

There exists a corner case in attach_entity_load_avg() which will
cause load_sum to be zero while load_avg will not be.

Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
Further assume the get_pelt_divider() is 47742, this gives:
se->avg.load_avg is 1.

However, calculating load_sum results in 0:

  se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
  se->avg.load_sum = 1*47742/88761 = 0.

Then enqueue_load_avg() adds this to the cfs_rq totals:

  cfs_rq->avg.load_avg += se->avg.load_avg;
  cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Resulting in load_avg being 1 with load_sum is 0, which will trigger
the WARN.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
[peterz: massage changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
---
 kernel/sched/fair.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
-		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
-	}
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum)
+		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
+	else
+		se->avg.load_sum = 1;
 
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-14 14:44   ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-04-14 14:44 UTC (permalink / raw)
  To: Kuyo Chang
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek


I've taken the liberty of carrying over the tags from v2 and reworked
the Changelog a little.

---
Subject: sched/pelt: Fix attach_entity_load_avg() corner case
From: kuyo chang <kuyo.chang@mediatek.com>
Date: Thu, 14 Apr 2022 17:02:20 +0800

From: kuyo chang <kuyo.chang@mediatek.com>

The warning in cfs_rq_is_decayed() triggered:

    SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		  cfs_rq->avg.util_avg ||
		  cfs_rq->avg.runnable_avg)

There exists a corner case in attach_entity_load_avg() which will
cause load_sum to be zero while load_avg will not be.

Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
Further assume the get_pelt_divider() is 47742, this gives:
se->avg.load_avg is 1.

However, calculating load_sum results in 0:

  se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
  se->avg.load_sum = 1*47742/88761 = 0.

Then enqueue_load_avg() adds this to the cfs_rq totals:

  cfs_rq->avg.load_avg += se->avg.load_avg;
  cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Resulting in load_avg being 1 with load_sum is 0, which will trigger
the WARN.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
[peterz: massage changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
---
 kernel/sched/fair.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
-		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
-	}
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum)
+		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
+	else
+		se->avg.load_sum = 1;
 
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
  2022-04-14 14:44   ` Peter Zijlstra
  (?)
@ 2022-04-15  3:56     ` Kuyo Chang
  -1 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-15  3:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek

On Thu, 2022-04-14 at 16:44 +0200, Peter Zijlstra wrote:
> I've taken the liberty of carrying over the tags from v2 and reworked
> the Changelog a little.

Thank you for all your assistance.

> ---
> Subject: sched/pelt: Fix attach_entity_load_avg() corner case
> From: kuyo chang <kuyo.chang@mediatek.com>
> Date: Thu, 14 Apr 2022 17:02:20 +0800
> 
> From: kuyo chang <kuyo.chang@mediatek.com>
> 
> The warning in cfs_rq_is_decayed() triggered:
> 
>     SCHED_WARN_ON(cfs_rq->avg.load_avg ||
> 		  cfs_rq->avg.util_avg ||
> 		  cfs_rq->avg.runnable_avg)
> 
> There exists a corner case in attach_entity_load_avg() which will
> cause load_sum to be zero while load_avg will not be.
> 
> Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
> Further assume the get_pelt_divider() is 47742, this gives:
> se->avg.load_avg is 1.
> 
> However, calculating load_sum results in 0:
> 
>   se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
>   se->avg.load_sum = 1*47742/88761 = 0.
> 
> Then enqueue_load_avg() adds this to the cfs_rq totals:
> 
>   cfs_rq->avg.load_avg += se->avg.load_avg;
>   cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
> 
> Resulting in load_avg being 1 with load_sum is 0, which will trigger
> the WARN.
> 
> Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq
> and its se")
> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> [peterz: massage changelog]
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Link: 
> https://urldefense.com/v3/__https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com__;!!CTRNKA9wMg0ARbw!35Im02xxIuUZdLYpPng37Yk7oVNJVJ1tfbu4XRzlq-6VhH3K29Por0gJCFlslT_CMgA$
>  
> ---
>  kernel/sched/fair.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
>  
>  	se->avg.runnable_sum = se->avg.runnable_avg * divider;
>  
> -	se->avg.load_sum = divider;
> -	if (se_weight(se)) {
> -		se->avg.load_sum =
> -			div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
> -	}
> +	se->avg.load_sum = se->avg.load_avg * divider;
> +	if (se_weight(se) < se->avg.load_sum)
> +		se->avg.load_sum = div_u64(se->avg.load_sum,
> se_weight(se));
> +	else
> +		se->avg.load_sum = 1;
>  
>  	enqueue_load_avg(cfs_rq, se);
>  	cfs_rq->avg.util_avg += se->avg.util_avg;


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-15  3:56     ` Kuyo Chang
  0 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-15  3:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek

On Thu, 2022-04-14 at 16:44 +0200, Peter Zijlstra wrote:
> I've taken the liberty of carrying over the tags from v2 and reworked
> the Changelog a little.

Thank you for all your assistance.

> ---
> Subject: sched/pelt: Fix attach_entity_load_avg() corner case
> From: kuyo chang <kuyo.chang@mediatek.com>
> Date: Thu, 14 Apr 2022 17:02:20 +0800
> 
> From: kuyo chang <kuyo.chang@mediatek.com>
> 
> The warning in cfs_rq_is_decayed() triggered:
> 
>     SCHED_WARN_ON(cfs_rq->avg.load_avg ||
> 		  cfs_rq->avg.util_avg ||
> 		  cfs_rq->avg.runnable_avg)
> 
> There exists a corner case in attach_entity_load_avg() which will
> cause load_sum to be zero while load_avg will not be.
> 
> Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
> Further assume the get_pelt_divider() is 47742, this gives:
> se->avg.load_avg is 1.
> 
> However, calculating load_sum results in 0:
> 
>   se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
>   se->avg.load_sum = 1*47742/88761 = 0.
> 
> Then enqueue_load_avg() adds this to the cfs_rq totals:
> 
>   cfs_rq->avg.load_avg += se->avg.load_avg;
>   cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
> 
> Resulting in load_avg being 1 with load_sum is 0, which will trigger
> the WARN.
> 
> Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq
> and its se")
> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> [peterz: massage changelog]
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Link: 
> https://urldefense.com/v3/__https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com__;!!CTRNKA9wMg0ARbw!35Im02xxIuUZdLYpPng37Yk7oVNJVJ1tfbu4XRzlq-6VhH3K29Por0gJCFlslT_CMgA$
>  
> ---
>  kernel/sched/fair.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
>  
>  	se->avg.runnable_sum = se->avg.runnable_avg * divider;
>  
> -	se->avg.load_sum = divider;
> -	if (se_weight(se)) {
> -		se->avg.load_sum =
> -			div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
> -	}
> +	se->avg.load_sum = se->avg.load_avg * divider;
> +	if (se_weight(se) < se->avg.load_sum)
> +		se->avg.load_sum = div_u64(se->avg.load_sum,
> se_weight(se));
> +	else
> +		se->avg.load_sum = 1;
>  
>  	enqueue_load_avg(cfs_rq, se);
>  	cfs_rq->avg.util_avg += se->avg.util_avg;


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method
@ 2022-04-15  3:56     ` Kuyo Chang
  0 siblings, 0 replies; 10+ messages in thread
From: Kuyo Chang @ 2022-04-15  3:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger, wsd_upstream,
	Ingo Molnar, linux-kernel, linux-arm-kernel, linux-mediatek

On Thu, 2022-04-14 at 16:44 +0200, Peter Zijlstra wrote:
> I've taken the liberty of carrying over the tags from v2 and reworked
> the Changelog a little.

Thank you for all your assistance.

> ---
> Subject: sched/pelt: Fix attach_entity_load_avg() corner case
> From: kuyo chang <kuyo.chang@mediatek.com>
> Date: Thu, 14 Apr 2022 17:02:20 +0800
> 
> From: kuyo chang <kuyo.chang@mediatek.com>
> 
> The warning in cfs_rq_is_decayed() triggered:
> 
>     SCHED_WARN_ON(cfs_rq->avg.load_avg ||
> 		  cfs_rq->avg.util_avg ||
> 		  cfs_rq->avg.runnable_avg)
> 
> There exists a corner case in attach_entity_load_avg() which will
> cause load_sum to be zero while load_avg will not be.
> 
> Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
> Further assume the get_pelt_divider() is 47742, this gives:
> se->avg.load_avg is 1.
> 
> However, calculating load_sum results in 0:
> 
>   se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
>   se->avg.load_sum = 1*47742/88761 = 0.
> 
> Then enqueue_load_avg() adds this to the cfs_rq totals:
> 
>   cfs_rq->avg.load_avg += se->avg.load_avg;
>   cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
> 
> Resulting in load_avg being 1 with load_sum is 0, which will trigger
> the WARN.
> 
> Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq
> and its se")
> Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
> [peterz: massage changelog]
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Link: 
> https://urldefense.com/v3/__https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com__;!!CTRNKA9wMg0ARbw!35Im02xxIuUZdLYpPng37Yk7oVNJVJ1tfbu4XRzlq-6VhH3K29Por0gJCFlslT_CMgA$
>  
> ---
>  kernel/sched/fair.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
>  
>  	se->avg.runnable_sum = se->avg.runnable_avg * divider;
>  
> -	se->avg.load_sum = divider;
> -	if (se_weight(se)) {
> -		se->avg.load_sum =
> -			div_u64(se->avg.load_avg * se->avg.load_sum,
> se_weight(se));
> -	}
> +	se->avg.load_sum = se->avg.load_avg * divider;
> +	if (se_weight(se) < se->avg.load_sum)
> +		se->avg.load_sum = div_u64(se->avg.load_sum,
> se_weight(se));
> +	else
> +		se->avg.load_sum = 1;
>  
>  	enqueue_load_avg(cfs_rq, se);
>  	cfs_rq->avg.util_avg += se->avg.util_avg;


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [tip: sched/urgent] sched/pelt: Fix attach_entity_load_avg() corner case
  2022-04-14  9:02 ` Kuyo Chang
                   ` (2 preceding siblings ...)
  (?)
@ 2022-04-19 19:35 ` tip-bot2 for kuyo chang
  -1 siblings, 0 replies; 10+ messages in thread
From: tip-bot2 for kuyo chang @ 2022-04-19 19:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kuyo chang, Peter Zijlstra (Intel),
	Vincent Guittot, Dietmar Eggemann, x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     40f5aa4c5eaebfeaca4566217cb9c468e28ed682
Gitweb:        https://git.kernel.org/tip/40f5aa4c5eaebfeaca4566217cb9c468e28ed682
Author:        kuyo chang <kuyo.chang@mediatek.com>
AuthorDate:    Thu, 14 Apr 2022 17:02:20 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 19 Apr 2022 21:15:41 +02:00

sched/pelt: Fix attach_entity_load_avg() corner case

The warning in cfs_rq_is_decayed() triggered:

    SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		  cfs_rq->avg.util_avg ||
		  cfs_rq->avg.runnable_avg)

There exists a corner case in attach_entity_load_avg() which will
cause load_sum to be zero while load_avg will not be.

Consider se_weight is 88761 as per the sched_prio_to_weight[] table.
Further assume the get_pelt_divider() is 47742, this gives:
se->avg.load_avg is 1.

However, calculating load_sum:

  se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
  se->avg.load_sum = 1*47742/88761 = 0.

Then enqueue_load_avg() adds this to the cfs_rq totals:

  cfs_rq->avg.load_avg += se->avg.load_avg;
  cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Resulting in load_avg being 1 with load_sum is 0, which will trigger
the WARN.

Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
[peterz: massage changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
---
 kernel/sched/fair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299..a68482d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
-		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
-	}
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum)
+		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
+	else
+		se->avg.load_sum = 1;
 
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-19 19:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-14  9:02 [PATCH 1/1] [PATCH v3]sched/pelt: Fix the attach_entity_load_avg calculate method Kuyo Chang
2022-04-14  9:02 ` Kuyo Chang
2022-04-14  9:02 ` Kuyo Chang
2022-04-14 14:44 ` Peter Zijlstra
2022-04-14 14:44   ` Peter Zijlstra
2022-04-14 14:44   ` Peter Zijlstra
2022-04-15  3:56   ` Kuyo Chang
2022-04-15  3:56     ` Kuyo Chang
2022-04-15  3:56     ` Kuyo Chang
2022-04-19 19:35 ` [tip: sched/urgent] sched/pelt: Fix attach_entity_load_avg() corner case tip-bot2 for kuyo chang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.