* [PATCH v2 0/3] sched/deadline: Fix double accounting in push_dl_task() & some cleanups
From: Dietmar Eggemann @ 2019-08-02 14:59 UTC
To: Peter Zijlstra, Ingo Molnar, Juri Lelli
Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
Qais Yousef, linux-kernel
While running a simple DL workload (1 DL (12000/100000/100000) task
per CPU) on Arm64 & x86 systems I noticed that some of the
SCHED_WARN_ON() in the rq/running bandwidth (bw) functions trigger.
Patch 1/3 contains a proposal to fix this.
Patches 2/3 and 3/3 contain smaller cleanups I found while
debugging the actual issue.
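As a rough aid for reading the workload numbers above, the sketch below
computes the bandwidth of one (12000/100000/100000) task. It assumes the
triple means runtime/deadline/period and that bandwidth is kept as a
fixed-point ratio with a 20-bit shift, in the style of the kernel's
to_ratio(). dl_bw_ratio() and dl_total_bw() are hypothetical names used
only for illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 20-bit fixed-point shift for deadline bandwidth values. */
#define BW_SHIFT 20

/*
 * Hypothetical helper modeled on to_ratio(): returns runtime/period
 * scaled by 2^BW_SHIFT. The time unit cancels out of the ratio.
 */
static uint64_t dl_bw_ratio(uint64_t period, uint64_t runtime)
{
	assert(period != 0);
	return (runtime << BW_SHIFT) / period;
}

/* Total bandwidth of n identical tasks accounted on one runqueue. */
static uint64_t dl_total_bw(unsigned int n, uint64_t period, uint64_t runtime)
{
	return (uint64_t)n * dl_bw_ratio(period, runtime);
}
```

Under these assumptions a single task contributes 125829 (about 0.12 of
one CPU), so an rq whose running_bw or this_bw drifts away from a
multiple of that value has been mis-accounted.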
Changes v1->v2:
- Remove rq/running bw accounting in pull_dl_task() as well [1/3]
- Remove v1's "sched/deadline: Remove unused int flags from
__dequeue_task_dl()"
- Change return to BUG_ON() in dequeue in !on_dl_rq case [3/3]
- Remove v1's "sched/deadline: Use return value of SCHED_WARN_ON()
in bw accounting"
Dietmar Eggemann (3):
sched/deadline: Fix double accounting of rq/running bw in push & pull
sched/deadline: Use __sub_running_bw() throughout
dl_change_utilization()
sched/deadline: Cleanup on_dl_rq() handling
kernel/sched/deadline.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
--
2.17.1
* [PATCH v2 1/3] sched/deadline: Fix double accounting of rq/running bw in push & pull
From: Dietmar Eggemann @ 2019-08-02 14:59 UTC
To: Peter Zijlstra, Ingo Molnar, Juri Lelli
Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
Qais Yousef, linux-kernel
[push/pull]_dl_task() always call deactivate_task() with flags=0 which
sets p->on_rq=TASK_ON_RQ_MIGRATING.
[push/pull]_dl_task()->deactivate_task()->dequeue_task()->
dequeue_task_dl() calls sub_[running/rq]_bw() since
p->on_rq=TASK_ON_RQ_MIGRATING.
So sub_[running/rq]_bw() in [push/pull]_dl_task() is double-accounting
for that task.
The same goes for add_[rq/running]_bw() and activate_task().
[push/pull]_dl_task()->activate_task()->enqueue_task()->
enqueue_task_dl() calls add_[rq/running]_bw() again since p->on_rq is
still set to TASK_ON_RQ_MIGRATING.
So the add_[rq/running]_bw() in [push/pull]_dl_task() is double-accounting
for that task.
Fix it by removing rq/running bw accounting in [push/pull]_dl_task().
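The call chains above can be modeled in user space. This is a minimal
sketch, not kernel code: every model_* name is hypothetical, rq_bw and
running_bw are updated together for brevity, and the warnings counter
stands in for SCHED_WARN_ON(). The extra_accounting flag switches
between the pre-fix behavior (explicit accounting in push) and the
fixed behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Values chosen for this model only; the names mirror the kernel's. */
enum { TASK_ON_RQ_QUEUED = 1, TASK_ON_RQ_MIGRATING = 2 };

struct model_rq {
	uint64_t this_bw;	/* bandwidth of tasks attached to the rq */
	uint64_t running_bw;	/* bandwidth of active tasks */
	int warnings;		/* stand-in for SCHED_WARN_ON() hits */
};

struct model_task {
	uint64_t dl_bw;
	int on_rq;
};

/* Subtract bw from both counters; warn and clamp to 0 on underflow. */
static void model_sub_bw(struct model_rq *rq, uint64_t bw)
{
	if (bw > rq->running_bw || bw > rq->this_bw) {
		rq->warnings++;
		rq->running_bw = 0;
		rq->this_bw = 0;
		return;
	}
	rq->running_bw -= bw;
	rq->this_bw -= bw;
}

static void model_add_bw(struct model_rq *rq, uint64_t bw)
{
	rq->this_bw += bw;
	rq->running_bw += bw;
}

/* flags == 0 case: mark the task as migrating, then dequeue it. */
static void model_deactivate(struct model_rq *rq, struct model_task *p)
{
	p->on_rq = TASK_ON_RQ_MIGRATING;
	if (p->on_rq == TASK_ON_RQ_MIGRATING)	/* dequeue path */
		model_sub_bw(rq, p->dl_bw);
}

/* Enqueue while still migrating, then mark the task as queued. */
static void model_activate(struct model_rq *rq, struct model_task *p)
{
	if (p->on_rq == TASK_ON_RQ_MIGRATING)	/* enqueue path */
		model_add_bw(rq, p->dl_bw);
	p->on_rq = TASK_ON_RQ_QUEUED;
}

static void model_push(struct model_rq *src, struct model_rq *dst,
		       struct model_task *p, int extra_accounting)
{
	assert(p->on_rq == TASK_ON_RQ_QUEUED);

	model_deactivate(src, p);
	if (extra_accounting) {
		model_sub_bw(src, p->dl_bw);	/* second subtraction */
		model_add_bw(dst, p->dl_bw);	/* first addition */
	}
	model_activate(dst, p);			/* adds again if migrating */
}
```

With extra_accounting set, the source rq underflows and the destination
rq ends up with twice the task's bandwidth; with it cleared, the
bandwidth moves over exactly once.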
Trace (CONFIG_SCHED_DEBUG=y) before the fix on a 6-CPU system with 6
DL (12000/100000/100000) tasks showing the issue:
[ 48.147868] dl_rq->running_bw > old
[ 48.147886] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:98
...
[ 48.274832] inactive_task_timer+0x468/0x4e8
[ 48.279057] __hrtimer_run_queues+0x10c/0x3b8
[ 48.283364] hrtimer_interrupt+0xd4/0x250
[ 48.287330] tick_handle_oneshot_broadcast+0x198/0x1d0
...
[ 48.360057] dl_rq->running_bw > dl_rq->this_bw
[ 48.360065] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:86
...
[ 48.488294] task_contending+0x1a0/0x208
[ 48.492172] enqueue_task_dl+0x3b8/0x970
[ 48.496050] activate_task+0x70/0xd0
[ 48.499584] ttwu_do_activate+0x50/0x78
[ 48.503375] try_to_wake_up+0x270/0x7a0
[ 48.507167] wake_up_process+0x14/0x20
[ 48.510873] hrtimer_wakeup+0x1c/0x30
...
[ 50.062867] dl_rq->this_bw > old
[ 50.062885] WARNING: CPU: 1 PID: 2048 at kernel/sched/deadline.c:122
...
[ 50.190520] dequeue_task_dl+0x1e4/0x1f8
[ 50.194400] __sched_setscheduler+0x1d0/0x860
[ 50.198707] _sched_setscheduler+0x74/0x98
[ 50.202757] do_sched_setscheduler+0xa8/0x110
[ 50.207065] __arm64_sys_sched_setscheduler+0x1c/0x30
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Fixes: 7dd778841164 ("sched/core: Unify p->on_rq updates")
---
kernel/sched/deadline.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 039dde2b1dac..6dafaabde1d6 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
}
deactivate_task(rq, next_task, 0);
- sub_running_bw(&next_task->dl, &rq->dl);
- sub_rq_bw(&next_task->dl, &rq->dl);
set_task_cpu(next_task, later_rq->cpu);
- add_rq_bw(&next_task->dl, &later_rq->dl);
/*
* Update the later_rq clock here, because the clock is used
* by the cpufreq_update_util() inside __add_running_bw().
*/
update_rq_clock(later_rq);
- add_running_bw(&next_task->dl, &later_rq->dl);
activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
ret = 1;
@@ -2219,11 +2215,7 @@ static void pull_dl_task(struct rq *this_rq)
resched = true;
deactivate_task(src_rq, p, 0);
- sub_running_bw(&p->dl, &src_rq->dl);
- sub_rq_bw(&p->dl, &src_rq->dl);
set_task_cpu(p, this_cpu);
- add_rq_bw(&p->dl, &this_rq->dl);
- add_running_bw(&p->dl, &this_rq->dl);
activate_task(this_rq, p, 0);
dmin = p->dl.deadline;
--
2.17.1
* [PATCH v2 2/3] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization()
From: Dietmar Eggemann @ 2019-08-02 14:59 UTC
To: Peter Zijlstra, Ingo Molnar, Juri Lelli
Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
Qais Yousef, linux-kernel
dl_change_utilization() has a BUG_ON() to check that no schedutil
kthread (sugov) is entering this function. So instead of calling
sub_running_bw() which checks for the special entity related to a
sugov thread, call the underlying function __sub_running_bw().
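The relation between the wrapper and the underlying function can be
sketched as follows. This is a user-space model under stated
assumptions: the wrapper is taken to skip accounting for "special"
(sugov) entities, and all model_* names and MODEL_FLAG_SUGOV are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag marking a schedutil (sugov) kthread entity. */
#define MODEL_FLAG_SUGOV 0x1u

struct model_dl_se {
	uint64_t dl_bw;
	unsigned int flags;
};

struct model_dl_rq {
	uint64_t running_bw;
};

static int model_entity_is_special(const struct model_dl_se *se)
{
	return (se->flags & MODEL_FLAG_SUGOV) != 0;
}

/* Underlying function: takes the raw bandwidth, no entity check. */
static void model_raw_sub_running_bw(uint64_t dl_bw, struct model_dl_rq *rq)
{
	assert(dl_bw <= rq->running_bw);
	rq->running_bw -= dl_bw;
}

/* Wrapper: filters out special entities, then delegates. */
static void model_sub_running_bw(const struct model_dl_se *se,
				 struct model_dl_rq *rq)
{
	if (!model_entity_is_special(se))
		model_raw_sub_running_bw(se->dl_bw, rq);
}
```

When the caller has already excluded special entities, as the BUG_ON()
in dl_change_utilization() does, the wrapper's check is redundant and
both calls have the same effect.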
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
kernel/sched/deadline.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 6dafaabde1d6..c34e35e7ac23 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -164,7 +164,7 @@ void dl_change_utilization(struct task_struct *p, u64 new_bw)
rq = task_rq(p);
if (p->dl.dl_non_contending) {
- sub_running_bw(&p->dl, &rq->dl);
+ __sub_running_bw(p->dl.dl_bw, &rq->dl);
p->dl.dl_non_contending = 0;
/*
* If the timer handler is currently running and the
--
2.17.1
* [PATCH v2 3/3] sched/deadline: Cleanup on_dl_rq() handling
From: Dietmar Eggemann @ 2019-08-02 14:59 UTC
To: Peter Zijlstra, Ingo Molnar, Juri Lelli
Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
Qais Yousef, linux-kernel
Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
enqueue_dl_entity().
Move the check that the dl_se is not on the dl_rq from
__dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
side and use the on_dl_rq() helper function.
Use BUG_ON() in dequeue_dl_entity() instead of silently returning. Make
this possible by checking for !p->dl.dl_throttled in dequeue_task_dl()
before calling __dequeue_task_dl(). update_curr_dl() sets
p->dl.dl_throttled=1 when it has already called __dequeue_task_dl(), so
the condition !p->dl.dl_throttled && !on_dl_rq() is a bug.
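The invariant described above can be illustrated with a small
user-space model. It is a sketch only: the model_* names are
hypothetical, assert() stands in for BUG_ON(), and the queued flag
stands in for on_dl_rq().

```c
#include <assert.h>
#include <stdbool.h>

struct model_se {
	bool queued;	/* stand-in for on_dl_rq() */
	bool throttled;	/* stand-in for dl_throttled */
};

/* Dequeueing an entity that is not queued is treated as a bug. */
static void model_dequeue_entity(struct model_se *se)
{
	assert(se->queued);
	se->queued = false;
}

/* Budget exhausted: mark the entity throttled and take it off the rq. */
static void model_throttle(struct model_se *se)
{
	se->throttled = true;
	model_dequeue_entity(se);
}

/* Task-level dequeue: skip entities the throttle path already removed. */
static void model_dequeue_task(struct model_se *se)
{
	if (!se->throttled)
		model_dequeue_entity(se);
}
```

A throttled entity passes through model_dequeue_task() untouched, while
an entity that is neither throttled nor queued trips the assertion,
which is the case the patch turns into a BUG_ON().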
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
kernel/sched/deadline.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index c34e35e7ac23..2add54c8be8a 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
struct sched_dl_entity *entry;
int leftmost = 1;
- BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
-
while (*link) {
parent = *link;
entry = rb_entry(parent, struct sched_dl_entity, rb_node);
@@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
{
struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
- if (RB_EMPTY_NODE(&dl_se->rb_node))
- return;
-
rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
RB_CLEAR_NODE(&dl_se->rb_node);
@@ -1466,6 +1461,8 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
{
+ BUG_ON(!on_dl_rq(dl_se));
+
__dequeue_dl_entity(dl_se);
}
@@ -1544,7 +1541,9 @@ static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
{
update_curr_dl(rq);
- __dequeue_task_dl(rq, p, flags);
+
+ if (!p->dl.dl_throttled)
+ __dequeue_task_dl(rq, p, flags);
if (p->on_rq == TASK_ON_RQ_MIGRATING || flags & DEQUEUE_SAVE) {
sub_running_bw(&p->dl, &rq->dl);
--
2.17.1
* [tip:sched/urgent] sched/deadline: Fix double accounting of rq/running bw in push & pull
From: tip-bot for Dietmar Eggemann @ 2019-08-06 11:19 UTC
To: linux-tip-commits
Cc: hpa, luca.abeni, valentin.schneider, tglx, dietmar.eggemann,
qais.yousef, juri.lelli, bristot, linux-kernel, peterz, mingo
Commit-ID: f4904815f97a934258445a8f763f6b6c48f007e7
Gitweb: https://git.kernel.org/tip/f4904815f97a934258445a8f763f6b6c48f007e7
Author: Dietmar Eggemann <dietmar.eggemann@arm.com>
AuthorDate: Fri, 2 Aug 2019 15:59:43 +0100
Committer: Peter Zijlstra <peterz@infradead.org>
CommitDate: Tue, 6 Aug 2019 12:49:18 +0200
sched/deadline: Fix double accounting of rq/running bw in push & pull
{push,pull}_dl_task() always calls {de,}activate_task() with .flags=0
which sets p->on_rq=TASK_ON_RQ_MIGRATING.
{push,pull}_dl_task()->{de,}activate_task()->{de,en}queue_task()->
{de,en}queue_task_dl() calls {sub,add}_{running,rq}_bw() since
p->on_rq==TASK_ON_RQ_MIGRATING.
So {sub,add}_{running,rq}_bw() in {push,pull}_dl_task() is
double-accounting for that task.
Fix it by removing rq/running bw accounting in [push/pull]_dl_task().
Fixes: 7dd778841164 ("sched/core: Unify p->on_rq updates")
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Qais Yousef <qais.yousef@arm.com>
Link: https://lkml.kernel.org/r/20190802145945.18702-2-dietmar.eggemann@arm.com
---
kernel/sched/deadline.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ef5b9f6b1d42..46122edd8552 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2088,17 +2088,13 @@ retry:
}
deactivate_task(rq, next_task, 0);
- sub_running_bw(&next_task->dl, &rq->dl);
- sub_rq_bw(&next_task->dl, &rq->dl);
set_task_cpu(next_task, later_rq->cpu);
- add_rq_bw(&next_task->dl, &later_rq->dl);
/*
* Update the later_rq clock here, because the clock is used
* by the cpufreq_update_util() inside __add_running_bw().
*/
update_rq_clock(later_rq);
- add_running_bw(&next_task->dl, &later_rq->dl);
activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
ret = 1;
@@ -2186,11 +2182,7 @@ static void pull_dl_task(struct rq *this_rq)
resched = true;
deactivate_task(src_rq, p, 0);
- sub_running_bw(&p->dl, &src_rq->dl);
- sub_rq_bw(&p->dl, &src_rq->dl);
set_task_cpu(p, this_cpu);
- add_rq_bw(&p->dl, &this_rq->dl);
- add_running_bw(&p->dl, &this_rq->dl);
activate_task(this_rq, p, 0);
dmin = p->dl.deadline;