linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups
@ 2019-07-26  8:27 Dietmar Eggemann
  2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
                   ` (4 more replies)
  0 siblings, 5 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

While running a simple DL workload (one DL (12000, 100000, 100000) task
per CPU) on Arm64 & x86 systems, I noticed that some of the
SCHED_WARN_ON() checks in the rq/running bandwidth (bw) accounting
functions trigger. Patch 1/5 contains a proposal to fix this.

Patches 2-5/5 contain various smaller cleanups I discovered while
debugging the actual issue.

Dietmar Eggemann (5):
  sched/deadline: Fix double accounting of rq/running bw in
    push_dl_task()
  sched/deadline: Remove unused int flags from __dequeue_task_dl()
  sched/deadline: Use __sub_running_bw() throughout
    dl_change_utilization()
  sched/deadline: Cleanup on_dl_rq() handling
  sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting

 kernel/sched/deadline.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
@ 2019-07-26  8:27 ` Dietmar Eggemann
  2019-07-26 10:11   ` luca abeni
  2019-07-26 13:30   ` luca abeni
  2019-07-26  8:27 ` [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl() Dietmar Eggemann
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

push_dl_task() always calls deactivate_task() with flags=0 which sets
p->on_rq=TASK_ON_RQ_MIGRATING.
push_dl_task()->deactivate_task()->dequeue_task()->dequeue_task_dl()
calls sub_[running/rq]_bw() since p->on_rq=TASK_ON_RQ_MIGRATING.
So sub_[running/rq]_bw() in push_dl_task() is double-accounting for
that task.

The same is true for add_[rq/running]_bw() and activate_task() on the
destination (later) CPU.
push_dl_task()->activate_task()->enqueue_task()->enqueue_task_dl()
calls add_[rq/running]_bw() again since p->on_rq is still set to
TASK_ON_RQ_MIGRATING.
So the add_[rq/running]_bw() in enqueue_task_dl() is double-accounting
for that task.

Fix this by removing the rq/running bw accounting in push_dl_task().

Trace (CONFIG_SCHED_DEBUG=y) before the fix on a 6 CPUs system with 6
DL (12000, 100000, 100000) tasks showing the issue:

[   48.147868] dl_rq->running_bw > old
[   48.147886] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:98
...
[   48.274832]  inactive_task_timer+0x468/0x4e8
[   48.279057]  __hrtimer_run_queues+0x10c/0x3b8
[   48.283364]  hrtimer_interrupt+0xd4/0x250
[   48.287330]  tick_handle_oneshot_broadcast+0x198/0x1d0
...
[   48.360057] dl_rq->running_bw > dl_rq->this_bw
[   48.360065] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:86
...
[   48.488294]  task_contending+0x1a0/0x208
[   48.492172]  enqueue_task_dl+0x3b8/0x970
[   48.496050]  activate_task+0x70/0xd0
[   48.499584]  ttwu_do_activate+0x50/0x78
[   48.503375]  try_to_wake_up+0x270/0x7a0
[   48.507167]  wake_up_process+0x14/0x20
[   48.510873]  hrtimer_wakeup+0x1c/0x30
...
[   50.062867] dl_rq->this_bw > old
[   50.062885] WARNING: CPU: 1 PID: 2048 at kernel/sched/deadline.c:122
...
[   50.190520]  dequeue_task_dl+0x1e4/0x1f8
[   50.194400]  __sched_setscheduler+0x1d0/0x860
[   50.198707]  _sched_setscheduler+0x74/0x98
[   50.202757]  do_sched_setscheduler+0xa8/0x110
[   50.207065]  __arm64_sys_sched_setscheduler+0x1c/0x30

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index de2bd006fe93..d1aeada374e1 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
 	}
 
 	deactivate_task(rq, next_task, 0);
-	sub_running_bw(&next_task->dl, &rq->dl);
-	sub_rq_bw(&next_task->dl, &rq->dl);
 	set_task_cpu(next_task, later_rq->cpu);
-	add_rq_bw(&next_task->dl, &later_rq->dl);
 
 	/*
 	 * Update the later_rq clock here, because the clock is used
 	 * by the cpufreq_update_util() inside __add_running_bw().
 	 */
 	update_rq_clock(later_rq);
-	add_running_bw(&next_task->dl, &later_rq->dl);
 	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
 	ret = 1;
 
-- 
2.17.1



* [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl()
  2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
  2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
@ 2019-07-26  8:27 ` Dietmar Eggemann
  2019-07-29 16:35   ` Peter Zijlstra
  2019-07-26  8:27 ` [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization() Dietmar Eggemann
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

The int flags parameter is not used in __dequeue_task_dl(). Remove it.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d1aeada374e1..99d4c24a8637 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -637,7 +637,7 @@ static inline void deadline_queue_pull_task(struct rq *rq)
 #endif /* CONFIG_SMP */
 
 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags);
-static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags);
+static void __dequeue_task_dl(struct rq *rq, struct task_struct *p);
 static void check_preempt_curr_dl(struct rq *rq, struct task_struct *p, int flags);
 
 /*
@@ -1245,7 +1245,7 @@ static void update_curr_dl(struct rq *rq)
 		    (dl_se->flags & SCHED_FLAG_DL_OVERRUN))
 			dl_se->dl_overrun = 1;
 
-		__dequeue_task_dl(rq, curr, 0);
+		__dequeue_task_dl(rq, curr);
 		if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr)))
 			enqueue_task_dl(rq, curr, ENQUEUE_REPLENISH);
 
@@ -1535,7 +1535,7 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		enqueue_pushable_dl_task(rq, p);
 }
 
-static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
+static void __dequeue_task_dl(struct rq *rq, struct task_struct *p)
 {
 	dequeue_dl_entity(&p->dl);
 	dequeue_pushable_dl_task(rq, p);
@@ -1544,7 +1544,7 @@ static void __dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
 	update_curr_dl(rq);
-	__dequeue_task_dl(rq, p, flags);
+	__dequeue_task_dl(rq, p);
 
 	if (p->on_rq == TASK_ON_RQ_MIGRATING || flags & DEQUEUE_SAVE) {
 		sub_running_bw(&p->dl, &rq->dl);
-- 
2.17.1



* [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization()
  2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
  2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
  2019-07-26  8:27 ` [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl() Dietmar Eggemann
@ 2019-07-26  8:27 ` Dietmar Eggemann
  2019-07-29 16:47   ` Peter Zijlstra
  2019-07-26  8:27 ` [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling Dietmar Eggemann
  2019-07-26  8:27 ` [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting Dietmar Eggemann
  4 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

dl_change_utilization() has a BUG_ON() to check that no schedutil
kthread (sugov) ever enters this function. So instead of calling
sub_running_bw(), which checks for the special entity related to a
sugov thread, call the underlying function __sub_running_bw() directly.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99d4c24a8637..1fa005f79307 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -164,7 +164,7 @@ void dl_change_utilization(struct task_struct *p, u64 new_bw)
 
 	rq = task_rq(p);
 	if (p->dl.dl_non_contending) {
-		sub_running_bw(&p->dl, &rq->dl);
+		__sub_running_bw(p->dl.dl_bw, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*
 		 * If the timer handler is currently running and the
-- 
2.17.1



* [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
                   ` (2 preceding siblings ...)
  2019-07-26  8:27 ` [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization() Dietmar Eggemann
@ 2019-07-26  8:27 ` Dietmar Eggemann
  2019-07-26  8:37   ` Valentin Schneider
  2019-07-29 16:49   ` Peter Zijlstra
  2019-07-26  8:27 ` [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting Dietmar Eggemann
  4 siblings, 2 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
enqueue_dl_entity().

Move the check that the dl_se is not on the dl_rq from
__dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
side and use the on_dl_rq() helper function.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1fa005f79307..a9cb52ceb761 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
 	struct sched_dl_entity *entry;
 	int leftmost = 1;
 
-	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
-
 	while (*link) {
 		parent = *link;
 		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
@@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
 {
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 
-	if (RB_EMPTY_NODE(&dl_se->rb_node))
-		return;
-
 	rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
 	RB_CLEAR_NODE(&dl_se->rb_node);
 
@@ -1466,6 +1461,9 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
 
 static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
 {
+	if (!on_dl_rq(dl_se))
+		return;
+
 	__dequeue_dl_entity(dl_se);
 }
 
-- 
2.17.1



* [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
                   ` (3 preceding siblings ...)
  2019-07-26  8:27 ` [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling Dietmar Eggemann
@ 2019-07-26  8:27 ` Dietmar Eggemann
  2019-07-26 10:18   ` luca abeni
  4 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-26  8:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Valentin Schneider,
	Qais Yousef, linux-kernel

To decide whether to set the rq or running bw to 0 in the underflow
case, use the return value of SCHED_WARN_ON() rather than an extra if
condition.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 kernel/sched/deadline.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a9cb52ceb761..66c594b5507e 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -95,8 +95,7 @@ void __sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 
 	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
 	dl_rq->running_bw -= dl_bw;
-	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
-	if (dl_rq->running_bw > old)
+	if (SCHED_WARN_ON(dl_rq->running_bw > old)) /* underflow */
 		dl_rq->running_bw = 0;
 	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
 	cpufreq_update_util(rq_of_dl_rq(dl_rq), 0);
@@ -119,8 +118,7 @@ void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
 
 	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
 	dl_rq->this_bw -= dl_bw;
-	SCHED_WARN_ON(dl_rq->this_bw > old); /* underflow */
-	if (dl_rq->this_bw > old)
+	if (SCHED_WARN_ON(dl_rq->this_bw > old)) /* underflow */
 		dl_rq->this_bw = 0;
 	SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
 }
-- 
2.17.1



* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  8:27 ` [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling Dietmar Eggemann
@ 2019-07-26  8:37   ` Valentin Schneider
  2019-07-26  8:58     ` Qais Yousef
  2019-07-26  9:20     ` Juri Lelli
  2019-07-29 16:49   ` Peter Zijlstra
  1 sibling, 2 replies; 31+ messages in thread
From: Valentin Schneider @ 2019-07-26  8:37 UTC (permalink / raw)
  To: Dietmar Eggemann, Peter Zijlstra, Ingo Molnar, Juri Lelli
  Cc: Luca Abeni, Daniel Bristot de Oliveira, Qais Yousef, linux-kernel

On 26/07/2019 09:27, Dietmar Eggemann wrote:
> Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> enqueue_dl_entity().
> 
> Move the check that the dl_se is not on the dl_rq from
> __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> side and use the on_dl_rq() helper function.
> 
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  kernel/sched/deadline.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 1fa005f79307..a9cb52ceb761 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
>  	struct sched_dl_entity *entry;
>  	int leftmost = 1;
>  
> -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> -
>  	while (*link) {
>  		parent = *link;
>  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
>  {
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  
> -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> -		return;
> -

Any idea why a similar error leads to a BUG_ON() in the enqueue path but
only a silent return on the dequeue path? I would expect the handling to be
almost identical.
 


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  8:37   ` Valentin Schneider
@ 2019-07-26  8:58     ` Qais Yousef
  2019-07-26  9:20     ` Juri Lelli
  1 sibling, 0 replies; 31+ messages in thread
From: Qais Yousef @ 2019-07-26  8:58 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Dietmar Eggemann, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Luca Abeni, Daniel Bristot de Oliveira, linux-kernel

On 07/26/19 09:37, Valentin Schneider wrote:
> On 26/07/2019 09:27, Dietmar Eggemann wrote:
> > Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> > enqueue_dl_entity().
> > 
> > Move the check that the dl_se is not on the dl_rq from
> > __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> > side and use the on_dl_rq() helper function.
> > 
> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > ---
> >  kernel/sched/deadline.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 1fa005f79307..a9cb52ceb761 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
> >  	struct sched_dl_entity *entry;
> >  	int leftmost = 1;
> >  
> > -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> > -
> >  	while (*link) {
> >  		parent = *link;
> >  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> > @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
> >  {
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  
> > -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> > -		return;
> > -
> 
> Any idea why a similar error leads to a BUG_ON() in the enqueue path but
> only a silent return on the dequeue path? I would expect the handling to be
> almost identical.

nit: s/BUG_ON/WARN_ON/

--
Qais Yousef


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  8:37   ` Valentin Schneider
  2019-07-26  8:58     ` Qais Yousef
@ 2019-07-26  9:20     ` Juri Lelli
  2019-07-26  9:32       ` Valentin Schneider
  1 sibling, 1 reply; 31+ messages in thread
From: Juri Lelli @ 2019-07-26  9:20 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Dietmar Eggemann, Peter Zijlstra, Ingo Molnar, Luca Abeni,
	Daniel Bristot de Oliveira, Qais Yousef, linux-kernel

Hi,

On 26/07/19 09:37, Valentin Schneider wrote:
> On 26/07/2019 09:27, Dietmar Eggemann wrote:
> > Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> > enqueue_dl_entity().
> > 
> > Move the check that the dl_se is not on the dl_rq from
> > __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> > side and use the on_dl_rq() helper function.
> > 
> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > ---
> >  kernel/sched/deadline.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 1fa005f79307..a9cb52ceb761 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
> >  	struct sched_dl_entity *entry;
> >  	int leftmost = 1;
> >  
> > -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> > -
> >  	while (*link) {
> >  		parent = *link;
> >  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> > @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
> >  {
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  
> > -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> > -		return;
> > -
> 
> Any idea why a similar error leads to a BUG_ON() in the enqueue path but
> only a silent return on the dequeue path? I would expect the handling to be
> almost identical.
>  

Task could have already been dequeued by update_curr_dl()->throttle
called by dequeue_task_dl() before calling __dequeue_task_dl().

Thanks,

Juri


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  9:20     ` Juri Lelli
@ 2019-07-26  9:32       ` Valentin Schneider
  0 siblings, 0 replies; 31+ messages in thread
From: Valentin Schneider @ 2019-07-26  9:32 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Dietmar Eggemann, Peter Zijlstra, Ingo Molnar, Luca Abeni,
	Daniel Bristot de Oliveira, Qais Yousef, linux-kernel

On 26/07/2019 10:20, Juri Lelli wrote:
>> Any idea why a similar error leads to a BUG_ON() in the enqueue path but
>> only a silent return on the dequeue path? I would expect the handling to be
>> almost identical.
>>  
> 
> Task could have already been dequeued by update_curr_dl()->throttle
> called by dequeue_task_dl() before calling __dequeue_task_dl().
> 

Got it, thanks.

> Thanks,
> 
> Juri
> 


* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
@ 2019-07-26 10:11   ` luca abeni
  2019-07-29  8:59     ` Dietmar Eggemann
  2019-07-26 13:30   ` luca abeni
  1 sibling, 1 reply; 31+ messages in thread
From: luca abeni @ 2019-07-26 10:11 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

Hi Dietmar,

On Fri, 26 Jul 2019 09:27:52 +0100
Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

> push_dl_task() always calls deactivate_task() with flags=0 which sets
> p->on_rq=TASK_ON_RQ_MIGRATING.

Uhm... This is a recent change in the deactivate_task() behaviour,
right? Because I tested SCHED_DEADLINE a lot, but I've never seen this
issue :)


Anyway, looking at the current code the change looks OK. Thanks for
fixing this issue!


			Luca

> push_dl_task()->deactivate_task()->dequeue_task()->dequeue_task_dl()
> calls sub_[running/rq]_bw() since p->on_rq=TASK_ON_RQ_MIGRATING.
> So sub_[running/rq]_bw() in push_dl_task() is double-accounting for
> that task.
> 
> The same is true for add_[rq/running]_bw() and activate_task() on the
> destination (later) CPU.
> push_dl_task()->activate_task()->enqueue_task()->enqueue_task_dl()
> calls add_[rq/running]_bw() again since p->on_rq is still set to
> TASK_ON_RQ_MIGRATING.
> So the add_[rq/running]_bw() in enqueue_task_dl() is double-accounting
> for that task.
> 
> Fix this by removing the rq/running bw accounting in push_dl_task().
> 
> Trace (CONFIG_SCHED_DEBUG=y) before the fix on a 6 CPUs system with 6
> DL (12000, 100000, 100000) tasks showing the issue:
> 
> [   48.147868] dl_rq->running_bw > old
> [   48.147886] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:98
> ...
> [   48.274832]  inactive_task_timer+0x468/0x4e8
> [   48.279057]  __hrtimer_run_queues+0x10c/0x3b8
> [   48.283364]  hrtimer_interrupt+0xd4/0x250
> [   48.287330]  tick_handle_oneshot_broadcast+0x198/0x1d0
> ...
> [   48.360057] dl_rq->running_bw > dl_rq->this_bw
> [   48.360065] WARNING: CPU: 1 PID: 0 at kernel/sched/deadline.c:86
> ...
> [   48.488294]  task_contending+0x1a0/0x208
> [   48.492172]  enqueue_task_dl+0x3b8/0x970
> [   48.496050]  activate_task+0x70/0xd0
> [   48.499584]  ttwu_do_activate+0x50/0x78
> [   48.503375]  try_to_wake_up+0x270/0x7a0
> [   48.507167]  wake_up_process+0x14/0x20
> [   48.510873]  hrtimer_wakeup+0x1c/0x30
> ...
> [   50.062867] dl_rq->this_bw > old
> [   50.062885] WARNING: CPU: 1 PID: 2048 at
> kernel/sched/deadline.c:122 ...
> [   50.190520]  dequeue_task_dl+0x1e4/0x1f8
> [   50.194400]  __sched_setscheduler+0x1d0/0x860
> [   50.198707]  _sched_setscheduler+0x74/0x98
> [   50.202757]  do_sched_setscheduler+0xa8/0x110
> [   50.207065]  __arm64_sys_sched_setscheduler+0x1c/0x30
> 
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  kernel/sched/deadline.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index de2bd006fe93..d1aeada374e1 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
>  	}
>  
>  	deactivate_task(rq, next_task, 0);
> -	sub_running_bw(&next_task->dl, &rq->dl);
> -	sub_rq_bw(&next_task->dl, &rq->dl);
>  	set_task_cpu(next_task, later_rq->cpu);
> -	add_rq_bw(&next_task->dl, &later_rq->dl);
>  
>  	/*
>  	 * Update the later_rq clock here, because the clock is used
>  	 * by the cpufreq_update_util() inside __add_running_bw().
>  	 */
>  	update_rq_clock(later_rq);
> -	add_running_bw(&next_task->dl, &later_rq->dl);
>  	activate_task(later_rq, next_task, ENQUEUE_NOCLOCK);
>  	ret = 1;
>  



* Re: [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-26  8:27 ` [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting Dietmar Eggemann
@ 2019-07-26 10:18   ` luca abeni
  2019-07-29 16:54     ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: luca abeni @ 2019-07-26 10:18 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

Hi Dietmar,

On Fri, 26 Jul 2019 09:27:56 +0100
Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

> To decide whether to set the rq or running bw to 0 in the underflow
> case, use the return value of SCHED_WARN_ON() rather than an extra if
> condition.

I think I tried this at some point, but if I remember correctly this
solution does not work when SCHED_DEBUG is not enabled.



			Luca


> 
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  kernel/sched/deadline.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index a9cb52ceb761..66c594b5507e 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -95,8 +95,7 @@ void __sub_running_bw(u64 dl_bw, struct dl_rq
> *dl_rq) 
>  	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
>  	dl_rq->running_bw -= dl_bw;
> -	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> -	if (dl_rq->running_bw > old)
> +	if (SCHED_WARN_ON(dl_rq->running_bw > old)) /* underflow */
>  		dl_rq->running_bw = 0;
>  	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
>  	cpufreq_update_util(rq_of_dl_rq(dl_rq), 0);
> @@ -119,8 +118,7 @@ void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
>  
>  	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
>  	dl_rq->this_bw -= dl_bw;
> -	SCHED_WARN_ON(dl_rq->this_bw > old); /* underflow */
> -	if (dl_rq->this_bw > old)
> +	if (SCHED_WARN_ON(dl_rq->this_bw > old)) /* underflow */
>  		dl_rq->this_bw = 0;
>  	SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
>  }



* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
  2019-07-26 10:11   ` luca abeni
@ 2019-07-26 13:30   ` luca abeni
  2019-07-29  9:00     ` Dietmar Eggemann
  1 sibling, 1 reply; 31+ messages in thread
From: luca abeni @ 2019-07-26 13:30 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

Hi,

On Fri, 26 Jul 2019 09:27:52 +0100
Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
[...]
> @@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
>  	}
>  
>  	deactivate_task(rq, next_task, 0);
> -	sub_running_bw(&next_task->dl, &rq->dl);
> -	sub_rq_bw(&next_task->dl, &rq->dl);
>  	set_task_cpu(next_task, later_rq->cpu);
> -	add_rq_bw(&next_task->dl, &later_rq->dl);
>  
>  	/*
>  	 * Update the later_rq clock here, because the clock is used
>  	 * by the cpufreq_update_util() inside __add_running_bw().
>  	 */
>  	update_rq_clock(later_rq);
> -	add_running_bw(&next_task->dl, &later_rq->dl);

Looking at the code again and thinking a little bit more about this
issue, I suspect a similar change is needed in pull_dl_task() too, no?



			Luca


* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-26 10:11   ` luca abeni
@ 2019-07-29  8:59     ` Dietmar Eggemann
  2019-07-29 16:10       ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-29  8:59 UTC (permalink / raw)
  To: luca abeni
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On 7/26/19 11:11 AM, luca abeni wrote:
> Hi Dietmar,
> 
> On Fri, 26 Jul 2019 09:27:52 +0100
> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> 
>> push_dl_task() always calls deactivate_task() with flags=0 which sets
>> p->on_rq=TASK_ON_RQ_MIGRATING.
> 
> Uhm... This is a recent change in the deactivate_task() behaviour,
> right? Because I tested SCHED_DEADLINE a lot, but I've never seen this
> issue :)

Looks like it was v5.2 commit 7dd778841164 ("sched/core: Unify p->on_rq
updates").

> Anyway, looking at the current code the change looks OK. Thanks for
> fixing this issue!

[...]


* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-26 13:30   ` luca abeni
@ 2019-07-29  9:00     ` Dietmar Eggemann
  2019-07-31 10:32       ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-29  9:00 UTC (permalink / raw)
  To: luca abeni
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On 7/26/19 2:30 PM, luca abeni wrote:
> Hi,
> 
> On Fri, 26 Jul 2019 09:27:52 +0100
> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> [...]
>> @@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
>>  	}
>>  
>>  	deactivate_task(rq, next_task, 0);
>> -	sub_running_bw(&next_task->dl, &rq->dl);
>> -	sub_rq_bw(&next_task->dl, &rq->dl);
>>  	set_task_cpu(next_task, later_rq->cpu);
>> -	add_rq_bw(&next_task->dl, &later_rq->dl);
>>  
>>  	/*
>>  	 * Update the later_rq clock here, because the clock is used
>>  	 * by the cpufreq_update_util() inside __add_running_bw().
>>  	 */
>>  	update_rq_clock(later_rq);
>> -	add_running_bw(&next_task->dl, &later_rq->dl);
> 
> Looking at the code again and thinking a little bit more about this
> issue, I suspect a similar change is needed in pull_dl_task() too, no?

The code looks the same. Let me try to test it. I will add it in v2 then.



* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-29  8:59     ` Dietmar Eggemann
@ 2019-07-29 16:10       ` Peter Zijlstra
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-29 16:10 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: luca abeni, Ingo Molnar, Juri Lelli, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On Mon, Jul 29, 2019 at 09:59:28AM +0100, Dietmar Eggemann wrote:
> On 7/26/19 11:11 AM, luca abeni wrote:
> > Hi Dietmar,
> > 
> > On Fri, 26 Jul 2019 09:27:52 +0100
> > Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> > 
> >> push_dl_task() always calls deactivate_task() with flags=0 which sets
> >> p->on_rq=TASK_ON_RQ_MIGRATING.
> > 
> > Uhm... This is a recent change in the deactivate_task() behaviour,
> > right? Because I tested SCHED_DEADLINE a lot, but I've never seen this
> > issue :)
> 
> Looks like it was v5.2 commit 7dd778841164 ("sched/core: Unify p->on_rq
> updates").

Argh, that wasn't intentional (obviously); please post v2 with that as
Fixes and I'll get it in sched/urgent and stable.


* Re: [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl()
  2019-07-26  8:27 ` [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl() Dietmar Eggemann
@ 2019-07-29 16:35   ` Peter Zijlstra
  2019-07-29 17:12     ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-29 16:35 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Ingo Molnar, Juri Lelli, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On Fri, Jul 26, 2019 at 09:27:53AM +0100, Dietmar Eggemann wrote:
> The int flags parameter is not used in __dequeue_task_dl(). Remove it.

I just posted a patch (series) that actually makes use of it and extends
the flags argument into dequeue_dl_entity().

  https://lkml.kernel.org/r/20190726161357.999133690@infradead.org


* Re: [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization()
  2019-07-26  8:27 ` [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization() Dietmar Eggemann
@ 2019-07-29 16:47   ` Peter Zijlstra
  2019-07-29 17:21     ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-29 16:47 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Ingo Molnar, Juri Lelli, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On Fri, Jul 26, 2019 at 09:27:54AM +0100, Dietmar Eggemann wrote:
> dl_change_utilization() has a BUG_ON() to check that no schedutil
> kthread (sugov) is entering this function. So instead of calling
> sub_running_bw() which checks for the special entity related to a
> sugov thread, call the underlying function __sub_running_bw().
> 
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  kernel/sched/deadline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99d4c24a8637..1fa005f79307 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -164,7 +164,7 @@ void dl_change_utilization(struct task_struct *p, u64 new_bw)
>  
>  	rq = task_rq(p);
>  	if (p->dl.dl_non_contending) {
> -		sub_running_bw(&p->dl, &rq->dl);
> +		__sub_running_bw(p->dl.dl_bw, &rq->dl);
>  		p->dl.dl_non_contending = 0;
>  		/*
>  		 * If the timer handler is currently running and the

I'm confused; the only caller of dl_change_utilization() is
sched_dl_overflow(), and that already checks FLAG_SUGOV and exits before
calling.

So how can this matter?


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-26  8:27 ` [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling Dietmar Eggemann
  2019-07-26  8:37   ` Valentin Schneider
@ 2019-07-29 16:49   ` Peter Zijlstra
  2019-07-30  6:41     ` Juri Lelli
  1 sibling, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-29 16:49 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Ingo Molnar, Juri Lelli, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On Fri, Jul 26, 2019 at 09:27:55AM +0100, Dietmar Eggemann wrote:
> Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> enqueue_dl_entity().
> 
> Move the check that the dl_se is not on the dl_rq from
> __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> side and use the on_dl_rq() helper function.
> 
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  kernel/sched/deadline.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 1fa005f79307..a9cb52ceb761 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
>  	struct sched_dl_entity *entry;
>  	int leftmost = 1;
>  
> -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> -
>  	while (*link) {
>  		parent = *link;
>  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
>  {
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  
> -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> -		return;
> -
>  	rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
>  	RB_CLEAR_NODE(&dl_se->rb_node);
>  
> @@ -1466,6 +1461,9 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
>  
>  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
>  {
> +	if (!on_dl_rq(dl_se))
> +		return;

Why allow double dequeue instead of WARN?

> +
>  	__dequeue_dl_entity(dl_se);
>  }
>  
> -- 
> 2.17.1
> 


* Re: [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-26 10:18   ` luca abeni
@ 2019-07-29 16:54     ` Peter Zijlstra
  2019-07-29 16:59       ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-29 16:54 UTC (permalink / raw)
  To: luca abeni
  Cc: Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On Fri, Jul 26, 2019 at 12:18:19PM +0200, luca abeni wrote:
> Hi Dietmar,
> 
> On Fri, 26 Jul 2019 09:27:56 +0100
> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> 
> > To make the decision whether to set rq or running bw to 0 in underflow
> > case use the return value of SCHED_WARN_ON() rather than an extra if
> > condition.
> 
> I think I tried this at some point, but if I remember well this
> solution does not work correctly when SCHED_DEBUG is not enabled.

Well, it 'works' insofar as it compiles. But it might not be what
one expects. That is, for !SCHED_DEBUG the return value is an
unconditional false.

In this case I think that's fine, the WARN _should_ not be happening.


* Re: [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-29 16:54     ` Peter Zijlstra
@ 2019-07-29 16:59       ` Dietmar Eggemann
  2019-07-30  8:23         ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-29 16:59 UTC (permalink / raw)
  To: Peter Zijlstra, luca abeni
  Cc: Ingo Molnar, Juri Lelli, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On 7/29/19 5:54 PM, Peter Zijlstra wrote:
> On Fri, Jul 26, 2019 at 12:18:19PM +0200, luca abeni wrote:
>> Hi Dietmar,
>>
>> On Fri, 26 Jul 2019 09:27:56 +0100
>> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>
>>> To make the decision whether to set rq or running bw to 0 in underflow
>>> case use the return value of SCHED_WARN_ON() rather than an extra if
>>> condition.
>>
>> I think I tried this at some point, but if I remember well this
>> solution does not work correctly when SCHED_DEBUG is not enabled.
> 
> Well, it 'works' insofar as it compiles. But it might not be what
> one expects. That is, for !SCHED_DEBUG the return value is an
> unconditional false.
> 
> In this case I think that's fine, the WARN _should_ not be happening.

But there is commit 6d3aed3d ("sched/debug: Fix SCHED_WARN_ON() to
return a value on !CONFIG_SCHED_DEBUG as well")?

But it doesn't work with !CONFIG_SCHED_DEBUG.

What compiles and works (at least on Arm64) is:

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4012f98b9d26..494a767a4091 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -78,7 +78,7 @@
 #ifdef CONFIG_SCHED_DEBUG
 # define SCHED_WARN_ON(x)      WARN_ONCE(x, #x)
 #else
-# define SCHED_WARN_ON(x)      ({ (void)(x), 0; })
+# define SCHED_WARN_ON(x)      ({ (void)(x), x; })




* Re: [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl()
  2019-07-29 16:35   ` Peter Zijlstra
@ 2019-07-29 17:12     ` Dietmar Eggemann
  0 siblings, 0 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-29 17:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On 7/29/19 5:35 PM, Peter Zijlstra wrote:
> On Fri, Jul 26, 2019 at 09:27:53AM +0100, Dietmar Eggemann wrote:
>> The int flags parameter is not used in __dequeue_task_dl(). Remove it.
> 
> I just posted a patch (series) that actually makes use of it and extends
> the flags argument into dequeue_dl_entity().
> 
>   https://lkml.kernel.org/r/20190726161357.999133690@infradead.org

I see, I will skip this one then.




* Re: [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization()
  2019-07-29 16:47   ` Peter Zijlstra
@ 2019-07-29 17:21     ` Dietmar Eggemann
  0 siblings, 0 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-29 17:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On 7/29/19 5:47 PM, Peter Zijlstra wrote:
> On Fri, Jul 26, 2019 at 09:27:54AM +0100, Dietmar Eggemann wrote:
>> dl_change_utilization() has a BUG_ON() to check that no schedutil
>> kthread (sugov) is entering this function. So instead of calling
>> sub_running_bw() which checks for the special entity related to a
>> sugov thread, call the underlying function __sub_running_bw().
>>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>> ---
>>  kernel/sched/deadline.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index 99d4c24a8637..1fa005f79307 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -164,7 +164,7 @@ void dl_change_utilization(struct task_struct *p, u64 new_bw)
>>  
>>  	rq = task_rq(p);
>>  	if (p->dl.dl_non_contending) {
>> -		sub_running_bw(&p->dl, &rq->dl);
>> +		__sub_running_bw(p->dl.dl_bw, &rq->dl);
>>  		p->dl.dl_non_contending = 0;
>>  		/*
>>  		 * If the timer handler is currently running and the
> 
> I'm confused; the only caller of dl_change_utilization() is
> sched_dl_overflow(), and that already checks FLAG_SUGOV and exits before
> calling.

That's right. There is even a BUG_ON(p->dl.flags & SCHED_FLAG_SUGOV) at
the beginning of dl_change_utilization().

> So how can this matter?

We save the 'if (!dl_entity_is_special(dl_se))' from sub_running_bw()
when we call __sub_running_bw() since we know it can't be a special task.

Later in dl_change_utilization() we already use the inner bw accounting
functions __sub_rq_bw() and __add_rq_bw().


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-29 16:49   ` Peter Zijlstra
@ 2019-07-30  6:41     ` Juri Lelli
  2019-07-30  8:21       ` Peter Zijlstra
  0 siblings, 1 reply; 31+ messages in thread
From: Juri Lelli @ 2019-07-30  6:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dietmar Eggemann, Ingo Molnar, Luca Abeni,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On 29/07/19 18:49, Peter Zijlstra wrote:
> On Fri, Jul 26, 2019 at 09:27:55AM +0100, Dietmar Eggemann wrote:
> > Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> > enqueue_dl_entity().
> > 
> > Move the check that the dl_se is not on the dl_rq from
> > __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> > side and use the on_dl_rq() helper function.
> > 
> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > ---
> >  kernel/sched/deadline.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 1fa005f79307..a9cb52ceb761 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
> >  	struct sched_dl_entity *entry;
> >  	int leftmost = 1;
> >  
> > -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> > -
> >  	while (*link) {
> >  		parent = *link;
> >  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> > @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
> >  {
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  
> > -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> > -		return;
> > -
> >  	rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
> >  	RB_CLEAR_NODE(&dl_se->rb_node);
> >  
> > @@ -1466,6 +1461,9 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
> >  
> >  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
> >  {
> > +	if (!on_dl_rq(dl_se))
> > +		return;
> 
> Why allow double dequeue instead of WARN?

As I was saying to Valentin, it can currently happen that a task could
have already been dequeued by update_curr_dl()->throttle called by
dequeue_task_dl() before calling __dequeue_task_dl(). Do you think we
should check for this condition before calling into dequeue_dl_entity()?


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-30  6:41     ` Juri Lelli
@ 2019-07-30  8:21       ` Peter Zijlstra
  2019-07-31 17:32         ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-30  8:21 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Dietmar Eggemann, Ingo Molnar, Luca Abeni,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On Tue, Jul 30, 2019 at 08:41:15AM +0200, Juri Lelli wrote:
> On 29/07/19 18:49, Peter Zijlstra wrote:
> > On Fri, Jul 26, 2019 at 09:27:55AM +0100, Dietmar Eggemann wrote:
> > > Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
> > > enqueue_dl_entity().
> > > 
> > > Move the check that the dl_se is not on the dl_rq from
> > > __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
> > > side and use the on_dl_rq() helper function.
> > > 
> > > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > > ---
> > >  kernel/sched/deadline.c | 8 +++-----
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > > index 1fa005f79307..a9cb52ceb761 100644
> > > --- a/kernel/sched/deadline.c
> > > +++ b/kernel/sched/deadline.c
> > > @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
> > >  	struct sched_dl_entity *entry;
> > >  	int leftmost = 1;
> > >  
> > > -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
> > > -
> > >  	while (*link) {
> > >  		parent = *link;
> > >  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
> > > @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
> > >  {
> > >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> > >  
> > > -	if (RB_EMPTY_NODE(&dl_se->rb_node))
> > > -		return;
> > > -
> > >  	rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
> > >  	RB_CLEAR_NODE(&dl_se->rb_node);
> > >  
> > > @@ -1466,6 +1461,9 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
> > >  
> > >  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
> > >  {
> > > +	if (!on_dl_rq(dl_se))
> > > +		return;
> > 
> > Why allow double dequeue instead of WARN?
> 
> As I was saying to Valentin, it can currently happen that a task could
> have already been dequeued by update_curr_dl()->throttle called by
> dequeue_task_dl() before calling __dequeue_task_dl(). Do you think we
> should check for this condition before calling into dequeue_dl_entity()?

Yes, that's what ->dl_throttled is for, right? And !->dl_throttled &&
!on_dl_rq() is a BUG.


* Re: [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-29 16:59       ` Dietmar Eggemann
@ 2019-07-30  8:23         ` Peter Zijlstra
  2019-07-30 16:08           ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Zijlstra @ 2019-07-30  8:23 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: luca abeni, Ingo Molnar, Juri Lelli, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On Mon, Jul 29, 2019 at 05:59:04PM +0100, Dietmar Eggemann wrote:
> On 7/29/19 5:54 PM, Peter Zijlstra wrote:
> > On Fri, Jul 26, 2019 at 12:18:19PM +0200, luca abeni wrote:
> >> Hi Dietmar,
> >>
> >> On Fri, 26 Jul 2019 09:27:56 +0100
> >> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> >>
> >>> To make the decision whether to set rq or running bw to 0 in underflow
> >>> case use the return value of SCHED_WARN_ON() rather than an extra if
> >>> condition.
> >>
> >> I think I tried this at some point, but if I remember well this
> >> solution does not work correctly when SCHED_DEBUG is not enabled.
> > 
> > Well, it 'works' insofar as it compiles. But it might not be what
> > one expects. That is, for !SCHED_DEBUG the return value is an
> > unconditional false.
> > 
> > In this case I think that's fine, the WARN _should_ not be happening.
> 
> But there is commit 6d3aed3d ("sched/debug: Fix SCHED_WARN_ON() to
> return a value on !CONFIG_SCHED_DEBUG as well")?
> 
> But it doesn't work with !CONFIG_SCHED_DEBUG.
> 
> What compiles and works (at least on Arm64) is:
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4012f98b9d26..494a767a4091 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -78,7 +78,7 @@
>  #ifdef CONFIG_SCHED_DEBUG
>  # define SCHED_WARN_ON(x)      WARN_ONCE(x, #x)
>  #else
> -# define SCHED_WARN_ON(x)      ({ (void)(x), 0; })
> +# define SCHED_WARN_ON(x)      ({ (void)(x), x; })

Why doesn't the ,0 compile? That should work just fine. The thing is,
the two existing users:

kernel/sched/fair.c:            if (SCHED_WARN_ON(!se->on_rq))
kernel/sched/fair.c:            if (SCHED_WARN_ON(!se->on_rq))

seem to compile just fine with it.


* Re: [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting
  2019-07-30  8:23         ` Peter Zijlstra
@ 2019-07-30 16:08           ` Dietmar Eggemann
  0 siblings, 0 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-30 16:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: luca abeni, Ingo Molnar, Juri Lelli, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On 7/30/19 9:23 AM, Peter Zijlstra wrote:
> On Mon, Jul 29, 2019 at 05:59:04PM +0100, Dietmar Eggemann wrote:
>> On 7/29/19 5:54 PM, Peter Zijlstra wrote:
>>> On Fri, Jul 26, 2019 at 12:18:19PM +0200, luca abeni wrote:
>>>> Hi Dietmar,
>>>>
>>>> On Fri, 26 Jul 2019 09:27:56 +0100
>>>> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>>>
>>>>> To make the decision whether to set rq or running bw to 0 in underflow
>>>>> case use the return value of SCHED_WARN_ON() rather than an extra if
>>>>> condition.
>>>>
>>>> I think I tried this at some point, but if I remember well this
>>>> solution does not work correctly when SCHED_DEBUG is not enabled.
>>>
>>> Well, it 'works' insofar as it compiles. But it might not be what
>>> one expects. That is, for !SCHED_DEBUG the return value is an
>>> unconditional false.
>>>
>>> In this case I think that's fine, the WARN _should_ not be happening.
>>
>> But there is commit 6d3aed3d ("sched/debug: Fix SCHED_WARN_ON() to
>> return a value on !CONFIG_SCHED_DEBUG as well")?
>>
>> But it doesn't work with !CONFIG_SCHED_DEBUG.
>>
>> What compiles and works (at least on Arm64) is:
>>
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 4012f98b9d26..494a767a4091 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -78,7 +78,7 @@
>>  #ifdef CONFIG_SCHED_DEBUG
>>  # define SCHED_WARN_ON(x)      WARN_ONCE(x, #x)
>>  #else
>> -# define SCHED_WARN_ON(x)      ({ (void)(x), 0; })
>> +# define SCHED_WARN_ON(x)      ({ (void)(x), x; })
> 
> Why doesn't the ,0 compile? That should work just fine. The thing is,
> the two existing users:
> 
> kernel/sched/fair.c:            if (SCHED_WARN_ON(!se->on_rq))
> kernel/sched/fair.c:            if (SCHED_WARN_ON(!se->on_rq))
> 
> seem to compile just fine with it.

You're right, compiling is not an issue. But it looks like the code does
different things depending on CONFIG_SCHED_DEBUG.

E.g. in set_last_buddy() we would return if '!se->on_rq' with
CONFIG_SCHED_DEBUG and continue the for_each_sched_entity() with
!CONFIG_SCHED_DEBUG.

IMHO, this is what forced Luca, e.g. in __sub_running_bw(), to write:

SCHED_WARN_ON(dl_rq->running_bw > old);
if (dl_rq->running_bw > old)
    dl_rq->running_bw = 0;

and not:

if (SCHED_WARN_ON(dl_rq->running_bw > old))
    dl_rq->running_bw = 0;

I experimented with '# define SCHED_WARN_ON(x) ({ (void)(x), x; })' and
in this case code inside an 'if (SCHED_WARN_ON(cond))' is also executed
with !CONFIG_SCHED_DEBUG.


* Re: [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task()
  2019-07-29  9:00     ` Dietmar Eggemann
@ 2019-07-31 10:32       ` Dietmar Eggemann
  0 siblings, 0 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-31 10:32 UTC (permalink / raw)
  To: luca abeni
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On 7/29/19 10:00 AM, Dietmar Eggemann wrote:
> On 7/26/19 2:30 PM, luca abeni wrote:
>> Hi,
>>
>> On Fri, 26 Jul 2019 09:27:52 +0100
>> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> [...]
>>> @@ -2121,17 +2121,13 @@ static int push_dl_task(struct rq *rq)
>>>  	}
>>>  
>>>  	deactivate_task(rq, next_task, 0);
>>> -	sub_running_bw(&next_task->dl, &rq->dl);
>>> -	sub_rq_bw(&next_task->dl, &rq->dl);
>>>  	set_task_cpu(next_task, later_rq->cpu);
>>> -	add_rq_bw(&next_task->dl, &later_rq->dl);
>>>  
>>>  	/*
>>>  	 * Update the later_rq clock here, because the clock is used
>>>  	 * by the cpufreq_update_util() inside __add_running_bw().
>>>  	 */
>>>  	update_rq_clock(later_rq);
>>> -	add_running_bw(&next_task->dl, &later_rq->dl);
>>
>> Looking at the code again and thinking a little bit more about this
>> issue, I suspect a similar change is needed in pull_dl_task() too, no?
> 
> The code looks the same. Let me try to test it. I will add it in v2 then.

Like you expected, it happens on the pull side as well.

[  129.813720] --> CPU7: pull_dl_task: p=[thread0-6 1313] dl_bw=125829
src_rq->dl.running_bw=251658 this_rq->dl.running_bw=125829
[  129.825167] <-- CPU7: pull_dl_task: p=[thread0-6 1313] dl_bw=125829
src_rq->dl.running_bw=0 this_rq->dl.running_bw=377487
...
[  129.948528] dl_rq->running_bw > old
[  129.948568] WARNING: CPU: 7 PID: 0 at kernel/sched/deadline.c:117
inactive_task_timer+0x5e8/0x6b8
...
[  130.077158]  inactive_task_timer+0x5e8/0x6b8
[  130.081427]  __hrtimer_run_queues+0x12c/0x398
[  130.085782]  hrtimer_interrupt+0xfc/0x330
...


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-30  8:21       ` Peter Zijlstra
@ 2019-07-31 17:32         ` Dietmar Eggemann
  2019-07-31 20:20           ` luca abeni
  0 siblings, 1 reply; 31+ messages in thread
From: Dietmar Eggemann @ 2019-07-31 17:32 UTC (permalink / raw)
  To: Peter Zijlstra, Juri Lelli
  Cc: Ingo Molnar, Luca Abeni, Daniel Bristot de Oliveira,
	Valentin Schneider, Qais Yousef, linux-kernel

On 7/30/19 9:21 AM, Peter Zijlstra wrote:
> On Tue, Jul 30, 2019 at 08:41:15AM +0200, Juri Lelli wrote:
>> On 29/07/19 18:49, Peter Zijlstra wrote:
>>> On Fri, Jul 26, 2019 at 09:27:55AM +0100, Dietmar Eggemann wrote:
>>>> Remove BUG_ON() in __enqueue_dl_entity() since there is already one in
>>>> enqueue_dl_entity().
>>>>
>>>> Move the check that the dl_se is not on the dl_rq from
>>>> __dequeue_dl_entity() to dequeue_dl_entity() to align with the enqueue
>>>> side and use the on_dl_rq() helper function.
>>>>
>>>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>>>> ---
>>>>  kernel/sched/deadline.c | 8 +++-----
>>>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>>> index 1fa005f79307..a9cb52ceb761 100644
>>>> --- a/kernel/sched/deadline.c
>>>> +++ b/kernel/sched/deadline.c
>>>> @@ -1407,8 +1407,6 @@ static void __enqueue_dl_entity(struct sched_dl_entity *dl_se)
>>>>  	struct sched_dl_entity *entry;
>>>>  	int leftmost = 1;
>>>>  
>>>> -	BUG_ON(!RB_EMPTY_NODE(&dl_se->rb_node));
>>>> -
>>>>  	while (*link) {
>>>>  		parent = *link;
>>>>  		entry = rb_entry(parent, struct sched_dl_entity, rb_node);
>>>> @@ -1430,9 +1428,6 @@ static void __dequeue_dl_entity(struct sched_dl_entity *dl_se)
>>>>  {
>>>>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>>>>  
>>>> -	if (RB_EMPTY_NODE(&dl_se->rb_node))
>>>> -		return;
>>>> -
>>>>  	rb_erase_cached(&dl_se->rb_node, &dl_rq->root);
>>>>  	RB_CLEAR_NODE(&dl_se->rb_node);
>>>>  
>>>> @@ -1466,6 +1461,9 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
>>>>  
>>>>  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
>>>>  {
>>>> +	if (!on_dl_rq(dl_se))
>>>> +		return;
>>>
>>> Why allow double dequeue instead of WARN?
>>
>> As I was saying to Valentin, it can currently happen that a task could
>> have already been dequeued by update_curr_dl()->throttle called by
>> dequeue_task_dl() before calling __dequeue_task_dl(). Do you think we
>> should check for this condition before calling into dequeue_dl_entity()?
> 
> Yes, that's what ->dl_throttled is for, right? And !->dl_throttled &&
> !on_dl_rq() is a BUG.

OK, I will add the following snippet to the patch.
Although it's easy to provoke a situation in which DL tasks are throttled,
I haven't seen throttling happen while a task is being dequeued.

--->8---

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b6d2f263e0a4..a009762097fa 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1507,8 +1507,7 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se,
 
 static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
 {
-       if (!on_dl_rq(dl_se))
-               return;
+       BUG_ON(!on_dl_rq(dl_se));
 
        __dequeue_dl_entity(dl_se);
 }
@@ -1592,6 +1591,10 @@ static void __dequeue_task_dl(struct rq *rq, struct task_struct *p)
 static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
        update_curr_dl(rq);
+
+       if (p->dl.dl_throttled)
+               return;
+
        __dequeue_task_dl(rq, p);


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-31 17:32         ` Dietmar Eggemann
@ 2019-07-31 20:20           ` luca abeni
  2019-08-01 16:01             ` Dietmar Eggemann
  0 siblings, 1 reply; 31+ messages in thread
From: luca abeni @ 2019-07-31 20:20 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Peter Zijlstra, Juri Lelli, Ingo Molnar,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On Wed, 31 Jul 2019 18:32:47 +0100
Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
[...]
> >>>>  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
> >>>>  {
> >>>> +	if (!on_dl_rq(dl_se))
> >>>> +		return;  
> >>>
> >>> Why allow double dequeue instead of WARN?  
> >>
> >> As I was saying to Valentin, it can currently happen that a task
> >> could have already been dequeued by update_curr_dl()->throttle
> >> called by dequeue_task_dl() before calling __dequeue_task_dl(). Do
> >> you think we should check for this condition before calling into
> >> dequeue_dl_entity()?  
> > 
> > Yes, that's what ->dl_throttled is for, right? And !->dl_throttled
> > && !on_dl_rq() is a BUG.  
> 
> OK, I will add the following snippet to the patch.
> Although it's easy to provoke a situation in which DL tasks are
> throttled, I haven't seen throttling happen while a task is being
> dequeued.

This is a not-so-common situation that can happen with periodic tasks
(à la rt-app) blocking on clock_nanosleep() (or similar) after
executing for an amount of time comparable with the SCHED_DEADLINE
runtime.

It might happen that the task consumed a little bit more than the
remaining runtime (but has not been throttled yet, because the
accounting happens at every tick)... So, when dequeue_task_dl() invokes
update_curr_dl() the runtime becomes negative and the task is throttled.

This happens infrequently, but if you try rt-app tasksets with multiple
tasks and execution times close to the runtime, you will see it happen
sooner or later.


[...]
> @@ -1592,6 +1591,10 @@ static void __dequeue_task_dl(struct rq *rq,
> struct task_struct *p) static void dequeue_task_dl(struct rq *rq,
> struct task_struct *p, int flags) {
>         update_curr_dl(rq);
> +
> +       if (p->dl.dl_throttled)
> +               return;

Sorry, I missed part of the previous discussion, so maybe I am missing
something... But I suspect this "return" might be wrong (you risk
missing a call to task_non_contending(), which comes later in this function).

Maybe you could use
	if (!p->dl.dl_throttled)
		__dequeue_task_dl(rq, p);

Or did I misunderstand something?



			Thanks,
				Luca


* Re: [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling
  2019-07-31 20:20           ` luca abeni
@ 2019-08-01 16:01             ` Dietmar Eggemann
  0 siblings, 0 replies; 31+ messages in thread
From: Dietmar Eggemann @ 2019-08-01 16:01 UTC (permalink / raw)
  To: luca abeni
  Cc: Peter Zijlstra, Juri Lelli, Ingo Molnar,
	Daniel Bristot de Oliveira, Valentin Schneider, Qais Yousef,
	linux-kernel

On 7/31/19 9:20 PM, luca abeni wrote:
> On Wed, 31 Jul 2019 18:32:47 +0100
> Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> [...]
>>>>>>  static void dequeue_dl_entity(struct sched_dl_entity *dl_se)
>>>>>>  {
>>>>>> +	if (!on_dl_rq(dl_se))
>>>>>> +		return;  
>>>>>
>>>>> Why allow double dequeue instead of WARN?  
>>>>
>>>> As I was saying to Valentin, it can currently happen that a task
>>>> could have already been dequeued by update_curr_dl()->throttle
>>>> called by dequeue_task_dl() before calling __dequeue_task_dl(). Do
>>>> you think we should check for this condition before calling into
>>>> dequeue_dl_entity()?  
>>>
>>> Yes, that's what ->dl_throttled is for, right? And !->dl_throttled
>>> && !on_dl_rq() is a BUG.  
>>
>> OK, I will add the following snippet to the patch.
>> Although it's easy to provoke a situation in which DL tasks are
>> throttled, I haven't seen a throttling happening when the task is
>> being dequeued.
> 
> This is a not-so-common situation that can happen with periodic tasks
> (à la rt-app) blocking on clock_nanosleep() (or similar) after
> executing for an amount of time comparable with the SCHED_DEADLINE
> runtime.
> 
> It might happen that the task consumed a little bit more than the
> remaining runtime (but has not been throttled yet, because the
> accounting happens at every tick)... So, when dequeue_task_dl() invokes
> update_curr_dl() the runtime becomes negative and the task is throttled.
> 
> This happens infrequently, but if you try rt-app tasksets with multiple
> tasks and execution times close to the runtime, you will see it happen
> sooner or later.
> 
> 
> [...]
>> @@ -1592,6 +1591,10 @@ static void __dequeue_task_dl(struct rq *rq,
>> struct task_struct *p) static void dequeue_task_dl(struct rq *rq,
>> struct task_struct *p, int flags) {
>>         update_curr_dl(rq);
>> +
>> +       if (p->dl.dl_throttled)
>> +               return;
> 
> Sorry, I missed part of the previous discussion, so maybe I am missing
> something... But I suspect this "return" might be wrong (you risk to
> miss a call to task_non_contending(), coming later in this function).
> 
> Maybe you could use
> 	if (!p->dl.dl_throttled)
> 		__dequeue_task_dl(rq, p);
> 

I see. With the following rt-app file on h960 (8 CPUs) I'm able to
recreate the situation relatively frequently.

...
"tasks" : {
 "thread0" : {
  "instance" : 12,
  "run" : 11950,
  "timer" : { "ref" : "unique", "period" : 100000, "mode" : "absolute"},
  "dl-runtime" : 12000,
  "dl-period" : 100000,
  "dl-deadline" : 100000
 }
}

...
[ 1912.086664] CPU1: p=[thread0-9 3070] throttled p->on_rq=0 flags=0x9
[ 1912.086726] CPU2: p=[thread0-10 3071] throttled p->on_rq=0 flags=0x9
[ 1924.738912] CPU6: p=[thread0-10 3149] throttled p->on_rq=0 flags=0x9
...

And the DEQUEUE_SLEEP flag is set, so, as you said,
task_non_contending(p) should still be called.

I'm going to use your proposal. Thank you for the help!

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-26  8:27 [PATCH 0/5] sched/deadline: Fix double accounting in push_dl_task() & some cleanups Dietmar Eggemann
2019-07-26  8:27 ` [PATCH 1/5] sched/deadline: Fix double accounting of rq/running bw in push_dl_task() Dietmar Eggemann
2019-07-26 10:11   ` luca abeni
2019-07-29  8:59     ` Dietmar Eggemann
2019-07-29 16:10       ` Peter Zijlstra
2019-07-26 13:30   ` luca abeni
2019-07-29  9:00     ` Dietmar Eggemann
2019-07-31 10:32       ` Dietmar Eggemann
2019-07-26  8:27 ` [PATCH 2/5] sched/deadline: Remove unused int flags from __dequeue_task_dl() Dietmar Eggemann
2019-07-29 16:35   ` Peter Zijlstra
2019-07-29 17:12     ` Dietmar Eggemann
2019-07-26  8:27 ` [PATCH 3/5] sched/deadline: Use __sub_running_bw() throughout dl_change_utilization() Dietmar Eggemann
2019-07-29 16:47   ` Peter Zijlstra
2019-07-29 17:21     ` Dietmar Eggemann
2019-07-26  8:27 ` [PATCH 4/5] sched/deadline: Cleanup on_dl_rq() handling Dietmar Eggemann
2019-07-26  8:37   ` Valentin Schneider
2019-07-26  8:58     ` Qais Yousef
2019-07-26  9:20     ` Juri Lelli
2019-07-26  9:32       ` Valentin Schneider
2019-07-29 16:49   ` Peter Zijlstra
2019-07-30  6:41     ` Juri Lelli
2019-07-30  8:21       ` Peter Zijlstra
2019-07-31 17:32         ` Dietmar Eggemann
2019-07-31 20:20           ` luca abeni
2019-08-01 16:01             ` Dietmar Eggemann
2019-07-26  8:27 ` [PATCH 5/5] sched/deadline: Use return value of SCHED_WARN_ON() in bw accounting Dietmar Eggemann
2019-07-26 10:18   ` luca abeni
2019-07-29 16:54     ` Peter Zijlstra
2019-07-29 16:59       ` Dietmar Eggemann
2019-07-30  8:23         ` Peter Zijlstra
2019-07-30 16:08           ` Dietmar Eggemann
