linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/fair : prevent unlimited runtime on throttled group
@ 2020-01-14 14:13 Vincent Guittot
  2020-01-14 18:29 ` bsegall
  2020-01-29 11:32 ` [tip: sched/core] sched/fair: Prevent " tip-bot2 for Vincent Guittot
  0 siblings, 2 replies; 5+ messages in thread
From: Vincent Guittot @ 2020-01-14 14:13 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel
  Cc: Vincent Guittot

When a running task is moved on a throttled task group and there is no
other task enqueued on the CPU, the task can keep running using 100% CPU
whatever the allocated bandwidth for the group and although its cfs rq is
throttled. Furthermore, the group entity of the cfs_rq and its parents are
not enqueued but only set as curr on their respective cfs_rqs.

We have the following sequence:

sched_move_task
  -dequeue_task: dequeue task and group_entities.
  -put_prev_task: put task and group entities.
  -sched_change_group: move task to new group.
  -enqueue_task: enqueue only task but not group entities because cfs_rq is
    throttled.
  -set_next_task : set task and group_entities as current sched_entity of
    their cfs_rq.

Another impact is that the root cfs_rq runnable_load_avg at root rq stays
null because the group_entities are not enqueued. This situation will stay
the same until an "external" event triggers a reschedule. Let trigger it
immediately instead.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e7b08d52db93..d0acc67336c0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7062,8 +7062,15 @@ void sched_move_task(struct task_struct *tsk)
 
 	if (queued)
 		enqueue_task(rq, tsk, queue_flags);
-	if (running)
+	if (running) {
 		set_next_task(rq, tsk);
+		/*
+		 * After changing group, the running task may have joined a
+		 * throttled one but it's still the running task. Trigger a
+		 * resched to make sure that task can still run.
+		 */
+		resched_curr(rq);
+	}
 
 	task_rq_unlock(rq, tsk, &rf);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched/fair : prevent unlimited runtime on throttled group
  2020-01-14 14:13 [PATCH] sched/fair : prevent unlimited runtime on throttled group Vincent Guittot
@ 2020-01-14 18:29 ` bsegall
  2020-01-20  9:46   ` Peter Zijlstra
  2020-01-29 11:32 ` [tip: sched/core] sched/fair: Prevent " tip-bot2 for Vincent Guittot
  1 sibling, 1 reply; 5+ messages in thread
From: bsegall @ 2020-01-14 18:29 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel

Vincent Guittot <vincent.guittot@linaro.org> writes:

> When a running task is moved on a throttled task group and there is no
> other task enqueued on the CPU, the task can keep running using 100% CPU
> whatever the allocated bandwidth for the group and although its cfs rq is
> throttled. Furthermore, the group entity of the cfs_rq and its parents are
> not enqueued but only set as curr on their respective cfs_rqs.
>
> We have the following sequence:
>
> sched_move_task
>   -dequeue_task: dequeue task and group_entities.
>   -put_prev_task: put task and group entities.
>   -sched_change_group: move task to new group.
>   -enqueue_task: enqueue only task but not group entities because cfs_rq is
>     throttled.
>   -set_next_task : set task and group_entities as current sched_entity of
>     their cfs_rq.
>
> Another impact is that the root cfs_rq runnable_load_avg at root rq stays
> null because the group_entities are not enqueued. This situation will stay
> the same until an "external" event triggers a reschedule. Let trigger it
> immediately instead.

Sounds reasonable to me, "moved group" being an explicit resched check
doesn't sound like a problem in general.

>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/core.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e7b08d52db93..d0acc67336c0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7062,8 +7062,15 @@ void sched_move_task(struct task_struct *tsk)
>  
>  	if (queued)
>  		enqueue_task(rq, tsk, queue_flags);
> -	if (running)
> +	if (running) {
>  		set_next_task(rq, tsk);
> +		/*
> +		 * After changing group, the running task may have joined a
> +		 * throttled one but it's still the running task. Trigger a
> +		 * resched to make sure that task can still run.
> +		 */
> +		resched_curr(rq);
> +	}
>  
>  	task_rq_unlock(rq, tsk, &rf);
>  }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched/fair : prevent unlimited runtime on throttled group
  2020-01-14 18:29 ` bsegall
@ 2020-01-20  9:46   ` Peter Zijlstra
  2020-01-21 18:26     ` bsegall
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2020-01-20  9:46 UTC (permalink / raw)
  To: bsegall
  Cc: Vincent Guittot, mingo, juri.lelli, dietmar.eggemann, rostedt,
	mgorman, linux-kernel

On Tue, Jan 14, 2020 at 10:29:43AM -0800, bsegall@google.com wrote:
> Vincent Guittot <vincent.guittot@linaro.org> writes:
> 
> > When a running task is moved on a throttled task group and there is no
> > other task enqueued on the CPU, the task can keep running using 100% CPU
> > whatever the allocated bandwidth for the group and although its cfs rq is
> > throttled. Furthermore, the group entity of the cfs_rq and its parents are
> > not enqueued but only set as curr on their respective cfs_rqs.
> >
> > We have the following sequence:
> >
> > sched_move_task
> >   -dequeue_task: dequeue task and group_entities.
> >   -put_prev_task: put task and group entities.
> >   -sched_change_group: move task to new group.
> >   -enqueue_task: enqueue only task but not group entities because cfs_rq is
> >     throttled.
> >   -set_next_task : set task and group_entities as current sched_entity of
> >     their cfs_rq.
> >
> > Another impact is that the root cfs_rq runnable_load_avg at root rq stays
> > null because the group_entities are not enqueued. This situation will stay
> > the same until an "external" event triggers a reschedule. Let trigger it
> > immediately instead.
> 
> Sounds reasonable to me, "moved group" being an explicit resched check
> doesn't sound like a problem in general.

Do I read that as an Ack from you Ben? :-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched/fair : prevent unlimited runtime on throttled group
  2020-01-20  9:46   ` Peter Zijlstra
@ 2020-01-21 18:26     ` bsegall
  0 siblings, 0 replies; 5+ messages in thread
From: bsegall @ 2020-01-21 18:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: bsegall, Vincent Guittot, mingo, juri.lelli, dietmar.eggemann,
	rostedt, mgorman, linux-kernel

Peter Zijlstra <peterz@infradead.org> writes:

> On Tue, Jan 14, 2020 at 10:29:43AM -0800, bsegall@google.com wrote:
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>> 
>> > When a running task is moved on a throttled task group and there is no
>> > other task enqueued on the CPU, the task can keep running using 100% CPU
>> > whatever the allocated bandwidth for the group and although its cfs rq is
>> > throttled. Furthermore, the group entity of the cfs_rq and its parents are
>> > not enqueued but only set as curr on their respective cfs_rqs.
>> >
>> > We have the following sequence:
>> >
>> > sched_move_task
>> >   -dequeue_task: dequeue task and group_entities.
>> >   -put_prev_task: put task and group entities.
>> >   -sched_change_group: move task to new group.
>> >   -enqueue_task: enqueue only task but not group entities because cfs_rq is
>> >     throttled.
>> >   -set_next_task : set task and group_entities as current sched_entity of
>> >     their cfs_rq.
>> >
>> > Another impact is that the root cfs_rq runnable_load_avg at root rq stays
>> > null because the group_entities are not enqueued. This situation will stay
>> > the same until an "external" event triggers a reschedule. Let trigger it
>> > immediately instead.
>> 
>> Sounds reasonable to me, "moved group" being an explicit resched check
>> doesn't sound like a problem in general.
>
> Do I read that as an Ack from you Ben? :-)

Yeah,

Acked-by: Ben Segall <bsegall@google.com>

The only question I see is if we care about avoiding the overhead for
non-cfsb cases, but cgroup attach is already slow enough that it's
probably not a real problem, and it's reasonable to check if it's still
right to run this task in general.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: sched/core] sched/fair: Prevent unlimited runtime on throttled group
  2020-01-14 14:13 [PATCH] sched/fair : prevent unlimited runtime on throttled group Vincent Guittot
  2020-01-14 18:29 ` bsegall
@ 2020-01-29 11:32 ` tip-bot2 for Vincent Guittot
  1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2020-01-29 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Vincent Guittot, Peter Zijlstra (Intel),
	Ingo Molnar, Ben Segall, x86, LKML

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     2a4b03ffc69f2dedc6388e9a6438b5f4c133a40d
Gitweb:        https://git.kernel.org/tip/2a4b03ffc69f2dedc6388e9a6438b5f4c133a40d
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Tue, 14 Jan 2020 15:13:56 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 28 Jan 2020 21:36:58 +01:00

sched/fair: Prevent unlimited runtime on throttled group

When a running task is moved on a throttled task group and there is no
other task enqueued on the CPU, the task can keep running using 100% CPU
whatever the allocated bandwidth for the group and although its cfs rq is
throttled. Furthermore, the group entity of the cfs_rq and its parents are
not enqueued but only set as curr on their respective cfs_rqs.

We have the following sequence:

sched_move_task
  -dequeue_task: dequeue task and group_entities.
  -put_prev_task: put task and group entities.
  -sched_change_group: move task to new group.
  -enqueue_task: enqueue only task but not group entities because cfs_rq is
    throttled.
  -set_next_task : set task and group_entities as current sched_entity of
    their cfs_rq.

Another impact is that the root cfs_rq runnable_load_avg at root rq stays
null because the group_entities are not enqueued. This situation will stay
the same until an "external" event triggers a reschedule. Let trigger it
immediately instead.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/1579011236-31256-1-git-send-email-vincent.guittot@linaro.org
---
 kernel/sched/core.c |  9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a8a5d5b..89e54f3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7072,8 +7072,15 @@ void sched_move_task(struct task_struct *tsk)
 
 	if (queued)
 		enqueue_task(rq, tsk, queue_flags);
-	if (running)
+	if (running) {
 		set_next_task(rq, tsk);
+		/*
+		 * After changing group, the running task may have joined a
+		 * throttled one but it's still the running task. Trigger a
+		 * resched to make sure that task can still run.
+		 */
+		resched_curr(rq);
+	}
 
 	task_rq_unlock(rq, tsk, &rf);
 }

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-01-29 11:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 14:13 [PATCH] sched/fair : prevent unlimited runtime on throttled group Vincent Guittot
2020-01-14 18:29 ` bsegall
2020-01-20  9:46   ` Peter Zijlstra
2020-01-21 18:26     ` bsegall
2020-01-29 11:32 ` [tip: sched/core] sched/fair: Prevent " tip-bot2 for Vincent Guittot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).