linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
@ 2018-01-22 19:26 Tejun Heo
  2018-01-29 19:15 ` Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Tejun Heo @ 2018-01-22 19:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, cgroups, Li Zefan, Johannes Weiner

While adding cgroup2 interface for the cpu controller, 0d5936344f30
("sched: Implement interface for cgroup unified hierarchy") forgot to
update input validation and left it to reject cpu.max config if any
descendant has set a higher value.

cgroup2 officially supports delegation and a descendant must not be
able to restrict what its ancestors can configure.  For absolute
limits such as cpu.max and memory.max, this means that the config at
each level should only act as the upper limit at that level and
shouldn't interfere with what other cgroups can configure.

This patch updates config validation on cgroup2 so that the cpu
controller follows the same convention.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")
---
 kernel/sched/core.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a7bf32a..e06e484 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6610,13 +6610,18 @@ static int tg_cfs_schedulable_down(struct task_group *tg, void *data)
 		parent_quota = parent_b->hierarchical_quota;
 
 		/*
-		 * Ensure max(child_quota) <= parent_quota, inherit when no
+		 * Ensure max(child_quota) <= parent_quota.  On cgroup2,
+		 * always take the min.  On cgroup1, only inherit when no
 		 * limit is set:
 		 */
-		if (quota == RUNTIME_INF)
-			quota = parent_quota;
-		else if (parent_quota != RUNTIME_INF && quota > parent_quota)
-			return -EINVAL;
+		if (cgroup_subsys_on_dfl(cpu_cgrp_subsys)) {
+			quota = min(quota, parent_quota);
+		} else {
+			if (quota == RUNTIME_INF)
+				quota = parent_quota;
+			else if (parent_quota != RUNTIME_INF && quota > parent_quota)
+				return -EINVAL;
+		}
 	}
 	cfs_b->hierarchical_quota = quota;
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-01-22 19:26 [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors Tejun Heo
@ 2018-01-29 19:15 ` Tejun Heo
  2018-01-30 10:21 ` Peter Zijlstra
  2018-02-12 17:24 ` Tejun Heo
  2 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2018-01-29 19:15 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, cgroups, Li Zefan, Johannes Weiner

Hello,

On Mon, Jan 22, 2018 at 11:26:18AM -0800, Tejun Heo wrote:
> While adding cgroup2 interface for the cpu controller, 0d5936344f30
> ("sched: Implement interface for cgroup unified hierarchy") forgot to
> update input validation and left it to reject cpu.max config if any
> descendant has set a higher value.
> 
> cgroup2 officially supports delegation and a descendant must not be
> able to restrict what its ancestors can configure.  For absolute
> limits such as cpu.max and memory.max, this means that the config at
> each level should only act as the upper limit at that level and
> shouldn't interfere with what other cgroups can configure.
> 
> This patch updates config validation on cgroup2 so that the cpu
> controller follows the same convention.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")

Peter, can you please take a look?  I can route it through cgroup tree
if that's preferable.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-01-22 19:26 [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors Tejun Heo
  2018-01-29 19:15 ` Tejun Heo
@ 2018-01-30 10:21 ` Peter Zijlstra
  2018-01-30 14:56   ` Tejun Heo
  2018-02-12 17:24 ` Tejun Heo
  2 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2018-01-30 10:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel, cgroups, Li Zefan, Johannes Weiner

On Mon, Jan 22, 2018 at 11:26:18AM -0800, Tejun Heo wrote:
> While adding cgroup2 interface for the cpu controller, 0d5936344f30
> ("sched: Implement interface for cgroup unified hierarchy") forgot to
> update input validation and left it to reject cpu.max config if any
> descendant has set a higher value.
> 
> cgroup2 officially supports delegation and a descendant must not be
> able to restrict what its ancestors can configure.  For absolute
> limits such as cpu.max and memory.max, this means that the config at
> each level should only act as the upper limit at that level and
> shouldn't interfere with what other cgroups can configure.

*blink* what?

afaiu the existing code does exactly the opposite, it forces the
descendants to configure less than the parent allows.

You're taking out an error condition and silently allowing descentant
misconfiguration. How does that make sense?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-01-30 10:21 ` Peter Zijlstra
@ 2018-01-30 14:56   ` Tejun Heo
  2018-02-01 16:49     ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2018-01-30 14:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, cgroups, Li Zefan, Johannes Weiner

Hello,

On Tue, Jan 30, 2018 at 11:21:56AM +0100, Peter Zijlstra wrote:
> afaiu the existing code does exactly the opposite, it forces the
> descendants to configure less than the parent allows.
> 
> You're taking out an error condition and silently allowing descentant
> misconfiguration. How does that make sense?

Well, they're upper limits, not strict allocations.  The current
behavior implemented by cpu isn't either a strict allocation or upper
limits.  It disallows a child from having a value higher than the
parent (allocation-ish) but the sum of the children is allowed to
exceed the parent's (limit-ish).

The combination is rather arbitrary and makes it impossible to
delegate safely (a delegatee can block the delegator from reducing the
amount resource allocated to the delegatee) while not really
protecting against overcommit from descendants either.

We had this sort of input validations in different controllers all in
their own ways.  In most cases, these aren't well thought out and we
can't support things like delegation without aligning controller
behaviors.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-01-30 14:56   ` Tejun Heo
@ 2018-02-01 16:49     ` Peter Zijlstra
  2018-02-01 19:57       ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2018-02-01 16:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, linux-kernel, cgroups, Li Zefan, Johannes Weiner

On Tue, Jan 30, 2018 at 06:56:39AM -0800, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jan 30, 2018 at 11:21:56AM +0100, Peter Zijlstra wrote:
> > afaiu the existing code does exactly the opposite, it forces the
> > descendants to configure less than the parent allows.
> > 
> > You're taking out an error condition and silently allowing descentant
> > misconfiguration. How does that make sense?
> 
> Well, they're upper limits, not strict allocations.  The current
> behavior implemented by cpu isn't either a strict allocation or upper
> limits.  It disallows a child from having a value higher than the
> parent (allocation-ish) but the sum of the children is allowed to
> exceed the parent's (limit-ish).

True; but its still weird to have the parent 'promise' something and
then retract that 'promise' later.

> The combination is rather arbitrary and makes it impossible to
> delegate safely (a delegatee can block the delegator from reducing the
> amount resource allocated to the delegatee) while not really
> protecting against overcommit from descendants either.
> 
> We had this sort of input validations in different controllers all in
> their own ways.  In most cases, these aren't well thought out and we
> can't support things like delegation without aligning controller
> behaviors.

I suppose.. 

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-02-01 16:49     ` Peter Zijlstra
@ 2018-02-01 19:57       ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2018-02-01 19:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, cgroups, Li Zefan, Johannes Weiner

Hello,

On Thu, Feb 01, 2018 at 05:49:42PM +0100, Peter Zijlstra wrote:
> > Well, they're upper limits, not strict allocations.  The current
> > behavior implemented by cpu isn't either a strict allocation or upper
> > limits.  It disallows a child from having a value higher than the
> > parent (allocation-ish) but the sum of the children is allowed to
> > exceed the parent's (limit-ish).
> 
> True; but its still weird to have the parent 'promise' something and
> then retract that 'promise' later.

Yeah, depending on how you look at it, it can feel weird.  It's just
that viewing these absolute resource limits (cpu.max,
memory.{high,max}, io.max, pids.max) as upper bounds seems to be the
best abstraction in terms of capturing what they do and making uses of
them in a robust way.

> > We had this sort of input validations in different controllers all in
> > their own ways.  In most cases, these aren't well thought out and we
> > can't support things like delegation without aligning controller
> > behaviors.
> 
> I suppose.. 
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Will route it through cgroup fixes branch in a week or so.

Thanks a lot.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors
  2018-01-22 19:26 [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors Tejun Heo
  2018-01-29 19:15 ` Tejun Heo
  2018-01-30 10:21 ` Peter Zijlstra
@ 2018-02-12 17:24 ` Tejun Heo
  2 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2018-02-12 17:24 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, cgroups, Li Zefan, Johannes Weiner

On Mon, Jan 22, 2018 at 11:26:18AM -0800, Tejun Heo wrote:
> While adding cgroup2 interface for the cpu controller, 0d5936344f30
> ("sched: Implement interface for cgroup unified hierarchy") forgot to
> update input validation and left it to reject cpu.max config if any
> descendant has set a higher value.
> 
> cgroup2 officially supports delegation and a descendant must not be
> able to restrict what its ancestors can configure.  For absolute
> limits such as cpu.max and memory.max, this means that the config at
> each level should only act as the upper limit at that level and
> shouldn't interfere with what other cgroups can configure.
> 
> This patch updates config validation on cgroup2 so that the cpu
> controller follows the same convention.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")

Applied to cgroup/for-4.16-fixes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-02-12 17:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-22 19:26 [PATCH v4.15-rc9] sched, cgroup: Don't reject lower cpu.max on ancestors Tejun Heo
2018-01-29 19:15 ` Tejun Heo
2018-01-30 10:21 ` Peter Zijlstra
2018-01-30 14:56   ` Tejun Heo
2018-02-01 16:49     ` Peter Zijlstra
2018-02-01 19:57       ` Tejun Heo
2018-02-12 17:24 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).