linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
@ 2013-07-23 18:16 Serge Hallyn
  2013-07-23 18:18 ` [RFC PATCH 2/2] capabilities: allow nice if we are privileged Serge Hallyn
  2013-07-23 18:30 ` [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Tejun Heo
  0 siblings, 2 replies; 14+ messages in thread
From: Serge Hallyn @ 2013-07-23 18:16 UTC (permalink / raw)
  To: ebiederm; +Cc: linux-kernel, containers

We allow a task to change its own devices cgroup, or to change other tasks'
cgroups if it has CAP_SYS_ADMIN.

Also allow task A to change task B's cgroup if task A has CAP_SYS_ADMIN
with respect to task B - meaning A is root in the same userns, or A
created B's userns.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
 security/device_cgroup.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 9760ecb6..8f5386ea 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -74,11 +74,14 @@ struct cgroup_subsys devices_subsys;
 static int devcgroup_can_attach(struct cgroup *new_cgrp,
 				struct cgroup_taskset *set)
 {
-	struct task_struct *task = cgroup_taskset_first(set);
+	struct task_struct *t = cgroup_taskset_first(set);
+	int ret = 0;
 
-	if (current != task && !capable(CAP_SYS_ADMIN))
-		return -EPERM;
-	return 0;
+	rcu_read_lock();
+	if (current != t && !ns_capable(__task_cred(t)->user_ns, CAP_SYS_ADMIN))
+		ret = -EPERM;
+	rcu_read_unlock();
+	return ret;
 }
 
 /*
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 2/2] capabilities: allow nice if we are privileged
  2013-07-23 18:16 [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Serge Hallyn
@ 2013-07-23 18:18 ` Serge Hallyn
  2013-07-24  8:07   ` Eric W. Biederman
  2013-07-23 18:30 ` [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Tejun Heo
  1 sibling, 1 reply; 14+ messages in thread
From: Serge Hallyn @ 2013-07-23 18:18 UTC (permalink / raw)
  To: ebiederm; +Cc: containers, linux-kernel

We allow task A to change B's nice level if it has a supserset of
B's privileges, or of it has CAP_SYS_NICE.  Also allow it if A has
CAP_SYS_NICE with respect to B - meaning it is root in the same
namespace, or it created B's namespace.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
 security/commoncap.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/commoncap.c b/security/commoncap.c
index d78b003..ef98b56 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -768,16 +768,16 @@ int cap_task_fix_setuid(struct cred *new, const struct cred *old, int flags)
  */
 static int cap_safe_nice(struct task_struct *p)
 {
-	int is_subset;
+	int is_subset, ret = 0;
 
 	rcu_read_lock();
 	is_subset = cap_issubset(__task_cred(p)->cap_permitted,
 				 current_cred()->cap_permitted);
+	if (!is_subset && !ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE))
+		ret = -EPERM;
 	rcu_read_unlock();
 
-	if (!is_subset && !capable(CAP_SYS_NICE))
-		return -EPERM;
-	return 0;
+	return ret;
 }
 
 /**
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 18:16 [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Serge Hallyn
  2013-07-23 18:18 ` [RFC PATCH 2/2] capabilities: allow nice if we are privileged Serge Hallyn
@ 2013-07-23 18:30 ` Tejun Heo
  2013-07-23 18:38   ` Serge Hallyn
  1 sibling, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-07-23 18:30 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: ebiederm, linux-kernel, containers

On Tue, Jul 23, 2013 at 01:16:06PM -0500, Serge Hallyn wrote:
> We allow a task to change its own devices cgroup, or to change other tasks'
> cgroups if it has CAP_SYS_ADMIN.
> 
> Also allow task A to change task B's cgroup if task A has CAP_SYS_ADMIN
> with respect to task B - meaning A is root in the same userns, or A
> created B's userns.

As discussed multpile times, cgroup isn't gonna support delegating
cgroup management directly into containers, so this doesn't really
jive with where we're heading.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 18:30 ` [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Tejun Heo
@ 2013-07-23 18:38   ` Serge Hallyn
  2013-07-23 18:50     ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Serge Hallyn @ 2013-07-23 18:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: ebiederm, linux-kernel, containers

Quoting Tejun Heo (tj@kernel.org):
> On Tue, Jul 23, 2013 at 01:16:06PM -0500, Serge Hallyn wrote:
> > We allow a task to change its own devices cgroup, or to change other tasks'
> > cgroups if it has CAP_SYS_ADMIN.
> > 
> > Also allow task A to change task B's cgroup if task A has CAP_SYS_ADMIN
> > with respect to task B - meaning A is root in the same userns, or A
> > created B's userns.
> 
> As discussed multpile times, cgroup isn't gonna support delegating
> cgroup management directly into containers, so this doesn't really
> jive with where we're heading.

This doesn't delegate it into the container.  It allows me, on the host,
to set the cgroup for a container.

thanks,
-serge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 18:38   ` Serge Hallyn
@ 2013-07-23 18:50     ` Tejun Heo
  2013-07-23 19:04       ` Serge Hallyn
  2013-10-23  0:41       ` Serge E. Hallyn
  0 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2013-07-23 18:50 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: Eric W. Biederman, lkml, Containers

On Tue, Jul 23, 2013 at 2:38 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> This doesn't delegate it into the container.  It allows me, on the host,
> to set the cgroup for a container.

Hmmm? I'm a bit confused. Isn't the description saying that the patch
allows pseudo-root in userns to change cgroup membership even if it
isn't actually root?

Besides, I find the whole check rather bogus and would actually much
prefer just nuking the check and just follow the standard permission
checks.

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 18:50     ` Tejun Heo
@ 2013-07-23 19:04       ` Serge Hallyn
  2013-07-23 19:12         ` Tejun Heo
  2013-10-23  0:41       ` Serge E. Hallyn
  1 sibling, 1 reply; 14+ messages in thread
From: Serge Hallyn @ 2013-07-23 19:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Eric W. Biederman, lkml, Containers

Quoting Tejun Heo (tj@kernel.org):
> On Tue, Jul 23, 2013 at 2:38 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> > This doesn't delegate it into the container.  It allows me, on the host,
> > to set the cgroup for a container.
> 
> Hmmm? I'm a bit confused. Isn't the description saying that the patch
> allows pseudo-root in userns to change cgroup membership even if it
> isn't actually root?

If task A is uid 1000 on the host, and creates task B as uid X in a new
user namespace, then task A, still being uid 1000 on the host, is
privileged with respect to B and his namespace - i.e.
ns_capable(B->userns, CAP_SYS_ADMIN) is true.

> Besides, I find the whole check rather bogus and would actually much
> prefer just nuking the check and just follow the standard permission
> checks.

I'd be ok with that - but there's one case I'm not sure about:  If PAM
sets me up with /sys/fs/cgroup/devices/serge owned by me, then if I'm
thinking right, removing can_attach would mean I could move init into
/sys/fs/cgroup/devices/serge...

Is there something else stopping that from happening?

-serge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 19:04       ` Serge Hallyn
@ 2013-07-23 19:12         ` Tejun Heo
  2013-07-23 19:28           ` Serge Hallyn
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-07-23 19:12 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: Eric W. Biederman, lkml, Containers

Hello,

On Tue, Jul 23, 2013 at 02:04:26PM -0500, Serge Hallyn wrote:
> If task A is uid 1000 on the host, and creates task B as uid X in a new
> user namespace, then task A, still being uid 1000 on the host, is
> privileged with respect to B and his namespace - i.e.
> ns_capable(B->userns, CAP_SYS_ADMIN) is true.

Well, that also is the exact type of priv delegation we're moving away
from, so....

> > Besides, I find the whole check rather bogus and would actually much
> > prefer just nuking the check and just follow the standard permission
> > checks.
> 
> I'd be ok with that - but there's one case I'm not sure about:  If PAM
> sets me up with /sys/fs/cgroup/devices/serge owned by me, then if I'm
> thinking right, removing can_attach would mean I could move init into
> /sys/fs/cgroup/devices/serge...
> 
> Is there something else stopping that from happening?

If PAM is giving out perms on cgroup directory, the whole system is
prone to DoS in various ways anyway.  It's already utterly broken, so
kinda moot point.  If there are people actually doing that in the
wild, we can conditionalize it on cgroup_sane_behavior().

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 19:12         ` Tejun Heo
@ 2013-07-23 19:28           ` Serge Hallyn
  2013-07-23 19:39             ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Serge Hallyn @ 2013-07-23 19:28 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Eric W. Biederman, lkml, Containers

Quoting Tejun Heo (tj@kernel.org):
> Hello,
> 
> On Tue, Jul 23, 2013 at 02:04:26PM -0500, Serge Hallyn wrote:
> > If task A is uid 1000 on the host, and creates task B as uid X in a new
> > user namespace, then task A, still being uid 1000 on the host, is
> > privileged with respect to B and his namespace - i.e.
> > ns_capable(B->userns, CAP_SYS_ADMIN) is true.
> 
> Well, that also is the exact type of priv delegation we're moving away
> from, so....

I think that's unreasonable, but I guess I'll have to go reread the
old thread.

> > > Besides, I find the whole check rather bogus and would actually much
> > > prefer just nuking the check and just follow the standard permission
> > > checks.
> > 
> > I'd be ok with that - but there's one case I'm not sure about:  If PAM
> > sets me up with /sys/fs/cgroup/devices/serge owned by me, then if I'm
> > thinking right, removing can_attach would mean I could move init into
> > /sys/fs/cgroup/devices/serge...
> > 
> > Is there something else stopping that from happening?
> 
> If PAM is giving out perms on cgroup directory, the whole system is
> prone to DoS in various ways anyway.  It's already utterly broken, so

If we have decent enforcement of hierarchy for devices.{allow,deny},
which we now do, then I don't see why this has to be the case.

> kinda moot point.  If there are people actually doing that in the
> wild, we can conditionalize it on cgroup_sane_behavior().

Guess we'll stop using cgroups for now.

-serge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 19:28           ` Serge Hallyn
@ 2013-07-23 19:39             ` Tejun Heo
  2013-11-04 21:51               ` Serge E. Hallyn
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2013-07-23 19:39 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: Eric W. Biederman, lkml, Containers

Hello, Serge.

On Tue, Jul 23, 2013 at 3:28 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
>> Well, that also is the exact type of priv delegation we're moving away
>> from, so....
>
> I think that's unreasonable, but I guess I'll have to go reread the
> old thread.

Yeah, please do. I think the case is pretty strong for disallowing
delegation of cgroup directories to !root (or whatever CAP it should
be) users. It's inherently unsafe for some controllers and ends up
leaking kernel implementation details into regular binaries thus
cementing those leaked details as APIs, which is a giant no-no.

> If we have decent enforcement of hierarchy for devices.{allow,deny},
> which we now do, then I don't see why this has to be the case.

If you think about devcg in isolation, maybe, but please keep in mind
that devcg itself is already somewhat abusing cgroup and many other
controllers shouldn't be allowed to delegate and we're headed for one
unified hierarchy.

>> kinda moot point.  If there are people actually doing that in the
>> wild, we can conditionalize it on cgroup_sane_behavior().
>
> Guess we'll stop using cgroups for now.

If you're delegating cgroup accesses to !root users, yes, STOP,
please. It'd be doing a lot more harm to the whole kernel and its
maintainability than whatever extra features / benefits it may be
bringing.

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 2/2] capabilities: allow nice if we are privileged
  2013-07-23 18:18 ` [RFC PATCH 2/2] capabilities: allow nice if we are privileged Serge Hallyn
@ 2013-07-24  8:07   ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2013-07-24  8:07 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: containers, linux-kernel

Serge Hallyn <serge.hallyn@ubuntu.com> writes:

> We allow task A to change B's nice level if it has a supserset of
> B's privileges, or of it has CAP_SYS_NICE.  Also allow it if A has
> CAP_SYS_NICE with respect to B - meaning it is root in the same
> namespace, or it created B's namespace.

This patch looks good to me.

We already have this logic elsewhere in the kernel so I don't expect
this will make things worse.

Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>

> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
>  security/commoncap.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/security/commoncap.c b/security/commoncap.c
> index d78b003..ef98b56 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -768,16 +768,16 @@ int cap_task_fix_setuid(struct cred *new, const struct cred *old, int flags)
>   */
>  static int cap_safe_nice(struct task_struct *p)
>  {
> -	int is_subset;
> +	int is_subset, ret = 0;
>  
>  	rcu_read_lock();
>  	is_subset = cap_issubset(__task_cred(p)->cap_permitted,
>  				 current_cred()->cap_permitted);
> +	if (!is_subset && !ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE))
> +		ret = -EPERM;
>  	rcu_read_unlock();
>  
> -	if (!is_subset && !capable(CAP_SYS_NICE))
> -		return -EPERM;
> -	return 0;
> +	return ret;
>  }
>  
>  /**

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 18:50     ` Tejun Heo
  2013-07-23 19:04       ` Serge Hallyn
@ 2013-10-23  0:41       ` Serge E. Hallyn
  2013-10-24 10:57         ` Tejun Heo
  1 sibling, 1 reply; 14+ messages in thread
From: Serge E. Hallyn @ 2013-10-23  0:41 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Serge Hallyn, Containers, Eric W. Biederman, lkml

Quoting Tejun Heo (tj@kernel.org):
> On Tue, Jul 23, 2013 at 2:38 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> > This doesn't delegate it into the container.  It allows me, on the host,
> > to set the cgroup for a container.
> 
> Hmmm? I'm a bit confused. Isn't the description saying that the patch
> allows pseudo-root in userns to change cgroup membership even if it
> isn't actually root?
> 
> Besides, I find the whole check rather bogus and would actually much
> prefer just nuking the check and just follow the standard permission
> checks.

Can we please nuke it like this then?

>From b840083ec8fa1f0645ae925c79db3dc51edd019c Mon Sep 17 00:00:00 2001
From: Serge Hallyn <serge.hallyn@ubuntu.com>
Date: Wed, 23 Oct 2013 01:34:00 +0200
Subject: [PATCH 1/1] device_cgroup: remove can_attach

It is really only wanting to duplicate a check which is already done by the
cgroup subsystem.

With this patch, user jdoe still cannot move pid 1 into a devices cgroup
he owns, but now he can move his own other tasks into devices cgroups.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
---
 security/device_cgroup.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index efc6f68..a37c054 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -64,16 +64,6 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 
 struct cgroup_subsys devices_subsys;
 
-static int devcgroup_can_attach(struct cgroup_subsys_state *new_css,
-				struct cgroup_taskset *set)
-{
-	struct task_struct *task = cgroup_taskset_first(set);
-
-	if (current != task && !capable(CAP_SYS_ADMIN))
-		return -EPERM;
-	return 0;
-}
-
 /*
  * called under devcgroup_mutex
  */
@@ -698,7 +688,6 @@ static struct cftype dev_cgroup_files[] = {
 
 struct cgroup_subsys devices_subsys = {
 	.name = "devices",
-	.can_attach = devcgroup_can_attach,
 	.css_alloc = devcgroup_css_alloc,
 	.css_free = devcgroup_css_free,
 	.css_online = devcgroup_online,
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-10-23  0:41       ` Serge E. Hallyn
@ 2013-10-24 10:57         ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2013-10-24 10:57 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Serge Hallyn, Containers, Eric W. Biederman, lkml

On Wed, Oct 23, 2013 at 12:41:30AM +0000, Serge E. Hallyn wrote:
> Quoting Tejun Heo (tj@kernel.org):
> > On Tue, Jul 23, 2013 at 2:38 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> > > This doesn't delegate it into the container.  It allows me, on the host,
> > > to set the cgroup for a container.
> > 
> > Hmmm? I'm a bit confused. Isn't the description saying that the patch
> > allows pseudo-root in userns to change cgroup membership even if it
> > isn't actually root?
> > 
> > Besides, I find the whole check rather bogus and would actually much
> > prefer just nuking the check and just follow the standard permission
> > checks.
> 
> Can we please nuke it like this then?
> 
> From b840083ec8fa1f0645ae925c79db3dc51edd019c Mon Sep 17 00:00:00 2001
> From: Serge Hallyn <serge.hallyn@ubuntu.com>
> Date: Wed, 23 Oct 2013 01:34:00 +0200
> Subject: [PATCH 1/1] device_cgroup: remove can_attach
> 
> It is really only wanting to duplicate a check which is already done by the
> cgroup subsystem.
> 
> With this patch, user jdoe still cannot move pid 1 into a devices cgroup
> he owns, but now he can move his own other tasks into devices cgroups.
> 
> Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Tejun Heo <tj@kernel.org>

Applied to cgroup/for-3.13.  Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-07-23 19:39             ` Tejun Heo
@ 2013-11-04 21:51               ` Serge E. Hallyn
  2013-11-04 22:06                 ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Serge E. Hallyn @ 2013-11-04 21:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Serge Hallyn, Containers, Eric W. Biederman, lkml, Victor Marmol,
	Stéphane Graber, Rohit Jnagal

Quoting Tejun Heo (tj@kernel.org):
> Hello, Serge.
> 
> On Tue, Jul 23, 2013 at 3:28 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> > On Tue, Jul 23, 2013 at 02:04:26PM -0500, Serge Hallyn wrote:
> > > If task A is uid 1000 on the host, and creates task B as uid X in a new
> > > user namespace, then task A, still being uid 1000 on the host, is
> > > privileged with respect to B and his namespace - i.e.
> > > ns_capable(B->userns, CAP_SYS_ADMIN) is true.
> 
> >> Well, that also is the exact type of priv delegation we're moving away
> >> from, so....
> >
> > I think that's unreasonable, but I guess I'll have to go reread the
> > old thread.
> 
> Yeah, please do. I think the case is pretty strong for disallowing
> delegation of cgroup directories to !root (or whatever CAP it should
> be) users. It's inherently unsafe for some controllers and ends up
> leaking kernel implementation details into regular binaries thus
> cementing those leaked details as APIs, which is a giant no-no.

Hi Tejun,

So to come back to this...  At plumbers, I believe you were not present
at the 'let me contain that for you' session.  There the "delegation is
not safe" topic came up.  When pressed, the only example that came up
was (IIRC) having to do with an rt scheduling issue which, all agreed,
was a bug in the implementation which should be fixed, rather than a valid
reason to say that delegation of cgroups should be fundamentally not supported.

Do you have a list of such issues which you see with delegation?  That is,
cases where, if ownership of a subtree is granted to a non-root user,
that user can affect tasks owned by other users who are in other
cgroups?

-serge

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable
  2013-11-04 21:51               ` Serge E. Hallyn
@ 2013-11-04 22:06                 ` Tejun Heo
  0 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2013-11-04 22:06 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Serge Hallyn, Containers, Eric W. Biederman, lkml, Victor Marmol,
	Stéphane Graber, Rohit Jnagal

Hello,

On Mon, Nov 04, 2013 at 09:51:35PM +0000, Serge E. Hallyn wrote:
> Do you have a list of such issues which you see with delegation?  That is,
> cases where, if ownership of a subtree is granted to a non-root user,
> that user can affect tasks owned by other users who are in other
> cgroups?

A lot of security is about logistics and cgroup simply doesn't have
them - depth, number of cgroups quota, even config changes or
subdirectory operations which involve RCU operations can easily be
used for DoS attacks.  Just think about how much complexity and effort
need to be spent on making and maintaining anything properly
delegatable to !priv users.  cgroup has never spent such design or
implementation effort - e.g. take a look at how event_control thing is
implemented, it's extremely easy to trigger OOM if you give that out
to !priv users.

cgroup has *never* been safe to give out to !priv users and it is
highly unlikely to be in any foreseeable future.  It will be a big new
giant feature which I frankly don't think is worth the risk or effort.
Think of it as giving out sysctl or firewall rule control to !priv
users.  Giving out subset of those controls do make sense in terms of
function but we don't do that and don't have infrastructure to support
such usage.  cgroup at this stage isn't that different.  If you insist
on doing that, you can but it is severely compromising in terms of
security and it'll stay that way for the foreseeable future.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-11-04 22:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-23 18:16 [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Serge Hallyn
2013-07-23 18:18 ` [RFC PATCH 2/2] capabilities: allow nice if we are privileged Serge Hallyn
2013-07-24  8:07   ` Eric W. Biederman
2013-07-23 18:30 ` [RFC PATCH 1/2] devices cgroup: allow can_attach() if ns_capable Tejun Heo
2013-07-23 18:38   ` Serge Hallyn
2013-07-23 18:50     ` Tejun Heo
2013-07-23 19:04       ` Serge Hallyn
2013-07-23 19:12         ` Tejun Heo
2013-07-23 19:28           ` Serge Hallyn
2013-07-23 19:39             ` Tejun Heo
2013-11-04 21:51               ` Serge E. Hallyn
2013-11-04 22:06                 ` Tejun Heo
2013-10-23  0:41       ` Serge E. Hallyn
2013-10-24 10:57         ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).