* Re: [PATCH] User namespace: don't allow sysctl in non-init user ns
From: Miquel van Smoorenburg @ 2011-09-21  9:46 UTC
  To: Serge E. Hallyn; +Cc: linux-kernel

On Thu, 2011-09-15 at 14:48 -0500, Serge E. Hallyn wrote:
> sysctl.c has its own custom uid check, which is not user namespace
> aware.  As discovered by Richard, that allows root in a container
> privileged access to set all sysctls.
> 
> To fix that, just refuse access if current is not in init_user_ns.  We
> may at some point want to relax that check so that some sysctls are
> allowed - for instance dmesg_restrict when syslog is containerized.
> 
> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Vasiliy Kulikov <segoon@openwall.com>
> Cc: richard@nod.at
> ---
>  kernel/sysctl.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 11d65b5..f2b42e2 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1697,6 +1697,8 @@ void register_sysctl_root(struct ctl_table_root *root)
>  
>  static int test_perm(int mode, int op)
>  {
> +	if (current_user_ns() != &init_user_ns)
> +		return -EACCES;
>  	if (!current_euid())
>  		mode >>= 6;
>  	else if (in_egroup_p(0))

I haven't tested it, but it looks like this denies access to /proc/sys
completely, right? Wouldn't it be better to make access read-only?
For example, glibc reads from several /proc/sys/... files for sysconf()
and pathconf(). That would fail with this patch, I think?
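
To illustrate the kind of read at stake, here is a minimal userspace
sketch. The helper name read_sysctl_long is hypothetical (it is not a
glibc function); it just opens a /proc/sys file and parses one integer,
which is roughly what a library would do to answer a sysconf() query.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of glibc): read one integer from a
 * /proc/sys file.  Returns 0 on success or a negative errno value.
 */
static int read_sysctl_long(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;	/* e.g. -EACCES if the kernel refuses the open */
	if (fscanf(f, "%ld", out) != 1)
		ret = -EINVAL;	/* file opened but did not hold an integer */
	fclose(f);
	return ret;
}
```

With the patch quoted above applied, a call such as
read_sysctl_long("/proc/sys/kernel/ngroups_max", &v) from a task in a
non-init user namespace would be expected to return -EACCES. The path
is only an example of a file glibc is understood to consult; it is an
assumption here, not something stated in this thread.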

Something like

--- a/kernel/sysctl.c.orig	2011-06-24 00:24:26.000000000 +0200
+++ b/kernel/sysctl.c	2011-09-21 11:40:42.961291629 +0200
@@ -1892,10 +1892,15 @@
 
 static int test_perm(int mode, int op)
 {
-	if (!current_euid())
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	if (current_user_ns() != &init_user_ns) {
+		if (op & MAY_WRITE)
+			return -EACCES;
+	} else {
+		if (!current_euid())
+			mode >>= 6;
+		else if (in_egroup_p(0))
+			mode >>= 3;
+	}
 	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
 		return 0;
 	return -EACCES;
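
The effect of the change above can be checked outside the kernel with a
small model. This is only a sketch: struct caller and
test_perm_readonly are hypothetical names that stand in for the state
the kernel reads from current, and the MAY_* values are copied from
include/linux/fs.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Same values as the kernel's MAY_* flags in include/linux/fs.h. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical stand-in for the state the kernel reads from current. */
struct caller {
	bool in_init_user_ns;	/* current_user_ns() == &init_user_ns */
	bool euid_is_root;	/* !current_euid() */
	bool in_group_zero;	/* in_egroup_p(0) */
};

/* Userspace model of test_perm() with the read-only change applied. */
static int test_perm_readonly(const struct caller *c, int mode, int op)
{
	if (!c->in_init_user_ns) {
		/* Outside the initial namespace: never allow writes. */
		if (op & MAY_WRITE)
			return -EACCES;
	} else {
		if (c->euid_is_root)
			mode >>= 6;	/* use the owner bits */
		else if (c->in_group_zero)
			mode >>= 3;	/* use the group bits */
	}
	/* Whatever class was selected, op must be a subset of its bits. */
	if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}
```

In this model, container root can read a 0644 sysctl (the "other" bits
allow it) but cannot write anything, including a 0666 file, while root
in the initial namespace keeps its owner-bit access.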

Mike.




* Re: [PATCH] User namespace: don't allow sysctl in non-init user ns
From: Serge E. Hallyn @ 2011-09-21 13:15 UTC
  To: Miquel van Smoorenburg; +Cc: linux-kernel

Quoting Miquel van Smoorenburg (mikevs@xs4all.net):
> On Thu, 2011-09-15 at 14:48 -0500, Serge E. Hallyn wrote:
> > sysctl.c has its own custom uid check, which is not user namespace
> > aware.  As discovered by Richard, that allows root in a container
> > privileged access to set all sysctls.
> > 
> > To fix that, just refuse access if current is not in init_user_ns.  We
> > may at some point want to relax that check so that some sysctls are
> > allowed - for instance dmesg_restrict when syslog is containerized.
> > 
> > Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
> > Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> > Cc: Vasiliy Kulikov <segoon@openwall.com>
> > Cc: richard@nod.at
> > ---
> >  kernel/sysctl.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 11d65b5..f2b42e2 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -1697,6 +1697,8 @@ void register_sysctl_root(struct ctl_table_root *root)
> >  
> >  static int test_perm(int mode, int op)
> >  {
> > +	if (current_user_ns() != &init_user_ns)
> > +		return -EACCES;
> >  	if (!current_euid())
> >  		mode >>= 6;
> >  	else if (in_egroup_p(0))
> 
> I haven't tested it, but it looks like this denies access to /proc/sys
> completely, right?

True.  Good point.

> Wouldn't it be better to make access read-only?

It'd be better, yes.  For the moment we're trying to focus on making sure
there are no leaks out of the user namespace, so that then we can start
relaxing.  But I think you're right, this is easy enough to do right
from the start.

> For example, glibc reads from several /proc/sys/... files for sysconf()
> and pathconf(). That would fail with this patch, I think?
> 
> Something like
> 
> --- a/kernel/sysctl.c.orig	2011-06-24 00:24:26.000000000 +0200
> +++ b/kernel/sysctl.c	2011-09-21 11:40:42.961291629 +0200
> @@ -1892,10 +1892,15 @@
>  
>  static int test_perm(int mode, int op)
>  {
> -	if (!current_euid())
> -		mode >>= 6;
> -	else if (in_egroup_p(0))
> -		mode >>= 3;
> +	if (current_user_ns() != &init_user_ns) {
> +		if (op & MAY_WRITE)
> +			return -EACCES;
> +	} else {
> +		if (!current_euid())
> +			mode >>= 6;
> +		else if (in_egroup_p(0))
> +			mode >>= 3;
> +	}
>  	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
>  		return 0;
>  	return -EACCES;

How about the following, slightly changed?  It tries to more precisely
follow the rule that access from another (non-ancestor) user-ns is
checked as access from the overflow userid (-1).

	if (current_user_ns() == &init_user_ns) {
		if (!current_euid())
			mode >>= 6;
		else if (in_egroup_p(0))
			mode >>= 3;
	}
	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
		return 0;
	return -EACCES;

Does that make sense?  Yes, it does mean that any world-writable
files are writable from a container, but those are the rules
we defined earlier.  (File creation, i.e. directory write, has to be
a special exception, but only because we don't have a valid owner to
assign to the new file.)
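
A userspace model makes the consequences of this variant easy to check.
It is a sketch only: struct caller and test_perm_v2 are hypothetical
names standing in for what the kernel reads from current, and the MAY_*
values are copied from include/linux/fs.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Same values as the kernel's MAY_* flags in include/linux/fs.h. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical stand-in for the state the kernel reads from current. */
struct caller {
	bool in_init_user_ns;	/* current_user_ns() == &init_user_ns */
	bool euid_is_root;	/* !current_euid() */
	bool in_group_zero;	/* in_egroup_p(0) */
};

/*
 * Userspace model of the variant above: owner and group bits are only
 * consulted for callers in the initial user namespace.  Everyone else
 * is checked against the "other" bits, like the overflow uid.
 */
static int test_perm_v2(const struct caller *c, int mode, int op)
{
	if (c->in_init_user_ns) {
		if (c->euid_is_root)
			mode >>= 6;	/* use the owner bits */
		else if (c->in_group_zero)
			mode >>= 3;	/* use the group bits */
	}
	if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}
```

In this model, container root may read a 0644 sysctl and write a 0666
one, but is refused write access to a 0644 file, whereas root in the
initial namespace can still write it through the owner bits.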

Thanks for pointing this out!

-serge


* [PATCH] User namespace: don't allow sysctl in non-init user ns (v2)
From: Serge E. Hallyn @ 2011-09-23  1:40 UTC
  To: Serge E. Hallyn
  Cc: Miquel van Smoorenburg, linux-kernel, Eric W. Biederman, richard, akpm

sysctl.c has its own custom uid check, which is not user namespace
aware.  As discovered by Richard, that allows root in a container
privileged access to set all sysctls.

To fix that, don't compare uid or groups if current is not in the
initial user namespace.  We may at some point want to relax that check
so that some sysctls are allowed - for instance dmesg_restrict when
syslog is containerized.

Changelog:
Sep 22: As Miquel van Smoorenburg pointed out, rather than always
	refusing access if not in initial user_ns, we should allow
	world access rights to sysctl files.  We just want to prevent
	a task in a non-init user namespace from getting the root user
	or group access rights.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: richard@nod.at
Cc: Miquel van Smoorenburg <mikevs@xs4all.net>
---
 kernel/sysctl.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 11d65b5..95988dc 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1697,10 +1697,12 @@ void register_sysctl_root(struct ctl_table_root *root)
 
 static int test_perm(int mode, int op)
 {
-	if (!current_euid())
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	if (current_user_ns() == &init_user_ns) {
+		if (!current_euid())
+			mode >>= 6;
+		else if (in_egroup_p(0))
+			mode >>= 3;
+	}
 	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
 		return 0;
 	return -EACCES;
-- 
1.7.0.4


