[PATCH] ptrace: being capable wrt a process requires mapped uids/gids

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
@ 2015-12-12 20:12 Jann Horn
  2015-12-15  0:26 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jann Horn @ 2015-12-12 20:12 UTC (permalink / raw)
  To: Roland McGrath, Oleg Nesterov
  Cc: linux-kernel, security, Serge Hallyn, Jann Horn

ptrace_has_cap() checks whether the current process should be
treated as having a certain capability for ptrace checks
against another process. Until now, this was equivalent to
has_ns_capability(current, target_ns, CAP_SYS_PTRACE).

However, if a root-owned process wants to enter a user
namespace for some reason without knowing who owns it and
therefore can't change to the namespace owner's uid and gid
before entering, as soon as it has entered the namespace,
the namespace owner can attach to it via ptrace and thereby
gain access to its uid and gid.

While it is possible for the entering process to switch to
the uid of a claimed namespace owner before entering,
causing the attempt to enter to fail if the claimed uid is
wrong, this doesn't solve the problem of determining an
appropriate gid.

With this change, the entering process can first enter the
namespace and then safely inspect the namespace's
properties, e.g. through /proc/self/{uid_map,gid_map},
assuming that the namespace owner doesn't have access to
uid 0.

Signed-off-by: Jann Horn <jann@thejh.net>
---
 kernel/ptrace.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index b760bae..c27770d 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -207,12 +207,32 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 	return ret;
 }
 
-static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
+static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
 {
+	struct user_namespace *tns = tcred->user_ns;
+	struct user_namespace *curns = current_cred()->user_ns;
+
+	/* When a root-owned process enters a user namespace created by a
+	 * malicious user, the user shouldn't be able to execute code under
+	 * uid 0 by attaching to the root-owned process via ptrace.
+	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
+	 * verify that all the uids and gids of the target process are
+	 * mapped into the current namespace.
+	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
+	 * either.
+	 */
+	if (!kuid_has_mapping(curns, tcred->euid) ||
+			!kuid_has_mapping(curns, tcred->suid) ||
+			!kuid_has_mapping(curns, tcred->uid)  ||
+			!kgid_has_mapping(curns, tcred->egid) ||
+			!kgid_has_mapping(curns, tcred->sgid) ||
+			!kgid_has_mapping(curns, tcred->gid))
+		return false;
+
 	if (mode & PTRACE_MODE_NOAUDIT)
-		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
+		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
 	else
-		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
+		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
 }
 
 /* Returns 0 on success, -errno on denial. */
@@ -241,7 +261,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	    gid_eq(cred->gid, tcred->sgid) &&
 	    gid_eq(cred->gid, tcred->gid))
 		goto ok;
-	if (ptrace_has_cap(tcred->user_ns, mode))
+	if (ptrace_has_cap(tcred, mode))
 		goto ok;
 	rcu_read_unlock();
 	return -EPERM;
@@ -252,7 +272,7 @@ ok:
 		dumpable = get_dumpable(task->mm);
 	rcu_read_lock();
 	if (dumpable != SUID_DUMP_USER &&
-	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
+	    !ptrace_has_cap(__task_cred(task), mode)) {
 		rcu_read_unlock();
 		return -EPERM;
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-12 20:12 [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Jann Horn
@ 2015-12-15  0:26 ` Andy Lutomirski
  2015-12-17 18:59 ` Serge E. Hallyn
  2015-12-26  1:10 ` Jann Horn
  2 siblings, 0 replies; 17+ messages in thread
From: Andy Lutomirski @ 2015-12-15  0:26 UTC (permalink / raw)
  To: Jann Horn, Eric W. Biederman
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security, Serge Hallyn

On Sat, Dec 12, 2015 at 12:12 PM, Jann Horn <jann@thejh.net> wrote:
> ptrace_has_cap() checks whether the current process should be
> treated as having a certain capability for ptrace checks
> against another process. Until now, this was equivalent to
> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>
> However, if a root-owned process wants to enter a user
> namespace for some reason without knowing who owns it and
> therefore can't change to the namespace owner's uid and gid
> before entering, as soon as it has entered the namespace,
> the namespace owner can attach to it via ptrace and thereby
> gain access to its uid and gid.
>
> While it is possible for the entering process to switch to
> the uid of a claimed namespace owner before entering,
> causing the attempt to enter to fail if the claimed uid is
> wrong, this doesn't solve the problem of determining an
> appropriate gid.
>
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.
>
> Signed-off-by: Jann Horn <jann@thejh.net>

The analysis and the patch both look correct.

Reviewed-by: Andy Lutomirski <luto@kernel.org>

Eric, care to opine?

--Andy

> ---
>  kernel/ptrace.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index b760bae..c27770d 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -207,12 +207,32 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>         return ret;
>  }
>
> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +       struct user_namespace *tns = tcred->user_ns;
> +       struct user_namespace *curns = current_cred()->user_ns;
> +
> +       /* When a root-owned process enters a user namespace created by a
> +        * malicious user, the user shouldn't be able to execute code under
> +        * uid 0 by attaching to the root-owned process via ptrace.
> +        * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +        * verify that all the uids and gids of the target process are
> +        * mapped into the current namespace.
> +        * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +        * either.
> +        */
> +       if (!kuid_has_mapping(curns, tcred->euid) ||
> +                       !kuid_has_mapping(curns, tcred->suid) ||
> +                       !kuid_has_mapping(curns, tcred->uid)  ||
> +                       !kgid_has_mapping(curns, tcred->egid) ||
> +                       !kgid_has_mapping(curns, tcred->sgid) ||
> +                       !kgid_has_mapping(curns, tcred->gid))
> +               return false;
> +
>         if (mode & PTRACE_MODE_NOAUDIT)
> -               return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +               return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>         else
> -               return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +               return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }
>
>  /* Returns 0 on success, -errno on denial. */
> @@ -241,7 +261,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>             gid_eq(cred->gid, tcred->sgid) &&
>             gid_eq(cred->gid, tcred->gid))
>                 goto ok;
> -       if (ptrace_has_cap(tcred->user_ns, mode))
> +       if (ptrace_has_cap(tcred, mode))
>                 goto ok;
>         rcu_read_unlock();
>         return -EPERM;
> @@ -252,7 +272,7 @@ ok:
>                 dumpable = get_dumpable(task->mm);
>         rcu_read_lock();
>         if (dumpable != SUID_DUMP_USER &&
> -           !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> +           !ptrace_has_cap(__task_cred(task), mode)) {
>                 rcu_read_unlock();
>                 return -EPERM;
>         }
> --
> 2.1.4
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-12 20:12 [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Jann Horn
  2015-12-15  0:26 ` Andy Lutomirski
@ 2015-12-17 18:59 ` Serge E. Hallyn
  2015-12-26  1:10 ` Jann Horn
  2 siblings, 0 replies; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-17 18:59 UTC (permalink / raw)
  To: Jann Horn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security, Serge Hallyn

On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> ptrace_has_cap() checks whether the current process should be
> treated as having a certain capability for ptrace checks
> against another process. Until now, this was equivalent to
> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
> 
> However, if a root-owned process wants to enter a user
> namespace for some reason without knowing who owns it and
> therefore can't change to the namespace owner's uid and gid
> before entering, as soon as it has entered the namespace,
> the namespace owner can attach to it via ptrace and thereby
> gain access to its uid and gid.
> 
> While it is possible for the entering process to switch to
> the uid of a claimed namespace owner before entering,
> causing the attempt to enter to fail if the claimed uid is
> wrong, this doesn't solve the problem of determining an
> appropriate gid.
> 
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.
> 
> Signed-off-by: Jann Horn <jann@thejh.net>

Thanks, Jann!  This is indeed pretty high priority.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>

> ---
>  kernel/ptrace.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index b760bae..c27770d 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -207,12 +207,32 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>  	return ret;
>  }
>  
> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +	struct user_namespace *tns = tcred->user_ns;
> +	struct user_namespace *curns = current_cred()->user_ns;
> +
> +	/* When a root-owned process enters a user namespace created by a
> +	 * malicious user, the user shouldn't be able to execute code under
> +	 * uid 0 by attaching to the root-owned process via ptrace.
> +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +	 * verify that all the uids and gids of the target process are
> +	 * mapped into the current namespace.
> +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +	 * either.
> +	 */
> +	if (!kuid_has_mapping(curns, tcred->euid) ||
> +			!kuid_has_mapping(curns, tcred->suid) ||
> +			!kuid_has_mapping(curns, tcred->uid)  ||
> +			!kgid_has_mapping(curns, tcred->egid) ||
> +			!kgid_has_mapping(curns, tcred->sgid) ||
> +			!kgid_has_mapping(curns, tcred->gid))
> +		return false;
> +
>  	if (mode & PTRACE_MODE_NOAUDIT)
> -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>  	else
> -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }
>  
>  /* Returns 0 on success, -errno on denial. */
> @@ -241,7 +261,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  	    gid_eq(cred->gid, tcred->sgid) &&
>  	    gid_eq(cred->gid, tcred->gid))
>  		goto ok;
> -	if (ptrace_has_cap(tcred->user_ns, mode))
> +	if (ptrace_has_cap(tcred, mode))
>  		goto ok;
>  	rcu_read_unlock();
>  	return -EPERM;
> @@ -252,7 +272,7 @@ ok:
>  		dumpable = get_dumpable(task->mm);
>  	rcu_read_lock();
>  	if (dumpable != SUID_DUMP_USER &&
> -	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> +	    !ptrace_has_cap(__task_cred(task), mode)) {
>  		rcu_read_unlock();
>  		return -EPERM;
>  	}
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-12 20:12 [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Jann Horn
  2015-12-15  0:26 ` Andy Lutomirski
  2015-12-17 18:59 ` Serge E. Hallyn
@ 2015-12-26  1:10 ` Jann Horn
  2015-12-26  2:52   ` Jann Horn
  2015-12-26 20:23   ` Serge E. Hallyn
  2 siblings, 2 replies; 17+ messages in thread
From: Jann Horn @ 2015-12-26  1:10 UTC (permalink / raw)
  To: Roland McGrath, Oleg Nesterov
  Cc: linux-kernel, security, Serge Hallyn, Andy Lutomirski, Eric W. Biederman

[-- Attachment #1: Type: text/plain, Size: 2074 bytes --]

On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.

Actually, I think I missed something there. Well, at least it
should not directly lead to a container escape.


> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +	struct user_namespace *tns = tcred->user_ns;
> +	struct user_namespace *curns = current_cred()->user_ns;
> +
> +	/* When a root-owned process enters a user namespace created by a
> +	 * malicious user, the user shouldn't be able to execute code under
> +	 * uid 0 by attaching to the root-owned process via ptrace.
> +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +	 * verify that all the uids and gids of the target process are
> +	 * mapped into the current namespace.
> +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +	 * either.
> +	 */
> +	if (!kuid_has_mapping(curns, tcred->euid) ||
> +			!kuid_has_mapping(curns, tcred->suid) ||
> +			!kuid_has_mapping(curns, tcred->uid)  ||
> +			!kgid_has_mapping(curns, tcred->egid) ||
> +			!kgid_has_mapping(curns, tcred->sgid) ||
> +			!kgid_has_mapping(curns, tcred->gid))
> +		return false;
> +
>  	if (mode & PTRACE_MODE_NOAUDIT)
> -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>  	else
> -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }

If the namespace owner can run code in the init namespace, the kuids are
mapped into curns but he is still capable wrt the target namespace.

I think a proper fix should first determine the highest parent of
tcred->user_ns in which the caller still has privs, then do the
kxid_has_mapping() checks in there.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26  1:10 ` Jann Horn
@ 2015-12-26  2:52   ` Jann Horn
  2015-12-26 21:17     ` Serge E. Hallyn
  2015-12-26 21:51     ` Serge E. Hallyn
  2015-12-26 20:23   ` Serge E. Hallyn
  1 sibling, 2 replies; 17+ messages in thread
From: Jann Horn @ 2015-12-26  2:52 UTC (permalink / raw)
  To: Roland McGrath, Oleg Nesterov
  Cc: linux-kernel, security, Serge Hallyn, Andy Lutomirski, Eric W. Biederman

ptrace_has_cap() checks whether the current process should be
treated as having a certain capability for ptrace checks
against another process. Until now, this was equivalent to
has_ns_capability(current, target_ns, CAP_SYS_PTRACE).

However, if a root-owned process wants to enter a user
namespace for some reason without knowing who owns it and
therefore can't change to the namespace owner's uid and gid
before entering, as soon as it has entered the namespace,
the namespace owner can attach to it via ptrace and thereby
gain access to its uid and gid.

While it is possible for the entering process to switch to
the uid of a claimed namespace owner before entering,
causing the attempt to enter to fail if the claimed uid is
wrong, this doesn't solve the problem of determining an
appropriate gid.

With this change, the entering process can first enter the
namespace and then safely inspect the namespace's
properties, e.g. through /proc/self/{uid_map,gid_map},
assuming that the namespace owner doesn't have access to
uid 0.

Changed in v2: The caller needs to be capable in the
namespace into which tcred's uids/gids can be mapped.

Signed-off-by: Jann Horn <jann@thejh.net>
---
 kernel/ptrace.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index b760bae..260a08d 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -20,6 +20,7 @@
 #include <linux/uio.h>
 #include <linux/audit.h>
 #include <linux/pid_namespace.h>
+#include <linux/user_namespace.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
 #include <linux/regset.h>
@@ -207,12 +208,34 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 	return ret;
 }
 
-static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
+static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
 {
+	struct user_namespace *tns = tcred->user_ns;
+
+	/* When a root-owned process enters a user namespace created by a
+	 * malicious user, the user shouldn't be able to execute code under
+	 * uid 0 by attaching to the root-owned process via ptrace.
+	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
+	 * verify that all the uids and gids of the target process are
+	 * mapped into a namespace below the current one in which the caller
+	 * is capable.
+	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
+	 * either.
+	 */
+	while (
+	    !kuid_has_mapping(tns, tcred->euid) ||
+	    !kuid_has_mapping(tns, tcred->suid) ||
+	    !kuid_has_mapping(tns, tcred->uid)  ||
+	    !kgid_has_mapping(tns, tcred->egid) ||
+	    !kgid_has_mapping(tns, tcred->sgid) ||
+	    !kgid_has_mapping(tns, tcred->gid)) {
+		tns = tns->parent;
+	}
+
 	if (mode & PTRACE_MODE_NOAUDIT)
-		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
+		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
 	else
-		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
+		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
 }
 
 /* Returns 0 on success, -errno on denial. */
@@ -241,7 +264,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	    gid_eq(cred->gid, tcred->sgid) &&
 	    gid_eq(cred->gid, tcred->gid))
 		goto ok;
-	if (ptrace_has_cap(tcred->user_ns, mode))
+	if (ptrace_has_cap(tcred, mode))
 		goto ok;
 	rcu_read_unlock();
 	return -EPERM;
@@ -252,7 +275,7 @@ ok:
 		dumpable = get_dumpable(task->mm);
 	rcu_read_lock();
 	if (dumpable != SUID_DUMP_USER &&
-	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
+	    !ptrace_has_cap(__task_cred(task), mode)) {
 		rcu_read_unlock();
 		return -EPERM;
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26  1:10 ` Jann Horn
  2015-12-26  2:52   ` Jann Horn
@ 2015-12-26 20:23   ` Serge E. Hallyn
  2015-12-26 20:55     ` Jann Horn
  1 sibling, 1 reply; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-26 20:23 UTC (permalink / raw)
  To: Jann Horn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Andy Lutomirski, Eric W. Biederman

On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote:
> On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> > With this change, the entering process can first enter the
> > namespace and then safely inspect the namespace's
> > properties, e.g. through /proc/self/{uid_map,gid_map},
> > assuming that the namespace owner doesn't have access to
> > uid 0.
> 
> Actually, I think I missed something there. Well, at least it
> should not directly lead to a container escape.
> 
> 
> > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> >  {
> > +	struct user_namespace *tns = tcred->user_ns;
> > +	struct user_namespace *curns = current_cred()->user_ns;
> > +
> > +	/* When a root-owned process enters a user namespace created by a
> > +	 * malicious user, the user shouldn't be able to execute code under
> > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > +	 * verify that all the uids and gids of the target process are
> > +	 * mapped into the current namespace.
> > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > +	 * either.
> > +	 */
> > +	if (!kuid_has_mapping(curns, tcred->euid) ||
> > +			!kuid_has_mapping(curns, tcred->suid) ||
> > +			!kuid_has_mapping(curns, tcred->uid)  ||
> > +			!kgid_has_mapping(curns, tcred->egid) ||
> > +			!kgid_has_mapping(curns, tcred->sgid) ||
> > +			!kgid_has_mapping(curns, tcred->gid))
> > +		return false;
> > +
> >  	if (mode & PTRACE_MODE_NOAUDIT)
> > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> >  	else
> > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> >  }
> 
> If the namespace owner can run code in the init namespace, the kuids are
> mapped into curns but he is still capable wrt the target namespace.
> 
> I think a proper fix should first determine the highest parent of
> tcred->user_ns in which the caller still has privs, then do the
> kxid_has_mapping() checks in there.

Hi,

I don't quite follow what you are concerned about.  Based on the new
patch you sent, I assume it's not the case where the tcred's kuid is
actually mapped into the container.  So is it the case where I
unshare a userns which unshares a userns, then setns from the grandparent
into the child?  And if so, the concern is that if the setns()ing task's
kuid is mappable all along into the grandhild, then container root should
be able to ptrace it?

thanks,
-serge

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26 20:23   ` Serge E. Hallyn
@ 2015-12-26 20:55     ` Jann Horn
  2015-12-26 21:08       ` Serge E. Hallyn
  0 siblings, 1 reply; 17+ messages in thread
From: Jann Horn @ 2015-12-26 20:55 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Andy Lutomirski, Eric W. Biederman

[-- Attachment #1: Type: text/plain, Size: 4625 bytes --]

On Sat, Dec 26, 2015 at 02:23:45PM -0600, Serge E. Hallyn wrote:
> On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote:
> > On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> > > With this change, the entering process can first enter the
> > > namespace and then safely inspect the namespace's
> > > properties, e.g. through /proc/self/{uid_map,gid_map},
> > > assuming that the namespace owner doesn't have access to
> > > uid 0.
> > 
> > Actually, I think I missed something there. Well, at least it
> > should not directly lead to a container escape.
> > 
> > 
> > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> > >  {
> > > +	struct user_namespace *tns = tcred->user_ns;
> > > +	struct user_namespace *curns = current_cred()->user_ns;
> > > +
> > > +	/* When a root-owned process enters a user namespace created by a
> > > +	 * malicious user, the user shouldn't be able to execute code under
> > > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > > +	 * verify that all the uids and gids of the target process are
> > > +	 * mapped into the current namespace.
> > > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > > +	 * either.
> > > +	 */
> > > +	if (!kuid_has_mapping(curns, tcred->euid) ||
> > > +			!kuid_has_mapping(curns, tcred->suid) ||
> > > +			!kuid_has_mapping(curns, tcred->uid)  ||
> > > +			!kgid_has_mapping(curns, tcred->egid) ||
> > > +			!kgid_has_mapping(curns, tcred->sgid) ||
> > > +			!kgid_has_mapping(curns, tcred->gid))
> > > +		return false;
> > > +
> > >  	if (mode & PTRACE_MODE_NOAUDIT)
> > > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> > >  	else
> > > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> > >  }
> > 
> > If the namespace owner can run code in the init namespace, the kuids are
> > mapped into curns but he is still capable wrt the target namespace.
> > 
> > I think a proper fix should first determine the highest parent of
> > tcred->user_ns in which the caller still has privs, then do the
> > kxid_has_mapping() checks in there.
> 
> Hi,
> 
> I don't quite follow what you are concerned about.  Based on the new
> patch you sent, I assume it's not the case where the tcred's kuid is
> actually mapped into the container.  So is it the case where I
> unshare a userns which unshares a userns, then setns from the grandparent
> into the child?  And if so, the concern is that if the setns()ing task's
> kuid is mappable all along into the grandhild, then container root should
> be able to ptrace it?

Consider the following scenario:

init_user_ns has a child namespace (I'll call it child_ns).
child_ns is owned by an attacker (child_ns->owner == attacker_kuid).
The attacking process has current_cred()->euid == attacker_kuid and lives
in init_user_ns (which means it's capable in child_ns).

The victim process (with euid==0) enters the namespace, then the attacking
process tries to attach to it. ptrace_has_cap(tcred, mode) would, with my
old patch, still be true because current is capable in tcred->user_ns and
all kuids are mapped into init_user_ns.

The new patch uses the following rule for cap checks:

    The caller is capable relative to a target if a user namespace exists
    for which all of the following conditions are true:

     - The caller has the requested capability inside the namespace.
     - The target is inside the namespace (either directly or in a child).
     - The target's kuids and kgids are mapped into the namespace.

The first rule is enforced by the has_ns_capability(..., tns, ...) or
has_ns_capability_noaudit(..., tns, ...) call at the bottom.

The second rule is implicitly true because tns initially is the target's
user namespace and then moves up through ->parent.

The third rule is enforced by the while loop.


This prevents the attack I described, but e.g. still allows someone who
is capable in init_user_ns to ptrace anything, no matter in what weird
namespace the target is - if a task was ptrace-able for a process before
it called clone(CLONE_NEWUSER) / unshare(CLONE_NEWUSER) /
setns(, CLONE_NEWUSER), it should still be ptrace-able afterwards.
(Unless access was permitted based on the introspection rule.)

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26 20:55     ` Jann Horn
@ 2015-12-26 21:08       ` Serge E. Hallyn
  0 siblings, 0 replies; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-26 21:08 UTC (permalink / raw)
  To: Jann Horn
  Cc: Serge E. Hallyn, Roland McGrath, Oleg Nesterov, linux-kernel,
	security, Serge Hallyn, Andy Lutomirski, Eric W. Biederman

On Sat, Dec 26, 2015 at 09:55:50PM +0100, Jann Horn wrote:
> On Sat, Dec 26, 2015 at 02:23:45PM -0600, Serge E. Hallyn wrote:
> > On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote:
> > > On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> > > > With this change, the entering process can first enter the
> > > > namespace and then safely inspect the namespace's
> > > > properties, e.g. through /proc/self/{uid_map,gid_map},
> > > > assuming that the namespace owner doesn't have access to
> > > > uid 0.
> > > 
> > > Actually, I think I missed something there. Well, at least it
> > > should not directly lead to a container escape.
> > > 
> > > 
> > > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> > > >  {
> > > > +	struct user_namespace *tns = tcred->user_ns;
> > > > +	struct user_namespace *curns = current_cred()->user_ns;
> > > > +
> > > > +	/* When a root-owned process enters a user namespace created by a
> > > > +	 * malicious user, the user shouldn't be able to execute code under
> > > > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > > > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > > > +	 * verify that all the uids and gids of the target process are
> > > > +	 * mapped into the current namespace.
> > > > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > > > +	 * either.
> > > > +	 */
> > > > +	if (!kuid_has_mapping(curns, tcred->euid) ||
> > > > +			!kuid_has_mapping(curns, tcred->suid) ||
> > > > +			!kuid_has_mapping(curns, tcred->uid)  ||
> > > > +			!kgid_has_mapping(curns, tcred->egid) ||
> > > > +			!kgid_has_mapping(curns, tcred->sgid) ||
> > > > +			!kgid_has_mapping(curns, tcred->gid))
> > > > +		return false;
> > > > +
> > > >  	if (mode & PTRACE_MODE_NOAUDIT)
> > > > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > > > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> > > >  	else
> > > > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > > > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> > > >  }
> > > 
> > > If the namespace owner can run code in the init namespace, the kuids are
> > > mapped into curns but he is still capable wrt the target namespace.
> > > 
> > > I think a proper fix should first determine the highest parent of
> > > tcred->user_ns in which the caller still has privs, then do the
> > > kxid_has_mapping() checks in there.
> > 
> > Hi,
> > 
> > I don't quite follow what you are concerned about.  Based on the new
> > patch you sent, I assume it's not the case where the tcred's kuid is
> > actually mapped into the container.  So is it the case where I
> > unshare a userns which unshares a userns, then setns from the grandparent
> > into the child?  And if so, the concern is that if the setns()ing task's
> > kuid is mappable all along into the grandhild, then container root should
> > be able to ptrace it?
> 
> Consider the following scenario:
> 
> init_user_ns has a child namespace (I'll call it child_ns).
> child_ns is owned by an attacker (child_ns->owner == attacker_kuid).
> The attacking process has current_cred()->euid == attacker_kuid and lives
> in init_user_ns (which means it's capable in child_ns).

Ah, right.  Special.

Thanks.

-serge

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26  2:52   ` Jann Horn
@ 2015-12-26 21:17     ` Serge E. Hallyn
  2015-12-26 21:27       ` Jann Horn
  2015-12-26 21:51     ` Serge E. Hallyn
  1 sibling, 1 reply; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-26 21:17 UTC (permalink / raw)
  To: Jann Horn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Andy Lutomirski, Eric W. Biederman

On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
> ptrace_has_cap() checks whether the current process should be
> treated as having a certain capability for ptrace checks
> against another process. Until now, this was equivalent to
> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
> 
> However, if a root-owned process wants to enter a user
> namespace for some reason without knowing who owns it and
> therefore can't change to the namespace owner's uid and gid
> before entering, as soon as it has entered the namespace,
> the namespace owner can attach to it via ptrace and thereby
> gain access to its uid and gid.
> 
> While it is possible for the entering process to switch to
> the uid of a claimed namespace owner before entering,
> causing the attempt to enter to fail if the claimed uid is
> wrong, this doesn't solve the problem of determining an
> appropriate gid.
> 
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.
> 
> Changed in v2: The caller needs to be capable in the
> namespace into which tcred's uids/gids can be mapped.
> 
> Signed-off-by: Jann Horn <jann@thejh.net>
> ---
>  kernel/ptrace.c | 33 ++++++++++++++++++++++++++++-----
>  1 file changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index b760bae..260a08d 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -20,6 +20,7 @@
>  #include <linux/uio.h>
>  #include <linux/audit.h>
>  #include <linux/pid_namespace.h>
> +#include <linux/user_namespace.h>
>  #include <linux/syscalls.h>
>  #include <linux/uaccess.h>
>  #include <linux/regset.h>
> @@ -207,12 +208,34 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>  	return ret;
>  }
>  
> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +	struct user_namespace *tns = tcred->user_ns;
> +
> +	/* When a root-owned process enters a user namespace created by a
> +	 * malicious user, the user shouldn't be able to execute code under
> +	 * uid 0 by attaching to the root-owned process via ptrace.
> +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +	 * verify that all the uids and gids of the target process are
> +	 * mapped into a namespace below the current one in which the caller
> +	 * is capable.
> +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +	 * either.
> +	 */
> +	while (
> +	    !kuid_has_mapping(tns, tcred->euid) ||
> +	    !kuid_has_mapping(tns, tcred->suid) ||
> +	    !kuid_has_mapping(tns, tcred->uid)  ||
> +	    !kgid_has_mapping(tns, tcred->egid) ||
> +	    !kgid_has_mapping(tns, tcred->sgid) ||
> +	    !kgid_has_mapping(tns, tcred->gid)) {
> +		tns = tns->parent;

Sorry, i can't quite remember - is there a way for a task in init_user_ns to have
INVALID_UID | INVALID_GID ?  I.e. any point in breaking here if tns == &init_user_n?

> +	}
> +
>  	if (mode & PTRACE_MODE_NOAUDIT)
> -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>  	else
> -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }
>  
>  /* Returns 0 on success, -errno on denial. */
> @@ -241,7 +264,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  	    gid_eq(cred->gid, tcred->sgid) &&
>  	    gid_eq(cred->gid, tcred->gid))
>  		goto ok;
> -	if (ptrace_has_cap(tcred->user_ns, mode))
> +	if (ptrace_has_cap(tcred, mode))
>  		goto ok;
>  	rcu_read_unlock();
>  	return -EPERM;
> @@ -252,7 +275,7 @@ ok:
>  		dumpable = get_dumpable(task->mm);
>  	rcu_read_lock();
>  	if (dumpable != SUID_DUMP_USER &&
> -	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> +	    !ptrace_has_cap(__task_cred(task), mode)) {
>  		rcu_read_unlock();
>  		return -EPERM;
>  	}
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26 21:17     ` Serge E. Hallyn
@ 2015-12-26 21:27       ` Jann Horn
  2015-12-26 21:49         ` Serge E. Hallyn
  0 siblings, 1 reply; 17+ messages in thread
From: Jann Horn @ 2015-12-26 21:27 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Andy Lutomirski, Eric W. Biederman

[-- Attachment #1: Type: text/plain, Size: 5067 bytes --]

On Sat, Dec 26, 2015 at 03:17:29PM -0600, Serge E. Hallyn wrote:
> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
> > ptrace_has_cap() checks whether the current process should be
> > treated as having a certain capability for ptrace checks
> > against another process. Until now, this was equivalent to
> > has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
> > 
> > However, if a root-owned process wants to enter a user
> > namespace for some reason without knowing who owns it and
> > therefore can't change to the namespace owner's uid and gid
> > before entering, as soon as it has entered the namespace,
> > the namespace owner can attach to it via ptrace and thereby
> > gain access to its uid and gid.
> > 
> > While it is possible for the entering process to switch to
> > the uid of a claimed namespace owner before entering,
> > causing the attempt to enter to fail if the claimed uid is
> > wrong, this doesn't solve the problem of determining an
> > appropriate gid.
> > 
> > With this change, the entering process can first enter the
> > namespace and then safely inspect the namespace's
> > properties, e.g. through /proc/self/{uid_map,gid_map},
> > assuming that the namespace owner doesn't have access to
> > uid 0.
> > 
> > Changed in v2: The caller needs to be capable in the
> > namespace into which tcred's uids/gids can be mapped.
> > 
> > Signed-off-by: Jann Horn <jann@thejh.net>
> > ---
> >  kernel/ptrace.c | 33 ++++++++++++++++++++++++++++-----
> >  1 file changed, 28 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> > index b760bae..260a08d 100644
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/uio.h>
> >  #include <linux/audit.h>
> >  #include <linux/pid_namespace.h>
> > +#include <linux/user_namespace.h>
> >  #include <linux/syscalls.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/regset.h>
> > @@ -207,12 +208,34 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
> >  	return ret;
> >  }
> >  
> > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> >  {
> > +	struct user_namespace *tns = tcred->user_ns;
> > +
> > +	/* When a root-owned process enters a user namespace created by a
> > +	 * malicious user, the user shouldn't be able to execute code under
> > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > +	 * verify that all the uids and gids of the target process are
> > +	 * mapped into a namespace below the current one in which the caller
> > +	 * is capable.
> > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > +	 * either.
> > +	 */
> > +	while (
> > +	    !kuid_has_mapping(tns, tcred->euid) ||
> > +	    !kuid_has_mapping(tns, tcred->suid) ||
> > +	    !kuid_has_mapping(tns, tcred->uid)  ||
> > +	    !kgid_has_mapping(tns, tcred->egid) ||
> > +	    !kgid_has_mapping(tns, tcred->sgid) ||
> > +	    !kgid_has_mapping(tns, tcred->gid)) {
> > +		tns = tns->parent;
> 
> Sorry, i can't quite remember - is there a way for a task in init_user_ns to have
> INVALID_UID | INVALID_GID ?  I.e. any point in breaking here if tns == &init_user_n?

I assumed that there isn't because the comment above the definition of from_kuid()
says so. Checking... the syscalls for setting uid/gid seem to enforce that uid/gid
aren't -1, and setuid/setgid executables require the uid/gid to be mapped. So it
seems to be true.


> > +	}
> > +
> >  	if (mode & PTRACE_MODE_NOAUDIT)
> > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> >  	else
> > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> >  }
> >  
> >  /* Returns 0 on success, -errno on denial. */
> > @@ -241,7 +264,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> >  	    gid_eq(cred->gid, tcred->sgid) &&
> >  	    gid_eq(cred->gid, tcred->gid))
> >  		goto ok;
> > -	if (ptrace_has_cap(tcred->user_ns, mode))
> > +	if (ptrace_has_cap(tcred, mode))
> >  		goto ok;
> >  	rcu_read_unlock();
> >  	return -EPERM;
> > @@ -252,7 +275,7 @@ ok:
> >  		dumpable = get_dumpable(task->mm);
> >  	rcu_read_lock();
> >  	if (dumpable != SUID_DUMP_USER &&
> > -	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> > +	    !ptrace_has_cap(__task_cred(task), mode)) {
> >  		rcu_read_unlock();
> >  		return -EPERM;
> >  	}
> > -- 
> > 2.1.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26 21:27       ` Jann Horn
@ 2015-12-26 21:49         ` Serge E. Hallyn
  0 siblings, 0 replies; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-26 21:49 UTC (permalink / raw)
  To: Jann Horn
  Cc: Serge E. Hallyn, Roland McGrath, Oleg Nesterov, linux-kernel,
	security, Serge Hallyn, Andy Lutomirski, Eric W. Biederman

On Sat, Dec 26, 2015 at 10:27:33PM +0100, Jann Horn wrote:
> On Sat, Dec 26, 2015 at 03:17:29PM -0600, Serge E. Hallyn wrote:
> > On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
> > > ptrace_has_cap() checks whether the current process should be
> > > treated as having a certain capability for ptrace checks
> > > against another process. Until now, this was equivalent to
> > > has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
> > > 
> > > However, if a root-owned process wants to enter a user
> > > namespace for some reason without knowing who owns it and
> > > therefore can't change to the namespace owner's uid and gid
> > > before entering, as soon as it has entered the namespace,
> > > the namespace owner can attach to it via ptrace and thereby
> > > gain access to its uid and gid.
> > > 
> > > While it is possible for the entering process to switch to
> > > the uid of a claimed namespace owner before entering,
> > > causing the attempt to enter to fail if the claimed uid is
> > > wrong, this doesn't solve the problem of determining an
> > > appropriate gid.
> > > 
> > > With this change, the entering process can first enter the
> > > namespace and then safely inspect the namespace's
> > > properties, e.g. through /proc/self/{uid_map,gid_map},
> > > assuming that the namespace owner doesn't have access to
> > > uid 0.
> > > 
> > > Changed in v2: The caller needs to be capable in the
> > > namespace into which tcred's uids/gids can be mapped.
> > > 
> > > Signed-off-by: Jann Horn <jann@thejh.net>
> > > ---
> > >  kernel/ptrace.c | 33 ++++++++++++++++++++++++++++-----
> > >  1 file changed, 28 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> > > index b760bae..260a08d 100644
> > > --- a/kernel/ptrace.c
> > > +++ b/kernel/ptrace.c
> > > @@ -20,6 +20,7 @@
> > >  #include <linux/uio.h>
> > >  #include <linux/audit.h>
> > >  #include <linux/pid_namespace.h>
> > > +#include <linux/user_namespace.h>
> > >  #include <linux/syscalls.h>
> > >  #include <linux/uaccess.h>
> > >  #include <linux/regset.h>
> > > @@ -207,12 +208,34 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
> > >  	return ret;
> > >  }
> > >  
> > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> > >  {
> > > +	struct user_namespace *tns = tcred->user_ns;
> > > +
> > > +	/* When a root-owned process enters a user namespace created by a
> > > +	 * malicious user, the user shouldn't be able to execute code under
> > > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > > +	 * verify that all the uids and gids of the target process are
> > > +	 * mapped into a namespace below the current one in which the caller
> > > +	 * is capable.
> > > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > > +	 * either.
> > > +	 */
> > > +	while (
> > > +	    !kuid_has_mapping(tns, tcred->euid) ||
> > > +	    !kuid_has_mapping(tns, tcred->suid) ||
> > > +	    !kuid_has_mapping(tns, tcred->uid)  ||
> > > +	    !kgid_has_mapping(tns, tcred->egid) ||
> > > +	    !kgid_has_mapping(tns, tcred->sgid) ||
> > > +	    !kgid_has_mapping(tns, tcred->gid)) {
> > > +		tns = tns->parent;
> > 
> > Sorry, i can't quite remember - is there a way for a task in init_user_ns to have
> > INVALID_UID | INVALID_GID ?  I.e. any point in breaking here if tns == &init_user_n?
> 
> I assumed that there isn't because the comment above the definition of from_kuid()
> says so. Checking... the syscalls for setting uid/gid seem to enforce that uid/gid
> aren't -1, and setuid/setgid executables require the uid/gid to be mapped. So it
> seems to be true.

Yeah, I knew I'd read it somewhere but couldn't find the comment.  Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26  2:52   ` Jann Horn
  2015-12-26 21:17     ` Serge E. Hallyn
@ 2015-12-26 21:51     ` Serge E. Hallyn
  2015-12-27  2:03       ` Andy Lutomirski
  1 sibling, 1 reply; 17+ messages in thread
From: Serge E. Hallyn @ 2015-12-26 21:51 UTC (permalink / raw)
  To: Jann Horn
  Cc: Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Andy Lutomirski, Eric W. Biederman

On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
> ptrace_has_cap() checks whether the current process should be
> treated as having a certain capability for ptrace checks
> against another process. Until now, this was equivalent to
> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
> 
> However, if a root-owned process wants to enter a user
> namespace for some reason without knowing who owns it and
> therefore can't change to the namespace owner's uid and gid
> before entering, as soon as it has entered the namespace,
> the namespace owner can attach to it via ptrace and thereby
> gain access to its uid and gid.
> 
> While it is possible for the entering process to switch to
> the uid of a claimed namespace owner before entering,
> causing the attempt to enter to fail if the claimed uid is
> wrong, this doesn't solve the problem of determining an
> appropriate gid.
> 
> With this change, the entering process can first enter the
> namespace and then safely inspect the namespace's
> properties, e.g. through /proc/self/{uid_map,gid_map},
> assuming that the namespace owner doesn't have access to
> uid 0.
> 
> Changed in v2: The caller needs to be capable in the
> namespace into which tcred's uids/gids can be mapped.
> 
> Signed-off-by: Jann Horn <jann@thejh.net>

Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>

> ---
>  kernel/ptrace.c | 33 ++++++++++++++++++++++++++++-----
>  1 file changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index b760bae..260a08d 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -20,6 +20,7 @@
>  #include <linux/uio.h>
>  #include <linux/audit.h>
>  #include <linux/pid_namespace.h>
> +#include <linux/user_namespace.h>
>  #include <linux/syscalls.h>
>  #include <linux/uaccess.h>
>  #include <linux/regset.h>
> @@ -207,12 +208,34 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>  	return ret;
>  }
>  
> -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
>  {
> +	struct user_namespace *tns = tcred->user_ns;
> +
> +	/* When a root-owned process enters a user namespace created by a
> +	 * malicious user, the user shouldn't be able to execute code under
> +	 * uid 0 by attaching to the root-owned process via ptrace.
> +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> +	 * verify that all the uids and gids of the target process are
> +	 * mapped into a namespace below the current one in which the caller
> +	 * is capable.
> +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> +	 * either.
> +	 */
> +	while (
> +	    !kuid_has_mapping(tns, tcred->euid) ||
> +	    !kuid_has_mapping(tns, tcred->suid) ||
> +	    !kuid_has_mapping(tns, tcred->uid)  ||
> +	    !kgid_has_mapping(tns, tcred->egid) ||
> +	    !kgid_has_mapping(tns, tcred->sgid) ||
> +	    !kgid_has_mapping(tns, tcred->gid)) {
> +		tns = tns->parent;
> +	}
> +
>  	if (mode & PTRACE_MODE_NOAUDIT)
> -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
>  	else
> -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
>  }
>  
>  /* Returns 0 on success, -errno on denial. */
> @@ -241,7 +264,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>  	    gid_eq(cred->gid, tcred->sgid) &&
>  	    gid_eq(cred->gid, tcred->gid))
>  		goto ok;
> -	if (ptrace_has_cap(tcred->user_ns, mode))
> +	if (ptrace_has_cap(tcred, mode))
>  		goto ok;
>  	rcu_read_unlock();
>  	return -EPERM;
> @@ -252,7 +275,7 @@ ok:
>  		dumpable = get_dumpable(task->mm);
>  	rcu_read_lock();
>  	if (dumpable != SUID_DUMP_USER &&
> -	    !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> +	    !ptrace_has_cap(__task_cred(task), mode)) {
>  		rcu_read_unlock();
>  		return -EPERM;
>  	}
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-26 21:51     ` Serge E. Hallyn
@ 2015-12-27  2:03       ` Andy Lutomirski
  2016-01-04 15:03         ` Josh Boyer
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2015-12-27  2:03 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Jann Horn, Roland McGrath, Oleg Nesterov, linux-kernel, security,
	Serge Hallyn, Eric W. Biederman

On Sat, Dec 26, 2015 at 1:51 PM, Serge E. Hallyn
<serge.hallyn@ubuntu.com> wrote:
> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
>> ptrace_has_cap() checks whether the current process should be
>> treated as having a certain capability for ptrace checks
>> against another process. Until now, this was equivalent to
>> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>>
>> However, if a root-owned process wants to enter a user
>> namespace for some reason without knowing who owns it and
>> therefore can't change to the namespace owner's uid and gid
>> before entering, as soon as it has entered the namespace,
>> the namespace owner can attach to it via ptrace and thereby
>> gain access to its uid and gid.
>>
>> While it is possible for the entering process to switch to
>> the uid of a claimed namespace owner before entering,
>> causing the attempt to enter to fail if the claimed uid is
>> wrong, this doesn't solve the problem of determining an
>> appropriate gid.
>>
>> With this change, the entering process can first enter the
>> namespace and then safely inspect the namespace's
>> properties, e.g. through /proc/self/{uid_map,gid_map},
>> assuming that the namespace owner doesn't have access to
>> uid 0.
>>
>> Changed in v2: The caller needs to be capable in the
>> namespace into which tcred's uids/gids can be mapped.
>>
>> Signed-off-by: Jann Horn <jann@thejh.net>
>
> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>

Acked-by: Andy Lutomirski <luto@kernel.org>

Who's going to apply this?  Linus?  Eric?

--Andy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2015-12-27  2:03       ` Andy Lutomirski
@ 2016-01-04 15:03         ` Josh Boyer
  2016-01-06  1:17           ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Boyer @ 2016-01-04 15:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Serge E. Hallyn, Jann Horn, Roland McGrath, Oleg Nesterov,
	linux-kernel, security, Serge Hallyn, Eric W. Biederman,
	Linus Torvalds

On Sat, Dec 26, 2015 at 9:03 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Sat, Dec 26, 2015 at 1:51 PM, Serge E. Hallyn
> <serge.hallyn@ubuntu.com> wrote:
>> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
>>> ptrace_has_cap() checks whether the current process should be
>>> treated as having a certain capability for ptrace checks
>>> against another process. Until now, this was equivalent to
>>> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>>>
>>> However, if a root-owned process wants to enter a user
>>> namespace for some reason without knowing who owns it and
>>> therefore can't change to the namespace owner's uid and gid
>>> before entering, as soon as it has entered the namespace,
>>> the namespace owner can attach to it via ptrace and thereby
>>> gain access to its uid and gid.
>>>
>>> While it is possible for the entering process to switch to
>>> the uid of a claimed namespace owner before entering,
>>> causing the attempt to enter to fail if the claimed uid is
>>> wrong, this doesn't solve the problem of determining an
>>> appropriate gid.
>>>
>>> With this change, the entering process can first enter the
>>> namespace and then safely inspect the namespace's
>>> properties, e.g. through /proc/self/{uid_map,gid_map},
>>> assuming that the namespace owner doesn't have access to
>>> uid 0.
>>>
>>> Changed in v2: The caller needs to be capable in the
>>> namespace into which tcred's uids/gids can be mapped.
>>>
>>> Signed-off-by: Jann Horn <jann@thejh.net>
>>
>> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
>
> Acked-by: Andy Lutomirski <luto@kernel.org>
>
> Who's going to apply this?  Linus?  Eric?

An Ack from Oleg would be nice too.  I'm guessing this got lost in the
holidays but it has an assigned CVE now.  Would be good to get it in
4.4 final.

josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2016-01-04 15:03         ` Josh Boyer
@ 2016-01-06  1:17           ` Eric W. Biederman
  2016-01-06  1:36             ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2016-01-06  1:17 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Andy Lutomirski, Serge E. Hallyn, Jann Horn, Roland McGrath,
	Oleg Nesterov, linux-kernel, security, Serge Hallyn,
	Linus Torvalds

Josh Boyer <jwboyer@fedoraproject.org> writes:

> On Sat, Dec 26, 2015 at 9:03 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Sat, Dec 26, 2015 at 1:51 PM, Serge E. Hallyn
>> <serge.hallyn@ubuntu.com> wrote:
>>> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
>>>> ptrace_has_cap() checks whether the current process should be
>>>> treated as having a certain capability for ptrace checks
>>>> against another process. Until now, this was equivalent to
>>>> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>>>>
>>>> However, if a root-owned process wants to enter a user
>>>> namespace for some reason without knowing who owns it and
>>>> therefore can't change to the namespace owner's uid and gid
>>>> before entering, as soon as it has entered the namespace,
>>>> the namespace owner can attach to it via ptrace and thereby
>>>> gain access to its uid and gid.
>>>>
>>>> While it is possible for the entering process to switch to
>>>> the uid of a claimed namespace owner before entering,
>>>> causing the attempt to enter to fail if the claimed uid is
>>>> wrong, this doesn't solve the problem of determining an
>>>> appropriate gid.
>>>>
>>>> With this change, the entering process can first enter the
>>>> namespace and then safely inspect the namespace's
>>>> properties, e.g. through /proc/self/{uid_map,gid_map},
>>>> assuming that the namespace owner doesn't have access to
>>>> uid 0.
>>>>
>>>> Changed in v2: The caller needs to be capable in the
>>>> namespace into which tcred's uids/gids can be mapped.
>>>>
>>>> Signed-off-by: Jann Horn <jann@thejh.net>
>>>
>>> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
>>
>> Acked-by: Andy Lutomirski <luto@kernel.org>
>>
>> Who's going to apply this?  Linus?  Eric?
>
> An Ack from Oleg would be nice too.  I'm guessing this got lost in the
> holidays but it has an assigned CVE now.  Would be good to get it in
> 4.4 final.

If people are going to go around and refuse to understand the problem
and assign CVEs to the kernel when they can't understand what is
necessary to safely write code I am inclined to nack the entire mess.

Whatever (if anything) that is calling setns in this problematic way is
the problem today.

This thread is about a feature request to make it easier to write secure
code not about a vulnerability in user namespaces.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2016-01-06  1:17           ` Eric W. Biederman
@ 2016-01-06  1:36             ` Andy Lutomirski
  2016-01-06  2:04               ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2016-01-06  1:36 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Josh Boyer, Serge E. Hallyn, Jann Horn, Roland McGrath,
	Oleg Nesterov, linux-kernel, security, Serge Hallyn,
	Linus Torvalds

On Tue, Jan 5, 2016 at 5:17 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Josh Boyer <jwboyer@fedoraproject.org> writes:
>
>> On Sat, Dec 26, 2015 at 9:03 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Sat, Dec 26, 2015 at 1:51 PM, Serge E. Hallyn
>>> <serge.hallyn@ubuntu.com> wrote:
>>>> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
>>>>> ptrace_has_cap() checks whether the current process should be
>>>>> treated as having a certain capability for ptrace checks
>>>>> against another process. Until now, this was equivalent to
>>>>> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>>>>>
>>>>> However, if a root-owned process wants to enter a user
>>>>> namespace for some reason without knowing who owns it and
>>>>> therefore can't change to the namespace owner's uid and gid
>>>>> before entering, as soon as it has entered the namespace,
>>>>> the namespace owner can attach to it via ptrace and thereby
>>>>> gain access to its uid and gid.
>>>>>
>>>>> While it is possible for the entering process to switch to
>>>>> the uid of a claimed namespace owner before entering,
>>>>> causing the attempt to enter to fail if the claimed uid is
>>>>> wrong, this doesn't solve the problem of determining an
>>>>> appropriate gid.
>>>>>
>>>>> With this change, the entering process can first enter the
>>>>> namespace and then safely inspect the namespace's
>>>>> properties, e.g. through /proc/self/{uid_map,gid_map},
>>>>> assuming that the namespace owner doesn't have access to
>>>>> uid 0.
>>>>>
>>>>> Changed in v2: The caller needs to be capable in the
>>>>> namespace into which tcred's uids/gids can be mapped.
>>>>>
>>>>> Signed-off-by: Jann Horn <jann@thejh.net>
>>>>
>>>> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
>>>
>>> Acked-by: Andy Lutomirski <luto@kernel.org>
>>>
>>> Who's going to apply this?  Linus?  Eric?
>>
>> An Ack from Oleg would be nice too.  I'm guessing this got lost in the
>> holidays but it has an assigned CVE now.  Would be good to get it in
>> 4.4 final.
>
> If people are going to go around and refuse to understand the problem
> and assign CVEs to the kernel when they can't understand what is
> necessary to safely write code I am inclined to nack the entire mess.
>
> Whatever (if anything) that is calling setns in this problematic way is
> the problem today.
>

Even if we were to grant that the setns caller is buggy, this patch
seems like a good hardening measure.  Is there any case where ptrace
access *should* be granted but would not be granted with this patch
applied?

> This thread is about a feature request to make it easier to write secure
> code not about a vulnerability in user namespaces.

So what?

--Andy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
  2016-01-06  1:36             ` Andy Lutomirski
@ 2016-01-06  2:04               ` Eric W. Biederman
  0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2016-01-06  2:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Boyer, Serge E. Hallyn, Jann Horn, Roland McGrath,
	Oleg Nesterov, linux-kernel, security, Serge Hallyn,
	Linus Torvalds

Andy Lutomirski <luto@amacapital.net> writes:

> On Tue, Jan 5, 2016 at 5:17 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> Josh Boyer <jwboyer@fedoraproject.org> writes:
>>
>>> On Sat, Dec 26, 2015 at 9:03 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Sat, Dec 26, 2015 at 1:51 PM, Serge E. Hallyn
>>>> <serge.hallyn@ubuntu.com> wrote:
>>>>> On Sat, Dec 26, 2015 at 03:52:31AM +0100, Jann Horn wrote:
>>>>>> ptrace_has_cap() checks whether the current process should be
>>>>>> treated as having a certain capability for ptrace checks
>>>>>> against another process. Until now, this was equivalent to
>>>>>> has_ns_capability(current, target_ns, CAP_SYS_PTRACE).
>>>>>>
>>>>>> However, if a root-owned process wants to enter a user
>>>>>> namespace for some reason without knowing who owns it and
>>>>>> therefore can't change to the namespace owner's uid and gid
>>>>>> before entering, as soon as it has entered the namespace,
>>>>>> the namespace owner can attach to it via ptrace and thereby
>>>>>> gain access to its uid and gid.
>>>>>>
>>>>>> While it is possible for the entering process to switch to
>>>>>> the uid of a claimed namespace owner before entering,
>>>>>> causing the attempt to enter to fail if the claimed uid is
>>>>>> wrong, this doesn't solve the problem of determining an
>>>>>> appropriate gid.
>>>>>>
>>>>>> With this change, the entering process can first enter the
>>>>>> namespace and then safely inspect the namespace's
>>>>>> properties, e.g. through /proc/self/{uid_map,gid_map},
>>>>>> assuming that the namespace owner doesn't have access to
>>>>>> uid 0.
>>>>>>
>>>>>> Changed in v2: The caller needs to be capable in the
>>>>>> namespace into which tcred's uids/gids can be mapped.
>>>>>>
>>>>>> Signed-off-by: Jann Horn <jann@thejh.net>
>>>>>
>>>>> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
>>>>
>>>> Acked-by: Andy Lutomirski <luto@kernel.org>
>>>>
>>>> Who's going to apply this?  Linus?  Eric?
>>>
>>> An Ack from Oleg would be nice too.  I'm guessing this got lost in the
>>> holidays but it has an assigned CVE now.  Would be good to get it in
>>> 4.4 final.
>>
>> If people are going to go around and refuse to understand the problem
>> and assign CVEs to the kernel when they can't understand what is
>> necessary to safely write code I am inclined to nack the entire mess.
>>
>> Whatever (if anything) that is calling setns in this problematic way is
>> the problem today.
>>
>
> Even if we were to grant that the setns caller is buggy, this patch
> seems like a good hardening measure.  Is there any case where ptrace
> access *should* be granted but would not be granted with this patch
> applied?

Honestly I rather like the direction of this patch.

This patch seems to be upholding the old princimple that you are not
allowed to trace a process that has security relevant resources you
could not access any other way.  In this case uids.

As for the question of does this break userspace.  The only preexisting
case that this even comes close to is ptracing user mode helpers.  All
of the existing ptrace hardening in yama actually blocks this case
because the of the disjoint process tree.  So I don't think we are going
to break userspace by changing behavior here.

I can see a strong argument for wanting ptrace to work from this process
to other processes in the user namespace.  Because setns/nsenter is what
you use when you want to debug what is wrong in a container.

That said I don't think it will ever be easy, to write safe code, that
places a process into a user namespace with a hostile user namespace
root.  We can certainly help by tweaking a few things but it won't be
easy.  In writing a process that cass setns and enters a potentially
hostile user namespace you have to be aware of what resource your
process is holding onto and have to carefully drop them.

In this case I am wondering if it might be smart to also consider
setting dumpable (or some version of dumpable) when we call setns and
enter a user namespace.

>> This thread is about a feature request to make it easier to write secure
>> code not about a vulnerability in user namespaces.
>
> So what?

Viewing this as a vulnerability in user namespaces is problematic
because it encourages people to write buggy code.  Plus I get people
asking me what is going on with CVExyz.  So I am answering them there is
no kernel bug here.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-01-06  2:13 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-12 20:12 [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Jann Horn
2015-12-15  0:26 ` Andy Lutomirski
2015-12-17 18:59 ` Serge E. Hallyn
2015-12-26  1:10 ` Jann Horn
2015-12-26  2:52   ` Jann Horn
2015-12-26 21:17     ` Serge E. Hallyn
2015-12-26 21:27       ` Jann Horn
2015-12-26 21:49         ` Serge E. Hallyn
2015-12-26 21:51     ` Serge E. Hallyn
2015-12-27  2:03       ` Andy Lutomirski
2016-01-04 15:03         ` Josh Boyer
2016-01-06  1:17           ` Eric W. Biederman
2016-01-06  1:36             ` Andy Lutomirski
2016-01-06  2:04               ` Eric W. Biederman
2015-12-26 20:23   ` Serge E. Hallyn
2015-12-26 20:55     ` Jann Horn
2015-12-26 21:08       ` Serge E. Hallyn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).