linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jann Horn <jann@thejh.net>
To: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
Cc: Roland McGrath <roland@hack.frob.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-kernel@vger.kernel.org, security@kernel.org,
	Serge Hallyn <serge.hallyn@canonical.com>,
	Andy Lutomirski <luto@amacapital.net>,
	"Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
Date: Sat, 26 Dec 2015 21:55:50 +0100	[thread overview]
Message-ID: <20151226205550.GA29895@pc.thejh.net> (raw)
In-Reply-To: <20151226202345.GA19815@mail.hallyn.com>

[-- Attachment #1: Type: text/plain, Size: 4625 bytes --]

On Sat, Dec 26, 2015 at 02:23:45PM -0600, Serge E. Hallyn wrote:
> On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote:
> > On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> > > With this change, the entering process can first enter the
> > > namespace and then safely inspect the namespace's
> > > properties, e.g. through /proc/self/{uid_map,gid_map},
> > > assuming that the namespace owner doesn't have access to
> > > uid 0.
> > 
> > Actually, I think I missed something there. Well, at least it
> > should not directly lead to a container escape.
> > 
> > 
> > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> > >  {
> > > +	struct user_namespace *tns = tcred->user_ns;
> > > +	struct user_namespace *curns = current_cred()->user_ns;
> > > +
> > > +	/* When a root-owned process enters a user namespace created by a
> > > +	 * malicious user, the user shouldn't be able to execute code under
> > > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > > +	 * verify that all the uids and gids of the target process are
> > > +	 * mapped into the current namespace.
> > > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > > +	 * either.
> > > +	 */
> > > +	if (!kuid_has_mapping(curns, tcred->euid) ||
> > > +			!kuid_has_mapping(curns, tcred->suid) ||
> > > +			!kuid_has_mapping(curns, tcred->uid)  ||
> > > +			!kgid_has_mapping(curns, tcred->egid) ||
> > > +			!kgid_has_mapping(curns, tcred->sgid) ||
> > > +			!kgid_has_mapping(curns, tcred->gid))
> > > +		return false;
> > > +
> > >  	if (mode & PTRACE_MODE_NOAUDIT)
> > > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> > >  	else
> > > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> > >  }
> > 
> > If the namespace owner can run code in the init namespace, the kuids are
> > mapped into curns but he is still capable wrt the target namespace.
> > 
> > I think a proper fix should first determine the highest parent of
> > tcred->user_ns in which the caller still has privs, then do the
> > kxid_has_mapping() checks in there.
> 
> Hi,
> 
> I don't quite follow what you are concerned about.  Based on the new
> patch you sent, I assume it's not the case where the tcred's kuid is
> actually mapped into the container.  So is it the case where I
> unshare a userns which unshares a userns, then setns from the grandparent
> into the child?  And if so, the concern is that if the setns()ing task's
> kuid is mappable all along into the grandhild, then container root should
> be able to ptrace it?

Consider the following scenario:

init_user_ns has a child namespace (I'll call it child_ns).
child_ns is owned by an attacker (child_ns->owner == attacker_kuid).
The attacking process has current_cred()->euid == attacker_kuid and lives
in init_user_ns (which means it's capable in child_ns).

The victim process (with euid==0) enters the namespace, then the attacking
process tries to attach to it. ptrace_has_cap(tcred, mode) would, with my
old patch, still be true because current is capable in tcred->user_ns and
all kuids are mapped into init_user_ns.

The new patch uses the following rule for cap checks:

    The caller is capable relative to a target if a user namespace exists
    for which all of the following conditions are true:

     - The caller has the requested capability inside the namespace.
     - The target is inside the namespace (either directly or in a child).
     - The target's kuids and kgids are mapped into the namespace.

The first rule is enforced by the has_ns_capability(..., tns, ...) or
has_ns_capability_noaudit(..., tns, ...) call at the bottom.

The second rule is implicitly true because tns initially is the target's
user namespace and then moves up through ->parent.

The third rule is enforced by the while loop.


This prevents the attack I described, but e.g. still allows someone who
is capable in init_user_ns to ptrace anything, no matter in what weird
namespace the target is - if a task was ptrace-able for a process before
it called clone(CLONE_NEWUSER) / unshare(CLONE_NEWUSER) /
setns(, CLONE_NEWUSER), it should still be ptrace-able afterwards.
(Unless access was permitted based on the introspection rule.)

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2015-12-26 20:55 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-12 20:12 [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Jann Horn
2015-12-15  0:26 ` Andy Lutomirski
2015-12-17 18:59 ` Serge E. Hallyn
2015-12-26  1:10 ` Jann Horn
2015-12-26  2:52   ` Jann Horn
2015-12-26 21:17     ` Serge E. Hallyn
2015-12-26 21:27       ` Jann Horn
2015-12-26 21:49         ` Serge E. Hallyn
2015-12-26 21:51     ` Serge E. Hallyn
2015-12-27  2:03       ` Andy Lutomirski
2016-01-04 15:03         ` Josh Boyer
2016-01-06  1:17           ` Eric W. Biederman
2016-01-06  1:36             ` Andy Lutomirski
2016-01-06  2:04               ` Eric W. Biederman
2015-12-26 20:23   ` Serge E. Hallyn
2015-12-26 20:55     ` Jann Horn [this message]
2015-12-26 21:08       ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151226205550.GA29895@pc.thejh.net \
    --to=jann@thejh.net \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=oleg@redhat.com \
    --cc=roland@hack.frob.com \
    --cc=security@kernel.org \
    --cc=serge.hallyn@canonical.com \
    --cc=serge.hallyn@ubuntu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).