All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] capabilities: Ambient capabilities
@ 2015-03-12 18:08 ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-12 18:08 UTC (permalink / raw)
  Cc: Andy Lutomirski, Kees Cook, Christoph Lameter, Serge Hallyn,
	Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, linux-kernel, linux-api, akpm,
	Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn, Markku Savela,
	Jarkko Sakkinen, Michael Kerrisk

Credit where credit is due: this idea comes from Christoph Lameter
with a lot of valuable input from Serge Hallyn.  This patch is
heavily based on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.
To perform various privileged tasks, processes can wield
capabilities that they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks
for a capability, it checks pE.  The other capability masks serve to
modify what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If
a task has a capability in pP, it can add that capability to pE
and/or pI.  If a task has CAP_SETPCAP, then it can add any
capability to pI, and it can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can
also have capabilities.  A file can have no capabilty information at
all [1].  If a file has capability information, then it has a
permitted mask (fP) and an inheritable mask (fI) as well as a single
effective bit (fE) [2].  File capabilities modify the capabilities
of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified
for the file ultimately being excecuted (i.e. the binary itself if
that binary is ELF or for the interpreter if the binary is a
script.) [3] In the capability evolution rules, for each mask Z, pZ
represents the old value and pZ' represents the new value.  The
rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and
fP are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE
is set, then the "secure exec" rules are in effect: AT_SECURE gets
set, LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or
file-capped programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero,
so your capabilities have no effect whatsoever on pP'.  This means
that you can't usefully execute a helper process or a shell command
with elevated capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI
to the full set for most or all non-setuid executable files.  This
causes pP' = pI for nonroot, and inheritance works.  No one does
this because it's a PITA and it isn't even supported on most
filesystems.

If you try this, you'll discover that every nonroot program ends up
with secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask
(pA).  pA does what pI should have done.

pA obeys the invariant that no bit can ever be set in pA if it is
not set in both pP and pI.  Dropping a bit from pP or pI drops that
bit from pA.  This ensures that existing programs that try to drop
capabilities still do so, with a complication.  Because capability
inheritance is so broken, setting KEEPCAPS, using setresuid to
switch to nonroot uids, and calling execve effectively drops
capabilities.  Therefore, setresuid from root to nonroot
unconditionally clears pA.  Processes that don't like this can
re-add bits to pA afterwards.

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.
If you do so, your children get that capability in pA, pP, and pE.
For example, you can set pA = CAP_NET_BIND_SERVICE, and your
children can automatically bind low-numbered ports.  Hallelujah!

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their
namespace) and unprivileged process trees.  This is currently more
or less impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges,
dropping privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be relaxed without causing serious problems.  Specifically,
if we allowed pA to persist across executing non-pA-aware setuid
binaries and across setresuid, then, naively, the only capabilities
that could leak as a result would be the capabilities in pA, and any
attacker *already* has those capabilities.  This would make me
nervous, though -- setuid binaries that tried to privilege-separate
might fail to do so, and putting CAP_DAC_READ_SEARCH or
CAP_DAC_OVERRIDE into pA could have unexpected side effects.
(Whether these unexpected side effects would be exploitable is an
open question.)  I've therefore taken the more paranoid route.

An alternative would be to either require PR_SET_NO_NEW_PRIVS before
setting ambient capabilities.  I think that this would be annoying
and would make granting otherwise unprivileged users minor ambient
capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much
less useful than it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that
have unrecognized values for that xattr end up with has_cap ==
false.  The code that does that appears to be complicated for no
good reason.

[2] The libcap capability mask parsers and formatters are
dangerously misleading and the documentation is flat-out wrong.  fE
is *not* a mask; it's a single bit.  This has probably confused
every single person who has tried to use file capabilities.

[3] Linux very confusingly processes the script and the interpreter if
applicable, for reasons that escape me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly discarded.

Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
CC: Ted Ts'o <tytso@mit.edu>
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: akpm@linuxfoundation.org
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

Preliminary userspace code is here:

https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a

fs/proc/array.c              |  5 ++-
 include/linux/cred.h         | 15 +++++++++
 include/uapi/linux/prctl.h   |  6 ++++
 kernel/user_namespace.c      |  1 +
 security/commoncap.c         | 75 ++++++++++++++++++++++++++++++++++++++------
 security/keys/process_keys.c |  1 +
 6 files changed, 92 insertions(+), 11 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 1295a00ca316..bc15356d6551 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header,
 static inline void task_cap(struct seq_file *m, struct task_struct *p)
 {
 	const struct cred *cred;
-	kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset;
+	kernel_cap_t cap_inheritable, cap_permitted, cap_effective,
+			cap_bset, cap_ambient;
 
 	rcu_read_lock();
 	cred = __task_cred(p);
@@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 	cap_permitted	= cred->cap_permitted;
 	cap_effective	= cred->cap_effective;
 	cap_bset	= cred->cap_bset;
+	cap_ambient	= cred->cap_ambient;
 	rcu_read_unlock();
 
 	render_cap_t(m, "CapInh:\t", &cap_inheritable);
 	render_cap_t(m, "CapPrm:\t", &cap_permitted);
 	render_cap_t(m, "CapEff:\t", &cap_effective);
 	render_cap_t(m, "CapBnd:\t", &cap_bset);
+	render_cap_t(m, "CapAmb:\t", &cap_ambient);
 }
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2fb2ca2127ed..a21bcba6ef84 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -122,6 +122,7 @@ struct cred {
 	kernel_cap_t	cap_permitted;	/* caps we're permitted */
 	kernel_cap_t	cap_effective;	/* caps we can actually use */
 	kernel_cap_t	cap_bset;	/* capability bounding set */
+	kernel_cap_t	cap_ambient;	/* Ambient capability set */
 #ifdef CONFIG_KEYS
 	unsigned char	jit_keyring;	/* default keyring to attach requested
 					 * keys to */
@@ -197,6 +198,20 @@ static inline void validate_process_creds(void)
 }
 #endif
 
+static inline void cap_enforce_ambient_invariants(struct cred *cred)
+{
+	cred->cap_ambient = cap_intersect(cred->cap_ambient,
+					  cap_intersect(cred->cap_permitted,
+							cred->cap_inheritable));
+}
+
+static inline bool cap_ambient_invariant_ok(const struct cred *cred)
+{
+	return cap_issubset(cred->cap_ambient,
+			    cap_intersect(cred->cap_permitted,
+					  cred->cap_inheritable));
+}
+
 /**
  * get_new_cred - Get a reference on a new set of credentials
  * @cred: The new credentials to reference
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 31891d9535e2..65407f867e82 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -190,4 +190,10 @@ struct prctl_mm_map {
 # define PR_FP_MODE_FR		(1 << 0)	/* 64b FP registers */
 # define PR_FP_MODE_FRE		(1 << 1)	/* 32b compatibility */
 
+/* Control the ambient capability set */
+#define PR_CAP_AMBIENT		47
+# define PR_CAP_AMBIENT_GET	1
+# define PR_CAP_AMBIENT_RAISE	2
+# define PR_CAP_AMBIENT_LOWER	3
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..dab0f808235a 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
 	cred->cap_inheritable = CAP_EMPTY_SET;
 	cred->cap_permitted = CAP_FULL_SET;
 	cred->cap_effective = CAP_FULL_SET;
+	cred->cap_ambient = CAP_EMPTY_SET;
 	cred->cap_bset = CAP_FULL_SET;
 #ifdef CONFIG_KEYS
 	key_put(cred->request_key_auth);
diff --git a/security/commoncap.c b/security/commoncap.c
index f66713bd7450..b3253886ecad 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -272,6 +272,7 @@ int cap_capset(struct cred *new,
 	new->cap_effective   = *effective;
 	new->cap_inheritable = *inheritable;
 	new->cap_permitted   = *permitted;
+	cap_enforce_ambient_invariants(new);
 	return 0;
 }
 
@@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps,
 
 		/*
 		 * pP' = (X & fP) | (pI & fI)
+		 * The addition of pA' is handled later.
 		 */
 		new->cap_permitted.cap[i] =
 			(new->cap_bset.cap[i] & permitted) |
@@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
 {
 	const struct cred *old = current_cred();
 	struct cred *new = bprm->cred;
-	bool effective, has_cap = false;
+	bool effective, has_cap = false, is_setid;
 	int ret;
 	kuid_t root_uid;
 
+	BUG_ON(!cap_ambient_invariant_ok(old));
+
 	effective = false;
 	ret = get_file_caps(bprm, &effective, &has_cap);
 	if (ret < 0)
@@ -527,8 +531,9 @@ skip:
 	 *
 	 * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
 	 */
-	if ((!uid_eq(new->euid, old->uid) ||
-	     !gid_eq(new->egid, old->gid) ||
+	is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid);
+
+	if ((is_setid ||
 	     !cap_issubset(new->cap_permitted, old->cap_permitted)) &&
 	    bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
 		/* downgrade; they get no more than they had, and maybe less */
@@ -544,10 +549,23 @@ skip:
 	new->suid = new->fsuid = new->euid;
 	new->sgid = new->fsgid = new->egid;
 
+	/* File caps or setid cancel ambient. */
+	if (has_cap || is_setid)
+		cap_clear(new->cap_ambient);
+
+	/*
+	 * Now that we've computed pA', update pP' to give:
+	 *   pP' = (X & fP) | (pI & fI) | pA'
+	 */
+	new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient);
+
 	if (effective)
 		new->cap_effective = new->cap_permitted;
 	else
-		cap_clear(new->cap_effective);
+		new->cap_effective = new->cap_ambient;
+
+	BUG_ON(!cap_ambient_invariant_ok(new));
+
 	bprm->cap_effective = effective;
 
 	/*
@@ -562,7 +580,7 @@ skip:
 	 * Number 1 above might fail if you don't have a full bset, but I think
 	 * that is interesting information to audit.
 	 */
-	if (!cap_isclear(new->cap_effective)) {
+	if (!cap_issubset(new->cap_effective, new->cap_ambient)) {
 		if (!cap_issubset(CAP_FULL_SET, new->cap_effective) ||
 		    !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) ||
 		    issecure(SECURE_NOROOT)) {
@@ -573,6 +591,9 @@ skip:
 	}
 
 	new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
+
+	BUG_ON(!cap_ambient_invariant_ok(new));
+
 	return 0;
 }
 
@@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 	if (!uid_eq(cred->uid, root_uid)) {
 		if (bprm->cap_effective)
 			return 1;
-		if (!cap_isclear(cred->cap_permitted))
+		if (!cap_issubset(cred->cap_permitted, cred->cap_ambient))
 			return 1;
 	}
 
@@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old)
 	     uid_eq(old->suid, root_uid)) &&
 	    (!uid_eq(new->uid, root_uid) &&
 	     !uid_eq(new->euid, root_uid) &&
-	     !uid_eq(new->suid, root_uid)) &&
-	    !issecure(SECURE_KEEP_CAPS)) {
-		cap_clear(new->cap_permitted);
-		cap_clear(new->cap_effective);
+	     !uid_eq(new->suid, root_uid))) {
+		if (!issecure(SECURE_KEEP_CAPS)) {
+			cap_clear(new->cap_permitted);
+			cap_clear(new->cap_effective);
+		}
+
+		/*
+		 * Pre-ambient programs except setresuid to nonroot followed
+		 * by exec to drop capabilities.  We should make sure that
+		 * this remains the case.
+		 */
+		cap_clear(new->cap_ambient);
 	}
 	if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid))
 		cap_clear(new->cap_effective);
@@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
 		return commit_creds(new);
 
+	case PR_CAP_AMBIENT:
+		if (((!cap_valid(arg3)) | arg4 | arg5))
+			return -EINVAL;
+
+		if (arg2 == PR_CAP_AMBIENT_GET) {
+			return !!cap_raised(current_cred()->cap_ambient, arg3);
+		} else if (arg2 != PR_CAP_AMBIENT_RAISE &&
+			   arg2 != PR_CAP_AMBIENT_LOWER) {
+			return -EINVAL;
+		} else {
+			if (arg2 == PR_CAP_AMBIENT_RAISE &&
+			    (!cap_raised(current_cred()->cap_permitted, arg3) ||
+			     !cap_raised(current_cred()->cap_inheritable,
+					 arg3)))
+				return -EPERM;
+
+			new = prepare_creds();
+			if (!new)
+				return -ENOMEM;
+			if (arg2 == PR_CAP_AMBIENT_RAISE)
+				cap_raise(new->cap_ambient, arg3);
+			else
+				cap_lower(new->cap_ambient, arg3);
+			return commit_creds(new);
+		}
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index bd536cb221e2..43b4cddbf2b3 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork)
 	new->cap_inheritable	= old->cap_inheritable;
 	new->cap_permitted	= old->cap_permitted;
 	new->cap_effective	= old->cap_effective;
+	new->cap_ambient	= old->cap_ambient;
 	new->cap_bset		= old->cap_bset;
 
 	new->jit_keyring	= old->jit_keyring;
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [RFC] capabilities: Ambient capabilities
@ 2015-03-12 18:08 ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-12 18:08 UTC (permalink / raw)
  Cc: Andy Lutomirski, Kees Cook, Christoph Lameter, Serge Hallyn,
	Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Andrew G. Morgan,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

Credit where credit is due: this idea comes from Christoph Lameter
with a lot of valuable input from Serge Hallyn.  This patch is
heavily based on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.
To perform various privileged tasks, processes can wield
capabilities that they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks
for a capability, it checks pE.  The other capability masks serve to
modify what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If
a task has a capability in pP, it can add that capability to pE
and/or pI.  If a task has CAP_SETPCAP, then it can add any
capability to pI, and it can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can
also have capabilities.  A file can have no capabilty information at
all [1].  If a file has capability information, then it has a
permitted mask (fP) and an inheritable mask (fI) as well as a single
effective bit (fE) [2].  File capabilities modify the capabilities
of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified
for the file ultimately being excecuted (i.e. the binary itself if
that binary is ELF or for the interpreter if the binary is a
script.) [3] In the capability evolution rules, for each mask Z, pZ
represents the old value and pZ' represents the new value.  The
rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and
fP are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE
is set, then the "secure exec" rules are in effect: AT_SECURE gets
set, LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or
file-capped programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero,
so your capabilities have no effect whatsoever on pP'.  This means
that you can't usefully execute a helper process or a shell command
with elevated capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI
to the full set for most or all non-setuid executable files.  This
causes pP' = pI for nonroot, and inheritance works.  No one does
this because it's a PITA and it isn't even supported on most
filesystems.

If you try this, you'll discover that every nonroot program ends up
with secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask
(pA).  pA does what pI should have done.

pA obeys the invariant that no bit can ever be set in pA if it is
not set in both pP and pI.  Dropping a bit from pP or pI drops that
bit from pA.  This ensures that existing programs that try to drop
capabilities still do so, with a complication.  Because capability
inheritance is so broken, setting KEEPCAPS, using setresuid to
switch to nonroot uids, and calling execve effectively drops
capabilities.  Therefore, setresuid from root to nonroot
unconditionally clears pA.  Processes that don't like this can
re-add bits to pA afterwards.

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.
If you do so, your children get that capability in pA, pP, and pE.
For example, you can set pA = CAP_NET_BIND_SERVICE, and your
children can automatically bind low-numbered ports.  Hallelujah!

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their
namespace) and unprivileged process trees.  This is currently more
or less impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges,
dropping privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be relaxed without causing serious problems.  Specifically,
if we allowed pA to persist across executing non-pA-aware setuid
binaries and across setresuid, then, naively, the only capabilities
that could leak as a result would be the capabilities in pA, and any
attacker *already* has those capabilities.  This would make me
nervous, though -- setuid binaries that tried to privilege-separate
might fail to do so, and putting CAP_DAC_READ_SEARCH or
CAP_DAC_OVERRIDE into pA could have unexpected side effects.
(Whether these unexpected side effects would be exploitable is an
open question.)  I've therefore taken the more paranoid route.

An alternative would be to either require PR_SET_NO_NEW_PRIVS before
setting ambient capabilities.  I think that this would be annoying
and would make granting otherwise unprivileged users minor ambient
capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much
less useful than it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that
have unrecognized values for that xattr end up with has_cap ==
false.  The code that does that appears to be complicated for no
good reason.

[2] The libcap capability mask parsers and formatters are
dangerously misleading and the documentation is flat-out wrong.  fE
is *not* a mask; it's a single bit.  This has probably confused
every single person who has tried to use file capabilities.

[3] Linux very confusingly processes the script and the interpreter if
applicable, for reasons that escape me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly discarded.

Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Cc: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
Cc: Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>
Cc: Aaron Jones <aaronmdjones-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: Ted Ts'o <tytso-3s7WtUTddSA@public.gmane.org>
Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org
Cc: Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Mimi Zohar <zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Austin S Hemmelgarn <ahferroin7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Markku Savela <msa-kXoF896ld44xHbG02/KK1g@public.gmane.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---

Preliminary userspace code is here:

https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a

fs/proc/array.c              |  5 ++-
 include/linux/cred.h         | 15 +++++++++
 include/uapi/linux/prctl.h   |  6 ++++
 kernel/user_namespace.c      |  1 +
 security/commoncap.c         | 75 ++++++++++++++++++++++++++++++++++++++------
 security/keys/process_keys.c |  1 +
 6 files changed, 92 insertions(+), 11 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 1295a00ca316..bc15356d6551 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header,
 static inline void task_cap(struct seq_file *m, struct task_struct *p)
 {
 	const struct cred *cred;
-	kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset;
+	kernel_cap_t cap_inheritable, cap_permitted, cap_effective,
+			cap_bset, cap_ambient;
 
 	rcu_read_lock();
 	cred = __task_cred(p);
@@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 	cap_permitted	= cred->cap_permitted;
 	cap_effective	= cred->cap_effective;
 	cap_bset	= cred->cap_bset;
+	cap_ambient	= cred->cap_ambient;
 	rcu_read_unlock();
 
 	render_cap_t(m, "CapInh:\t", &cap_inheritable);
 	render_cap_t(m, "CapPrm:\t", &cap_permitted);
 	render_cap_t(m, "CapEff:\t", &cap_effective);
 	render_cap_t(m, "CapBnd:\t", &cap_bset);
+	render_cap_t(m, "CapAmb:\t", &cap_ambient);
 }
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2fb2ca2127ed..a21bcba6ef84 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -122,6 +122,7 @@ struct cred {
 	kernel_cap_t	cap_permitted;	/* caps we're permitted */
 	kernel_cap_t	cap_effective;	/* caps we can actually use */
 	kernel_cap_t	cap_bset;	/* capability bounding set */
+	kernel_cap_t	cap_ambient;	/* Ambient capability set */
 #ifdef CONFIG_KEYS
 	unsigned char	jit_keyring;	/* default keyring to attach requested
 					 * keys to */
@@ -197,6 +198,20 @@ static inline void validate_process_creds(void)
 }
 #endif
 
+static inline void cap_enforce_ambient_invariants(struct cred *cred)
+{
+	cred->cap_ambient = cap_intersect(cred->cap_ambient,
+					  cap_intersect(cred->cap_permitted,
+							cred->cap_inheritable));
+}
+
+static inline bool cap_ambient_invariant_ok(const struct cred *cred)
+{
+	return cap_issubset(cred->cap_ambient,
+			    cap_intersect(cred->cap_permitted,
+					  cred->cap_inheritable));
+}
+
 /**
  * get_new_cred - Get a reference on a new set of credentials
  * @cred: The new credentials to reference
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 31891d9535e2..65407f867e82 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -190,4 +190,10 @@ struct prctl_mm_map {
 # define PR_FP_MODE_FR		(1 << 0)	/* 64b FP registers */
 # define PR_FP_MODE_FRE		(1 << 1)	/* 32b compatibility */
 
+/* Control the ambient capability set */
+#define PR_CAP_AMBIENT		47
+# define PR_CAP_AMBIENT_GET	1
+# define PR_CAP_AMBIENT_RAISE	2
+# define PR_CAP_AMBIENT_LOWER	3
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..dab0f808235a 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
 	cred->cap_inheritable = CAP_EMPTY_SET;
 	cred->cap_permitted = CAP_FULL_SET;
 	cred->cap_effective = CAP_FULL_SET;
+	cred->cap_ambient = CAP_EMPTY_SET;
 	cred->cap_bset = CAP_FULL_SET;
 #ifdef CONFIG_KEYS
 	key_put(cred->request_key_auth);
diff --git a/security/commoncap.c b/security/commoncap.c
index f66713bd7450..b3253886ecad 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -272,6 +272,7 @@ int cap_capset(struct cred *new,
 	new->cap_effective   = *effective;
 	new->cap_inheritable = *inheritable;
 	new->cap_permitted   = *permitted;
+	cap_enforce_ambient_invariants(new);
 	return 0;
 }
 
@@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps,
 
 		/*
 		 * pP' = (X & fP) | (pI & fI)
+		 * The addition of pA' is handled later.
 		 */
 		new->cap_permitted.cap[i] =
 			(new->cap_bset.cap[i] & permitted) |
@@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
 {
 	const struct cred *old = current_cred();
 	struct cred *new = bprm->cred;
-	bool effective, has_cap = false;
+	bool effective, has_cap = false, is_setid;
 	int ret;
 	kuid_t root_uid;
 
+	BUG_ON(!cap_ambient_invariant_ok(old));
+
 	effective = false;
 	ret = get_file_caps(bprm, &effective, &has_cap);
 	if (ret < 0)
@@ -527,8 +531,9 @@ skip:
 	 *
 	 * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
 	 */
-	if ((!uid_eq(new->euid, old->uid) ||
-	     !gid_eq(new->egid, old->gid) ||
+	is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid);
+
+	if ((is_setid ||
 	     !cap_issubset(new->cap_permitted, old->cap_permitted)) &&
 	    bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
 		/* downgrade; they get no more than they had, and maybe less */
@@ -544,10 +549,23 @@ skip:
 	new->suid = new->fsuid = new->euid;
 	new->sgid = new->fsgid = new->egid;
 
+	/* File caps or setid cancel ambient. */
+	if (has_cap || is_setid)
+		cap_clear(new->cap_ambient);
+
+	/*
+	 * Now that we've computed pA', update pP' to give:
+	 *   pP' = (X & fP) | (pI & fI) | pA'
+	 */
+	new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient);
+
 	if (effective)
 		new->cap_effective = new->cap_permitted;
 	else
-		cap_clear(new->cap_effective);
+		new->cap_effective = new->cap_ambient;
+
+	BUG_ON(!cap_ambient_invariant_ok(new));
+
 	bprm->cap_effective = effective;
 
 	/*
@@ -562,7 +580,7 @@ skip:
 	 * Number 1 above might fail if you don't have a full bset, but I think
 	 * that is interesting information to audit.
 	 */
-	if (!cap_isclear(new->cap_effective)) {
+	if (!cap_issubset(new->cap_effective, new->cap_ambient)) {
 		if (!cap_issubset(CAP_FULL_SET, new->cap_effective) ||
 		    !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) ||
 		    issecure(SECURE_NOROOT)) {
@@ -573,6 +591,9 @@ skip:
 	}
 
 	new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
+
+	BUG_ON(!cap_ambient_invariant_ok(new));
+
 	return 0;
 }
 
@@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 	if (!uid_eq(cred->uid, root_uid)) {
 		if (bprm->cap_effective)
 			return 1;
-		if (!cap_isclear(cred->cap_permitted))
+		if (!cap_issubset(cred->cap_permitted, cred->cap_ambient))
 			return 1;
 	}
 
@@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old)
 	     uid_eq(old->suid, root_uid)) &&
 	    (!uid_eq(new->uid, root_uid) &&
 	     !uid_eq(new->euid, root_uid) &&
-	     !uid_eq(new->suid, root_uid)) &&
-	    !issecure(SECURE_KEEP_CAPS)) {
-		cap_clear(new->cap_permitted);
-		cap_clear(new->cap_effective);
+	     !uid_eq(new->suid, root_uid))) {
+		if (!issecure(SECURE_KEEP_CAPS)) {
+			cap_clear(new->cap_permitted);
+			cap_clear(new->cap_effective);
+		}
+
+		/*
+		 * Pre-ambient programs except setresuid to nonroot followed
+		 * by exec to drop capabilities.  We should make sure that
+		 * this remains the case.
+		 */
+		cap_clear(new->cap_ambient);
 	}
 	if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid))
 		cap_clear(new->cap_effective);
@@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
 		return commit_creds(new);
 
+	case PR_CAP_AMBIENT:
+		if (((!cap_valid(arg3)) | arg4 | arg5))
+			return -EINVAL;
+
+		if (arg2 == PR_CAP_AMBIENT_GET) {
+			return !!cap_raised(current_cred()->cap_ambient, arg3);
+		} else if (arg2 != PR_CAP_AMBIENT_RAISE &&
+			   arg2 != PR_CAP_AMBIENT_LOWER) {
+			return -EINVAL;
+		} else {
+			if (arg2 == PR_CAP_AMBIENT_RAISE &&
+			    (!cap_raised(current_cred()->cap_permitted, arg3) ||
+			     !cap_raised(current_cred()->cap_inheritable,
+					 arg3)))
+				return -EPERM;
+
+			new = prepare_creds();
+			if (!new)
+				return -ENOMEM;
+			if (arg2 == PR_CAP_AMBIENT_RAISE)
+				cap_raise(new->cap_ambient, arg3);
+			else
+				cap_lower(new->cap_ambient, arg3);
+			return commit_creds(new);
+		}
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index bd536cb221e2..43b4cddbf2b3 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork)
 	new->cap_inheritable	= old->cap_inheritable;
 	new->cap_permitted	= old->cap_permitted;
 	new->cap_effective	= old->cap_effective;
+	new->cap_ambient	= old->cap_ambient;
 	new->cap_bset		= old->cap_bset;
 
 	new->jit_keyring	= old->jit_keyring;
-- 
2.3.0

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-12 18:08 ` Andy Lutomirski
  (?)
@ 2015-03-12 21:49 ` Kees Cook
  2015-03-12 22:10   ` Andrew G. Morgan
  -1 siblings, 1 reply; 55+ messages in thread
From: Kees Cook @ 2015-03-12 21:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Christoph Lameter, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, akpm, Andrew G. Morgan,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Thu, Mar 12, 2015 at 11:08 AM, Andy Lutomirski <luto@kernel.org> wrote:
> Credit where credit is due: this idea comes from Christoph Lameter
> with a lot of valuable input from Serge Hallyn.  This patch is
> heavily based on Christoph's patch.
>
> ===== The status quo =====
>
> On Linux, there are a number of capabilities defined by the kernel.
> To perform various privileged tasks, processes can wield
> capabilities that they hold.
>
> Each task has four capability masks: effective (pE), permitted (pP),
> inheritable (pI), and a bounding set (X).  When the kernel checks
> for a capability, it checks pE.  The other capability masks serve to
> modify what capabilities can be in pE.
>
> Any task can remove capabilities from pE, pP, or pI at any time.  If
> a task has a capability in pP, it can add that capability to pE
> and/or pI.  If a task has CAP_SETPCAP, then it can add any
> capability to pI, and it can remove capabilities from X.
>
> Tasks are not the only things that can have capabilities; files can
> also have capabilities.  A file can have no capabilty information at
> all [1].  If a file has capability information, then it has a
> permitted mask (fP) and an inheritable mask (fI) as well as a single
> effective bit (fE) [2].  File capabilities modify the capabilities
> of tasks that execve(2) them.
>
> A task that successfully calls execve has its capabilities modified
> for the file ultimately being excecuted (i.e. the binary itself if
> that binary is ELF or for the interpreter if the binary is a
> script.) [3] In the capability evolution rules, for each mask Z, pZ
> represents the old value and pZ' represents the new value.  The
> rules are:
>
>   pP' = (X & fP) | (pI & fI)
>   pI' = pI
>   pE' = (fE ? pP' : 0)
>   X is unchanged
>
> For setuid binaries, fP, fI, and fE are modified by a moderately
> complicated set of rules that emulate POSIX behavior.  Similarly, if
> euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
> (primary, fP and fI usually end up being the full set).  For nonroot
> users executing binaries with neither setuid nor file caps, fI and
> fP are empty and fE is false.
>
> As an extra complication, if you execute a process as nonroot and fE
> is set, then the "secure exec" rules are in effect: AT_SECURE gets
> set, LD_PRELOAD doesn't work, etc.
>
> This is rather messy.  We've learned that making any changes is
> dangerous, though: if a new kernel version allows an unprivileged
> program to change its security state in a way that persists cross
> execution of a setuid program or a program with file caps, this
> persistent state is surprisingly likely to allow setuid or
> file-capped programs to be exploited for privilege escalation.
>
> ===== The problem =====
>
> Capability inheritance is basically useless.
>
> If you aren't root and you execute an ordinary binary, fI is zero,
> so your capabilities have no effect whatsoever on pP'.  This means
> that you can't usefully execute a helper process or a shell command
> with elevated capabilities if you aren't root.
>
> On current kernels, you can sort of work around this by setting fI
> to the full set for most or all non-setuid executable files.  This
> causes pP' = pI for nonroot, and inheritance works.  No one does
> this because it's a PITA and it isn't even supported on most
> filesystems.
>
> If you try this, you'll discover that every nonroot program ends up
> with secure exec rules, breaking many things.
>
> This is a problem that has bitten many people who have tried to use
> capabilities for anything useful.
>
> ===== The proposed change =====
>
> This patch adds a fifth capability mask called the ambient mask
> (pA).  pA does what pI should have done.
>
> pA obeys the invariant that no bit can ever be set in pA if it is
> not set in both pP and pI.  Dropping a bit from pP or pI drops that
> bit from pA.  This ensures that existing programs that try to drop
> capabilities still do so, with a complication.  Because capability
> inheritance is so broken, setting KEEPCAPS, using setresuid to
> switch to nonroot uids, and calling execve effectively drops
> capabilities.  Therefore, setresuid from root to nonroot
> unconditionally clears pA.  Processes that don't like this can
> re-add bits to pA afterwards.
>
> The capability evolution rules are changed:
>
>   pA' = (file caps or setuid or setgid ? 0 : pA)
>   pP' = (X & fP) | (pI & fI) | pA'
>   pI' = pI
>   pE' = (fE ? pP' : pA')
>   X is unchanged
>
> If you are nonroot but you have a capability, you can add it to pA.
> If you do so, your children get that capability in pA, pP, and pE.
> For example, you can set pA = CAP_NET_BIND_SERVICE, and your
> children can automatically bind low-numbered ports.  Hallelujah!
>
> Unprivileged users can create user namespaces, map themselves to a
> nonzero uid, and create both privileged (relative to their
> namespace) and unprivileged process trees.  This is currently more
> or less impossible.  Hallelujah!
>
> You cannot use pA to try to subvert a setuid, setgid, or file-capped
> program: if you execute any such program, pA gets cleared and the
> resulting evolution rules are unchanged by this patch.
>
> Users with nonzero pA are unlikely to unintentionally leak that
> capability.  If they run programs that try to drop privileges,
> dropping privileges will still work.
>
> It's worth noting that the degree of paranoia in this patch could
> possibly be relaxed without causing serious problems.  Specifically,
> if we allowed pA to persist across executing non-pA-aware setuid
> binaries and across setresuid, then, naively, the only capabilities
> that could leak as a result would be the capabilities in pA, and any
> attacker *already* has those capabilities.  This would make me
> nervous, though -- setuid binaries that tried to privilege-separate
> might fail to do so, and putting CAP_DAC_READ_SEARCH or
> CAP_DAC_OVERRIDE into pA could have unexpected side effects.
> (Whether these unexpected side effects would be exploitable is an
> open question.)  I've therefore taken the more paranoid route.
>
> An alternative would be to either require PR_SET_NO_NEW_PRIVS before
> setting ambient capabilities.  I think that this would be annoying
> and would make granting otherwise unprivileged users minor ambient
> capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much
> less useful than it is with this patch.
>
> ===== Footnotes =====
>
> [1] Files that are missing the "security.capability" xattr or that
> have unrecognized values for that xattr end up with has_cap ==
> false.  The code that does that appears to be complicated for no
> good reason.
>
> [2] The libcap capability mask parsers and formatters are
> dangerously misleading and the documentation is flat-out wrong.  fE
> is *not* a mask; it's a single bit.  This has probably confused
> every single person who has tried to use file capabilities.
>
> [3] Linux very confusingly processes the script and the interpreter if
> applicable, for reasons that escape me.  The results from thinking
> about a script's file capabilities and/or setuid bits are mostly discarded.
>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Serge Hallyn <serge.hallyn@canonical.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Aaron Jones <aaronmdjones@gmail.com>
> CC: Ted Ts'o <tytso@mit.edu>
> Cc: linux-security-module@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-api@vger.kernel.org
> Cc: akpm@linuxfoundation.org
> Cc: Andrew G. Morgan <morgan@kernel.org>
> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
> Cc: Markku Savela <msa@moth.iki.fi>
> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>

This would be quite welcome for things we're doing in Chrome OS.
Presently, we're able to use fscaps to keep non-root caps across exec
and haven't encountered issues with AT_SECURE (yet), but using pA
would be much nicer and exactly matches how we want to use it: a
launcher is creating a tree of processes that are non-root but need
some capabilities. Right now the tree is very small and we're able to
sprinkle our fscaps lightly. :) This would be better.

-Kees

> ---
>
> Preliminary userspace code is here:
>
> https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a
>
> fs/proc/array.c              |  5 ++-
>  include/linux/cred.h         | 15 +++++++++
>  include/uapi/linux/prctl.h   |  6 ++++
>  kernel/user_namespace.c      |  1 +
>  security/commoncap.c         | 75 ++++++++++++++++++++++++++++++++++++++------
>  security/keys/process_keys.c |  1 +
>  6 files changed, 92 insertions(+), 11 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 1295a00ca316..bc15356d6551 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header,
>  static inline void task_cap(struct seq_file *m, struct task_struct *p)
>  {
>         const struct cred *cred;
> -       kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset;
> +       kernel_cap_t cap_inheritable, cap_permitted, cap_effective,
> +                       cap_bset, cap_ambient;
>
>         rcu_read_lock();
>         cred = __task_cred(p);
> @@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
>         cap_permitted   = cred->cap_permitted;
>         cap_effective   = cred->cap_effective;
>         cap_bset        = cred->cap_bset;
> +       cap_ambient     = cred->cap_ambient;
>         rcu_read_unlock();
>
>         render_cap_t(m, "CapInh:\t", &cap_inheritable);
>         render_cap_t(m, "CapPrm:\t", &cap_permitted);
>         render_cap_t(m, "CapEff:\t", &cap_effective);
>         render_cap_t(m, "CapBnd:\t", &cap_bset);
> +       render_cap_t(m, "CapAmb:\t", &cap_ambient);
>  }
>
>  static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
> diff --git a/include/linux/cred.h b/include/linux/cred.h
> index 2fb2ca2127ed..a21bcba6ef84 100644
> --- a/include/linux/cred.h
> +++ b/include/linux/cred.h
> @@ -122,6 +122,7 @@ struct cred {
>         kernel_cap_t    cap_permitted;  /* caps we're permitted */
>         kernel_cap_t    cap_effective;  /* caps we can actually use */
>         kernel_cap_t    cap_bset;       /* capability bounding set */
> +       kernel_cap_t    cap_ambient;    /* Ambient capability set */
>  #ifdef CONFIG_KEYS
>         unsigned char   jit_keyring;    /* default keyring to attach requested
>                                          * keys to */
> @@ -197,6 +198,20 @@ static inline void validate_process_creds(void)
>  }
>  #endif
>
> +static inline void cap_enforce_ambient_invariants(struct cred *cred)
> +{
> +       cred->cap_ambient = cap_intersect(cred->cap_ambient,
> +                                         cap_intersect(cred->cap_permitted,
> +                                                       cred->cap_inheritable));
> +}
> +
> +static inline bool cap_ambient_invariant_ok(const struct cred *cred)
> +{
> +       return cap_issubset(cred->cap_ambient,
> +                           cap_intersect(cred->cap_permitted,
> +                                         cred->cap_inheritable));
> +}
> +
>  /**
>   * get_new_cred - Get a reference on a new set of credentials
>   * @cred: The new credentials to reference
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 31891d9535e2..65407f867e82 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -190,4 +190,10 @@ struct prctl_mm_map {
>  # define PR_FP_MODE_FR         (1 << 0)        /* 64b FP registers */
>  # define PR_FP_MODE_FRE                (1 << 1)        /* 32b compatibility */
>
> +/* Control the ambient capability set */
> +#define PR_CAP_AMBIENT         47
> +# define PR_CAP_AMBIENT_GET    1
> +# define PR_CAP_AMBIENT_RAISE  2
> +# define PR_CAP_AMBIENT_LOWER  3
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 4109f8320684..dab0f808235a 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
>         cred->cap_inheritable = CAP_EMPTY_SET;
>         cred->cap_permitted = CAP_FULL_SET;
>         cred->cap_effective = CAP_FULL_SET;
> +       cred->cap_ambient = CAP_EMPTY_SET;
>         cred->cap_bset = CAP_FULL_SET;
>  #ifdef CONFIG_KEYS
>         key_put(cred->request_key_auth);
> diff --git a/security/commoncap.c b/security/commoncap.c
> index f66713bd7450..b3253886ecad 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -272,6 +272,7 @@ int cap_capset(struct cred *new,
>         new->cap_effective   = *effective;
>         new->cap_inheritable = *inheritable;
>         new->cap_permitted   = *permitted;
> +       cap_enforce_ambient_invariants(new);
>         return 0;
>  }
>
> @@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps,
>
>                 /*
>                  * pP' = (X & fP) | (pI & fI)
> +                * The addition of pA' is handled later.
>                  */
>                 new->cap_permitted.cap[i] =
>                         (new->cap_bset.cap[i] & permitted) |
> @@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
>  {
>         const struct cred *old = current_cred();
>         struct cred *new = bprm->cred;
> -       bool effective, has_cap = false;
> +       bool effective, has_cap = false, is_setid;
>         int ret;
>         kuid_t root_uid;
>
> +       BUG_ON(!cap_ambient_invariant_ok(old));
> +
>         effective = false;
>         ret = get_file_caps(bprm, &effective, &has_cap);
>         if (ret < 0)
> @@ -527,8 +531,9 @@ skip:
>          *
>          * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
>          */
> -       if ((!uid_eq(new->euid, old->uid) ||
> -            !gid_eq(new->egid, old->gid) ||
> +       is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid);
> +
> +       if ((is_setid ||
>              !cap_issubset(new->cap_permitted, old->cap_permitted)) &&
>             bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
>                 /* downgrade; they get no more than they had, and maybe less */
> @@ -544,10 +549,23 @@ skip:
>         new->suid = new->fsuid = new->euid;
>         new->sgid = new->fsgid = new->egid;
>
> +       /* File caps or setid cancel ambient. */
> +       if (has_cap || is_setid)
> +               cap_clear(new->cap_ambient);
> +
> +       /*
> +        * Now that we've computed pA', update pP' to give:
> +        *   pP' = (X & fP) | (pI & fI) | pA'
> +        */
> +       new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient);
> +
>         if (effective)
>                 new->cap_effective = new->cap_permitted;
>         else
> -               cap_clear(new->cap_effective);
> +               new->cap_effective = new->cap_ambient;
> +
> +       BUG_ON(!cap_ambient_invariant_ok(new));
> +
>         bprm->cap_effective = effective;
>
>         /*
> @@ -562,7 +580,7 @@ skip:
>          * Number 1 above might fail if you don't have a full bset, but I think
>          * that is interesting information to audit.
>          */
> -       if (!cap_isclear(new->cap_effective)) {
> +       if (!cap_issubset(new->cap_effective, new->cap_ambient)) {
>                 if (!cap_issubset(CAP_FULL_SET, new->cap_effective) ||
>                     !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) ||
>                     issecure(SECURE_NOROOT)) {
> @@ -573,6 +591,9 @@ skip:
>         }
>
>         new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
> +
> +       BUG_ON(!cap_ambient_invariant_ok(new));
> +
>         return 0;
>  }
>
> @@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
>         if (!uid_eq(cred->uid, root_uid)) {
>                 if (bprm->cap_effective)
>                         return 1;
> -               if (!cap_isclear(cred->cap_permitted))
> +               if (!cap_issubset(cred->cap_permitted, cred->cap_ambient))
>                         return 1;
>         }
>
> @@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old)
>              uid_eq(old->suid, root_uid)) &&
>             (!uid_eq(new->uid, root_uid) &&
>              !uid_eq(new->euid, root_uid) &&
> -            !uid_eq(new->suid, root_uid)) &&
> -           !issecure(SECURE_KEEP_CAPS)) {
> -               cap_clear(new->cap_permitted);
> -               cap_clear(new->cap_effective);
> +            !uid_eq(new->suid, root_uid))) {
> +               if (!issecure(SECURE_KEEP_CAPS)) {
> +                       cap_clear(new->cap_permitted);
> +                       cap_clear(new->cap_effective);
> +               }
> +
> +               /*
> +                * Pre-ambient programs except setresuid to nonroot followed
> +                * by exec to drop capabilities.  We should make sure that
> +                * this remains the case.
> +                */
> +               cap_clear(new->cap_ambient);
>         }
>         if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid))
>                 cap_clear(new->cap_effective);
> @@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>                         new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
>                 return commit_creds(new);
>
> +       case PR_CAP_AMBIENT:
> +               if (((!cap_valid(arg3)) | arg4 | arg5))
> +                       return -EINVAL;
> +
> +               if (arg2 == PR_CAP_AMBIENT_GET) {
> +                       return !!cap_raised(current_cred()->cap_ambient, arg3);
> +               } else if (arg2 != PR_CAP_AMBIENT_RAISE &&
> +                          arg2 != PR_CAP_AMBIENT_LOWER) {
> +                       return -EINVAL;
> +               } else {
> +                       if (arg2 == PR_CAP_AMBIENT_RAISE &&
> +                           (!cap_raised(current_cred()->cap_permitted, arg3) ||
> +                            !cap_raised(current_cred()->cap_inheritable,
> +                                        arg3)))
> +                               return -EPERM;
> +
> +                       new = prepare_creds();
> +                       if (!new)
> +                               return -ENOMEM;
> +                       if (arg2 == PR_CAP_AMBIENT_RAISE)
> +                               cap_raise(new->cap_ambient, arg3);
> +                       else
> +                               cap_lower(new->cap_ambient, arg3);
> +                       return commit_creds(new);
> +               }
> +
>         default:
>                 /* No functionality available - continue with default */
>                 return -ENOSYS;
> diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
> index bd536cb221e2..43b4cddbf2b3 100644
> --- a/security/keys/process_keys.c
> +++ b/security/keys/process_keys.c
> @@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork)
>         new->cap_inheritable    = old->cap_inheritable;
>         new->cap_permitted      = old->cap_permitted;
>         new->cap_effective      = old->cap_effective;
> +       new->cap_ambient        = old->cap_ambient;
>         new->cap_bset           = old->cap_bset;
>
>         new->jit_keyring        = old->jit_keyring;
> --
> 2.3.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-12 21:49 ` Kees Cook
@ 2015-03-12 22:10   ` Andrew G. Morgan
  2015-03-12 22:27       ` Andrew Lutomirski
  0 siblings, 1 reply; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-12 22:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andy Lutomirski, Christoph Lameter, Serge Hallyn,
	Andy Lutomirski, Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

I'm unclear why you refer to the inheritable set in this test:

+               } else {
+                       if (arg2 == PR_CAP_AMBIENT_RAISE &&
+                           (!cap_raised(current_cred()->cap_permitted, arg3) ||
+                            !cap_raised(current_cred()->cap_inheritable,
+                                        arg3)))
+                               return -EPERM;

I'm also unclear how you can turn off this new 'feature' for a process
tree? As it is, the code creates an exploit path for a capable (pP !=
0) program with an exploitable flaw to create a privilege escalation
for an arbitrary child program. While I understand that everyone
'knows what they are doing' in implementing this change, I'm convinced
that folk that are up to no good also do... Why not provide a lockable
secure bit to selectively disable this support?

Nacked-by: Andrew G. Morgan <morgan@kernel.org>

Cheers

Andrew


On Thu, Mar 12, 2015 at 2:49 PM, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Mar 12, 2015 at 11:08 AM, Andy Lutomirski <luto@kernel.org> wrote:
>> Credit where credit is due: this idea comes from Christoph Lameter
>> with a lot of valuable input from Serge Hallyn.  This patch is
>> heavily based on Christoph's patch.
>>
>> ===== The status quo =====
>>
>> On Linux, there are a number of capabilities defined by the kernel.
>> To perform various privileged tasks, processes can wield
>> capabilities that they hold.
>>
>> Each task has four capability masks: effective (pE), permitted (pP),
>> inheritable (pI), and a bounding set (X).  When the kernel checks
>> for a capability, it checks pE.  The other capability masks serve to
>> modify what capabilities can be in pE.
>>
>> Any task can remove capabilities from pE, pP, or pI at any time.  If
>> a task has a capability in pP, it can add that capability to pE
>> and/or pI.  If a task has CAP_SETPCAP, then it can add any
>> capability to pI, and it can remove capabilities from X.
>>
>> Tasks are not the only things that can have capabilities; files can
>> also have capabilities.  A file can have no capabilty information at
>> all [1].  If a file has capability information, then it has a
>> permitted mask (fP) and an inheritable mask (fI) as well as a single
>> effective bit (fE) [2].  File capabilities modify the capabilities
>> of tasks that execve(2) them.
>>
>> A task that successfully calls execve has its capabilities modified
>> for the file ultimately being excecuted (i.e. the binary itself if
>> that binary is ELF or for the interpreter if the binary is a
>> script.) [3] In the capability evolution rules, for each mask Z, pZ
>> represents the old value and pZ' represents the new value.  The
>> rules are:
>>
>>   pP' = (X & fP) | (pI & fI)
>>   pI' = pI
>>   pE' = (fE ? pP' : 0)
>>   X is unchanged
>>
>> For setuid binaries, fP, fI, and fE are modified by a moderately
>> complicated set of rules that emulate POSIX behavior.  Similarly, if
>> euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
>> (primary, fP and fI usually end up being the full set).  For nonroot
>> users executing binaries with neither setuid nor file caps, fI and
>> fP are empty and fE is false.
>>
>> As an extra complication, if you execute a process as nonroot and fE
>> is set, then the "secure exec" rules are in effect: AT_SECURE gets
>> set, LD_PRELOAD doesn't work, etc.
>>
>> This is rather messy.  We've learned that making any changes is
>> dangerous, though: if a new kernel version allows an unprivileged
>> program to change its security state in a way that persists cross
>> execution of a setuid program or a program with file caps, this
>> persistent state is surprisingly likely to allow setuid or
>> file-capped programs to be exploited for privilege escalation.
>>
>> ===== The problem =====
>>
>> Capability inheritance is basically useless.
>>
>> If you aren't root and you execute an ordinary binary, fI is zero,
>> so your capabilities have no effect whatsoever on pP'.  This means
>> that you can't usefully execute a helper process or a shell command
>> with elevated capabilities if you aren't root.
>>
>> On current kernels, you can sort of work around this by setting fI
>> to the full set for most or all non-setuid executable files.  This
>> causes pP' = pI for nonroot, and inheritance works.  No one does
>> this because it's a PITA and it isn't even supported on most
>> filesystems.
>>
>> If you try this, you'll discover that every nonroot program ends up
>> with secure exec rules, breaking many things.
>>
>> This is a problem that has bitten many people who have tried to use
>> capabilities for anything useful.
>>
>> ===== The proposed change =====
>>
>> This patch adds a fifth capability mask called the ambient mask
>> (pA).  pA does what pI should have done.
>>
>> pA obeys the invariant that no bit can ever be set in pA if it is
>> not set in both pP and pI.  Dropping a bit from pP or pI drops that
>> bit from pA.  This ensures that existing programs that try to drop
>> capabilities still do so, with a complication.  Because capability
>> inheritance is so broken, setting KEEPCAPS, using setresuid to
>> switch to nonroot uids, and calling execve effectively drops
>> capabilities.  Therefore, setresuid from root to nonroot
>> unconditionally clears pA.  Processes that don't like this can
>> re-add bits to pA afterwards.
>>
>> The capability evolution rules are changed:
>>
>>   pA' = (file caps or setuid or setgid ? 0 : pA)
>>   pP' = (X & fP) | (pI & fI) | pA'
>>   pI' = pI
>>   pE' = (fE ? pP' : pA')
>>   X is unchanged
>>
>> If you are nonroot but you have a capability, you can add it to pA.
>> If you do so, your children get that capability in pA, pP, and pE.
>> For example, you can set pA = CAP_NET_BIND_SERVICE, and your
>> children can automatically bind low-numbered ports.  Hallelujah!
>>
>> Unprivileged users can create user namespaces, map themselves to a
>> nonzero uid, and create both privileged (relative to their
>> namespace) and unprivileged process trees.  This is currently more
>> or less impossible.  Hallelujah!
>>
>> You cannot use pA to try to subvert a setuid, setgid, or file-capped
>> program: if you execute any such program, pA gets cleared and the
>> resulting evolution rules are unchanged by this patch.
>>
>> Users with nonzero pA are unlikely to unintentionally leak that
>> capability.  If they run programs that try to drop privileges,
>> dropping privileges will still work.
>>
>> It's worth noting that the degree of paranoia in this patch could
>> possibly be relaxed without causing serious problems.  Specifically,
>> if we allowed pA to persist across executing non-pA-aware setuid
>> binaries and across setresuid, then, naively, the only capabilities
>> that could leak as a result would be the capabilities in pA, and any
>> attacker *already* has those capabilities.  This would make me
>> nervous, though -- setuid binaries that tried to privilege-separate
>> might fail to do so, and putting CAP_DAC_READ_SEARCH or
>> CAP_DAC_OVERRIDE into pA could have unexpected side effects.
>> (Whether these unexpected side effects would be exploitable is an
>> open question.)  I've therefore taken the more paranoid route.
>>
>> An alternative would be to either require PR_SET_NO_NEW_PRIVS before
>> setting ambient capabilities.  I think that this would be annoying
>> and would make granting otherwise unprivileged users minor ambient
>> capabilities (CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much
>> less useful than it is with this patch.
>>
>> ===== Footnotes =====
>>
>> [1] Files that are missing the "security.capability" xattr or that
>> have unrecognized values for that xattr end up with has_cap ==
>> false.  The code that does that appears to be complicated for no
>> good reason.
>>
>> [2] The libcap capability mask parsers and formatters are
>> dangerously misleading and the documentation is flat-out wrong.  fE
>> is *not* a mask; it's a single bit.  This has probably confused
>> every single person who has tried to use file capabilities.
>>
>> [3] Linux very confusingly processes the script and the interpreter if
>> applicable, for reasons that escape me.  The results from thinking
>> about a script's file capabilities and/or setuid bits are mostly discarded.
>>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Jonathan Corbet <corbet@lwn.net>
>> Cc: Aaron Jones <aaronmdjones@gmail.com>
>> CC: Ted Ts'o <tytso@mit.edu>
>> Cc: linux-security-module@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: linux-api@vger.kernel.org
>> Cc: akpm@linuxfoundation.org
>> Cc: Andrew G. Morgan <morgan@kernel.org>
>> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
>> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
>> Cc: Markku Savela <msa@moth.iki.fi>
>> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
>> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>
> This would be quite welcome for things we're doing in Chrome OS.
> Presently, we're able to use fscaps to keep non-root caps across exec
> and haven't encountered issues with AT_SECURE (yet), but using pA
> would be much nicer and exactly matches how we want to use it: a
> launcher is creating a tree of processes that are non-root but need
> some capabilities. Right now the tree is very small and we're able to
> sprinkle our fscaps lightly. :) This would be better.
>
> -Kees
>
>> ---
>>
>> Preliminary userspace code is here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=860c73ac1acaaae976bdd3bb83b89b0180f0702a
>>
>> fs/proc/array.c              |  5 ++-
>>  include/linux/cred.h         | 15 +++++++++
>>  include/uapi/linux/prctl.h   |  6 ++++
>>  kernel/user_namespace.c      |  1 +
>>  security/commoncap.c         | 75 ++++++++++++++++++++++++++++++++++++++------
>>  security/keys/process_keys.c |  1 +
>>  6 files changed, 92 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/proc/array.c b/fs/proc/array.c
>> index 1295a00ca316..bc15356d6551 100644
>> --- a/fs/proc/array.c
>> +++ b/fs/proc/array.c
>> @@ -282,7 +282,8 @@ static void render_cap_t(struct seq_file *m, const char *header,
>>  static inline void task_cap(struct seq_file *m, struct task_struct *p)
>>  {
>>         const struct cred *cred;
>> -       kernel_cap_t cap_inheritable, cap_permitted, cap_effective, cap_bset;
>> +       kernel_cap_t cap_inheritable, cap_permitted, cap_effective,
>> +                       cap_bset, cap_ambient;
>>
>>         rcu_read_lock();
>>         cred = __task_cred(p);
>> @@ -290,12 +291,14 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
>>         cap_permitted   = cred->cap_permitted;
>>         cap_effective   = cred->cap_effective;
>>         cap_bset        = cred->cap_bset;
>> +       cap_ambient     = cred->cap_ambient;
>>         rcu_read_unlock();
>>
>>         render_cap_t(m, "CapInh:\t", &cap_inheritable);
>>         render_cap_t(m, "CapPrm:\t", &cap_permitted);
>>         render_cap_t(m, "CapEff:\t", &cap_effective);
>>         render_cap_t(m, "CapBnd:\t", &cap_bset);
>> +       render_cap_t(m, "CapAmb:\t", &cap_ambient);
>>  }
>>
>>  static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
>> diff --git a/include/linux/cred.h b/include/linux/cred.h
>> index 2fb2ca2127ed..a21bcba6ef84 100644
>> --- a/include/linux/cred.h
>> +++ b/include/linux/cred.h
>> @@ -122,6 +122,7 @@ struct cred {
>>         kernel_cap_t    cap_permitted;  /* caps we're permitted */
>>         kernel_cap_t    cap_effective;  /* caps we can actually use */
>>         kernel_cap_t    cap_bset;       /* capability bounding set */
>> +       kernel_cap_t    cap_ambient;    /* Ambient capability set */
>>  #ifdef CONFIG_KEYS
>>         unsigned char   jit_keyring;    /* default keyring to attach requested
>>                                          * keys to */
>> @@ -197,6 +198,20 @@ static inline void validate_process_creds(void)
>>  }
>>  #endif
>>
>> +static inline void cap_enforce_ambient_invariants(struct cred *cred)
>> +{
>> +       cred->cap_ambient = cap_intersect(cred->cap_ambient,
>> +                                         cap_intersect(cred->cap_permitted,
>> +                                                       cred->cap_inheritable));
>> +}
>> +
>> +static inline bool cap_ambient_invariant_ok(const struct cred *cred)
>> +{
>> +       return cap_issubset(cred->cap_ambient,
>> +                           cap_intersect(cred->cap_permitted,
>> +                                         cred->cap_inheritable));
>> +}
>> +
>>  /**
>>   * get_new_cred - Get a reference on a new set of credentials
>>   * @cred: The new credentials to reference
>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
>> index 31891d9535e2..65407f867e82 100644
>> --- a/include/uapi/linux/prctl.h
>> +++ b/include/uapi/linux/prctl.h
>> @@ -190,4 +190,10 @@ struct prctl_mm_map {
>>  # define PR_FP_MODE_FR         (1 << 0)        /* 64b FP registers */
>>  # define PR_FP_MODE_FRE                (1 << 1)        /* 32b compatibility */
>>
>> +/* Control the ambient capability set */
>> +#define PR_CAP_AMBIENT         47
>> +# define PR_CAP_AMBIENT_GET    1
>> +# define PR_CAP_AMBIENT_RAISE  2
>> +# define PR_CAP_AMBIENT_LOWER  3
>> +
>>  #endif /* _LINUX_PRCTL_H */
>> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
>> index 4109f8320684..dab0f808235a 100644
>> --- a/kernel/user_namespace.c
>> +++ b/kernel/user_namespace.c
>> @@ -39,6 +39,7 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
>>         cred->cap_inheritable = CAP_EMPTY_SET;
>>         cred->cap_permitted = CAP_FULL_SET;
>>         cred->cap_effective = CAP_FULL_SET;
>> +       cred->cap_ambient = CAP_EMPTY_SET;
>>         cred->cap_bset = CAP_FULL_SET;
>>  #ifdef CONFIG_KEYS
>>         key_put(cred->request_key_auth);
>> diff --git a/security/commoncap.c b/security/commoncap.c
>> index f66713bd7450..b3253886ecad 100644
>> --- a/security/commoncap.c
>> +++ b/security/commoncap.c
>> @@ -272,6 +272,7 @@ int cap_capset(struct cred *new,
>>         new->cap_effective   = *effective;
>>         new->cap_inheritable = *inheritable;
>>         new->cap_permitted   = *permitted;
>> +       cap_enforce_ambient_invariants(new);
>>         return 0;
>>  }
>>
>> @@ -352,6 +353,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps,
>>
>>                 /*
>>                  * pP' = (X & fP) | (pI & fI)
>> +                * The addition of pA' is handled later.
>>                  */
>>                 new->cap_permitted.cap[i] =
>>                         (new->cap_bset.cap[i] & permitted) |
>> @@ -479,10 +481,12 @@ int cap_bprm_set_creds(struct linux_binprm *bprm)
>>  {
>>         const struct cred *old = current_cred();
>>         struct cred *new = bprm->cred;
>> -       bool effective, has_cap = false;
>> +       bool effective, has_cap = false, is_setid;
>>         int ret;
>>         kuid_t root_uid;
>>
>> +       BUG_ON(!cap_ambient_invariant_ok(old));
>> +
>>         effective = false;
>>         ret = get_file_caps(bprm, &effective, &has_cap);
>>         if (ret < 0)
>> @@ -527,8 +531,9 @@ skip:
>>          *
>>          * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
>>          */
>> -       if ((!uid_eq(new->euid, old->uid) ||
>> -            !gid_eq(new->egid, old->gid) ||
>> +       is_setid = !uid_eq(new->euid, old->uid) || !gid_eq(new->egid, old->gid);
>> +
>> +       if ((is_setid ||
>>              !cap_issubset(new->cap_permitted, old->cap_permitted)) &&
>>             bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
>>                 /* downgrade; they get no more than they had, and maybe less */
>> @@ -544,10 +549,23 @@ skip:
>>         new->suid = new->fsuid = new->euid;
>>         new->sgid = new->fsgid = new->egid;
>>
>> +       /* File caps or setid cancel ambient. */
>> +       if (has_cap || is_setid)
>> +               cap_clear(new->cap_ambient);
>> +
>> +       /*
>> +        * Now that we've computed pA', update pP' to give:
>> +        *   pP' = (X & fP) | (pI & fI) | pA'
>> +        */
>> +       new->cap_permitted = cap_combine(new->cap_permitted, new->cap_ambient);
>> +
>>         if (effective)
>>                 new->cap_effective = new->cap_permitted;
>>         else
>> -               cap_clear(new->cap_effective);
>> +               new->cap_effective = new->cap_ambient;
>> +
>> +       BUG_ON(!cap_ambient_invariant_ok(new));
>> +
>>         bprm->cap_effective = effective;
>>
>>         /*
>> @@ -562,7 +580,7 @@ skip:
>>          * Number 1 above might fail if you don't have a full bset, but I think
>>          * that is interesting information to audit.
>>          */
>> -       if (!cap_isclear(new->cap_effective)) {
>> +       if (!cap_issubset(new->cap_effective, new->cap_ambient)) {
>>                 if (!cap_issubset(CAP_FULL_SET, new->cap_effective) ||
>>                     !uid_eq(new->euid, root_uid) || !uid_eq(new->uid, root_uid) ||
>>                     issecure(SECURE_NOROOT)) {
>> @@ -573,6 +591,9 @@ skip:
>>         }
>>
>>         new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
>> +
>> +       BUG_ON(!cap_ambient_invariant_ok(new));
>> +
>>         return 0;
>>  }
>>
>> @@ -594,7 +615,7 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
>>         if (!uid_eq(cred->uid, root_uid)) {
>>                 if (bprm->cap_effective)
>>                         return 1;
>> -               if (!cap_isclear(cred->cap_permitted))
>> +               if (!cap_issubset(cred->cap_permitted, cred->cap_ambient))
>>                         return 1;
>>         }
>>
>> @@ -696,10 +717,18 @@ static inline void cap_emulate_setxuid(struct cred *new, const struct cred *old)
>>              uid_eq(old->suid, root_uid)) &&
>>             (!uid_eq(new->uid, root_uid) &&
>>              !uid_eq(new->euid, root_uid) &&
>> -            !uid_eq(new->suid, root_uid)) &&
>> -           !issecure(SECURE_KEEP_CAPS)) {
>> -               cap_clear(new->cap_permitted);
>> -               cap_clear(new->cap_effective);
>> +            !uid_eq(new->suid, root_uid))) {
>> +               if (!issecure(SECURE_KEEP_CAPS)) {
>> +                       cap_clear(new->cap_permitted);
>> +                       cap_clear(new->cap_effective);
>> +               }
>> +
>> +               /*
>> +                * Pre-ambient programs except setresuid to nonroot followed
>> +                * by exec to drop capabilities.  We should make sure that
>> +                * this remains the case.
>> +                */
>> +               cap_clear(new->cap_ambient);
>>         }
>>         if (uid_eq(old->euid, root_uid) && !uid_eq(new->euid, root_uid))
>>                 cap_clear(new->cap_effective);
>> @@ -929,6 +958,32 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>                         new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS);
>>                 return commit_creds(new);
>>
>> +       case PR_CAP_AMBIENT:
>> +               if (((!cap_valid(arg3)) | arg4 | arg5))
>> +                       return -EINVAL;
>> +
>> +               if (arg2 == PR_CAP_AMBIENT_GET) {
>> +                       return !!cap_raised(current_cred()->cap_ambient, arg3);
>> +               } else if (arg2 != PR_CAP_AMBIENT_RAISE &&
>> +                          arg2 != PR_CAP_AMBIENT_LOWER) {
>> +                       return -EINVAL;
>> +               } else {
>> +                       if (arg2 == PR_CAP_AMBIENT_RAISE &&
>> +                           (!cap_raised(current_cred()->cap_permitted, arg3) ||
>> +                            !cap_raised(current_cred()->cap_inheritable,
>> +                                        arg3)))
>> +                               return -EPERM;
>> +
>> +                       new = prepare_creds();
>> +                       if (!new)
>> +                               return -ENOMEM;
>> +                       if (arg2 == PR_CAP_AMBIENT_RAISE)
>> +                               cap_raise(new->cap_ambient, arg3);
>> +                       else
>> +                               cap_lower(new->cap_ambient, arg3);
>> +                       return commit_creds(new);
>> +               }
>> +
>>         default:
>>                 /* No functionality available - continue with default */
>>                 return -ENOSYS;
>> diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
>> index bd536cb221e2..43b4cddbf2b3 100644
>> --- a/security/keys/process_keys.c
>> +++ b/security/keys/process_keys.c
>> @@ -848,6 +848,7 @@ void key_change_session_keyring(struct callback_head *twork)
>>         new->cap_inheritable    = old->cap_inheritable;
>>         new->cap_permitted      = old->cap_permitted;
>>         new->cap_effective      = old->cap_effective;
>> +       new->cap_ambient        = old->cap_ambient;
>>         new->cap_bset           = old->cap_bset;
>>
>>         new->jit_keyring        = old->jit_keyring;
>> --
>> 2.3.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>
>
>
> --
> Kees Cook
> Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-12 22:27       ` Andrew Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew Lutomirski @ 2015-03-12 22:27 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Thu, Mar 12, 2015 at 3:10 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
> I'm unclear why you refer to the inheritable set in this test:
>
> +               } else {
> +                       if (arg2 == PR_CAP_AMBIENT_RAISE &&
> +                           (!cap_raised(current_cred()->cap_permitted, arg3) ||
> +                            !cap_raised(current_cred()->cap_inheritable,
> +                                        arg3)))
> +                               return -EPERM;

It's to preserve the invariant that pA is always a subset of pI.

>
> I'm also unclear how you can turn off this new 'feature' for a process
> tree? As it is, the code creates an exploit path for a capable (pP !=
> 0) program with an exploitable flaw to create a privilege escalation
> for an arbitrary child program.

Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
injects shellcode that does system("/bin/bash") into some pP != 0
program, they don't actually elevate their privileges.  On the other
hand, by the time an attacker injected shellcode for:

prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
system("/bin/bash");

into a target, they can already do whatever they want.

> While I understand that everyone
> 'knows what they are doing' in implementing this change, I'm convinced
> that folk that are up to no good also do... Why not provide a lockable
> secure bit to selectively disable this support?

Show me a legitimate use case and I'll gladly implement a secure bit.
In the mean time, I don't even believe that there's a legitimate use
for any of the other secure bits (except keepcaps, and I don't know
why that's a securebit in the first place).

In the mean time, see CVE-2014-3215 for an example of why securebits
are probably more trouble than they're worth.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-12 22:27       ` Andrew Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew Lutomirski @ 2015-03-12 22:27 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Thu, Mar 12, 2015 at 3:10 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> I'm unclear why you refer to the inheritable set in this test:
>
> +               } else {
> +                       if (arg2 == PR_CAP_AMBIENT_RAISE &&
> +                           (!cap_raised(current_cred()->cap_permitted, arg3) ||
> +                            !cap_raised(current_cred()->cap_inheritable,
> +                                        arg3)))
> +                               return -EPERM;

It's to preserve the invariant that pA is always a subset of pI.

>
> I'm also unclear how you can turn off this new 'feature' for a process
> tree? As it is, the code creates an exploit path for a capable (pP !=
> 0) program with an exploitable flaw to create a privilege escalation
> for an arbitrary child program.

Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
injects shellcode that does system("/bin/bash") into some pP != 0
program, they don't actually elevate their privileges.  On the other
hand, by the time an attacker injected shellcode for:

prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
system("/bin/bash");

into a target, they can already do whatever they want.

> While I understand that everyone
> 'knows what they are doing' in implementing this change, I'm convinced
> that folk that are up to no good also do... Why not provide a lockable
> secure bit to selectively disable this support?

Show me a legitimate use case and I'll gladly implement a secure bit.
In the mean time, I don't even believe that there's a legitimate use
for any of the other secure bits (except keepcaps, and I don't know
why that's a securebit in the first place).

In the mean time, see CVE-2014-3215 for an example of why securebits
are probably more trouble than they're worth.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 13:24         ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-13 13:24 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

> It's to preserve the invariant that pA is always a subset of pI.

But since a user can always raise a bit in pI if it is present in pP,
what does this invariant add to your model other than inconvenience?

>> I'm also unclear how you can turn off this new 'feature' for a process
>> tree? As it is, the code creates an exploit path for a capable (pP !=
>> 0) program with an exploitable flaw to create a privilege escalation
>> for an arbitrary child program.
>
> Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
> injects shellcode that does system("/bin/bash") into some pP != 0
> program, they don't actually elevate their privileges.  On the other
> hand, by the time an attacker injected shellcode for:
>
> prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
> system("/bin/bash");

Let's call the above two lines [a] and [b]. With this patch, you are
encouraging folk to write programs that contain a line like [a]
already. So, yes, I am saying that you are creating an exploitable
path in these programs that says if someone can inject
system("/bin/bash") into these programs they can get a new (because of
this patch) privilege escalation.

In the prevailing model, this kind of privilege escalation (resulting
from naive inheritance) is designed out. I recognize that you want to
add it back in, but I am concerned that you are not allowing for the
possibility that some folk might still want to still be able to
prevent it.

> into a target, they can already do whatever they want.
>
>> While I understand that everyone
>> 'knows what they are doing' in implementing this change, I'm convinced
>> that folk that are up to no good also do... Why not provide a lockable
>> secure bit to selectively disable this support?
>
> Show me a legitimate use case and I'll gladly implement a secure bit.

Thanks. I was kind of hoping that you would add a lockable secure bit
that defaults this support to off, but can be used to turn it on with
or without a lock. That way, you can avoid disturbing the legacy
defaults (no surprises).

> In the mean time, I don't even believe that there's a legitimate use
> for any of the other secure bits (except keepcaps, and I don't know
> why that's a securebit in the first place).

Those bits currently make it possible to run a subsystem with no
[set]uid-0 support in its process tree.

> In the mean time, see CVE-2014-3215 for an example of why securebits
> are probably more trouble than they're worth.

I think it is safe to say that naive privilege inheritance has a fair
track record of being exploited orders of magnitude more frequently
than this. After all, these are the reasons LD_PRELOAD and shell
script setuid bits are suppressed.

Cheers

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 13:24         ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-13 13:24 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Kees Cook, Christoph Lameter, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

> It's to preserve the invariant that pA is always a subset of pI.

But since a user can always raise a bit in pI if it is present in pP,
what does this invariant add to your model other than inconvenience?

>> I'm also unclear how you can turn off this new 'feature' for a process
>> tree? As it is, the code creates an exploit path for a capable (pP !=
>> 0) program with an exploitable flaw to create a privilege escalation
>> for an arbitrary child program.
>
> Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
> injects shellcode that does system("/bin/bash") into some pP != 0
> program, they don't actually elevate their privileges.  On the other
> hand, by the time an attacker injected shellcode for:
>
> prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
> system("/bin/bash");

Let's call the above two lines [a] and [b]. With this patch, you are
encouraging folk to write programs that contain a line like [a]
already. So, yes, I am saying that you are creating an exploitable
path in these programs that says if someone can inject
system("/bin/bash") into these programs they can get a new (because of
this patch) privilege escalation.

In the prevailing model, this kind of privilege escalation (resulting
from naive inheritance) is designed out. I recognize that you want to
add it back in, but I am concerned that you are not allowing for the
possibility that some folk might still want to still be able to
prevent it.

> into a target, they can already do whatever they want.
>
>> While I understand that everyone
>> 'knows what they are doing' in implementing this change, I'm convinced
>> that folk that are up to no good also do... Why not provide a lockable
>> secure bit to selectively disable this support?
>
> Show me a legitimate use case and I'll gladly implement a secure bit.

Thanks. I was kind of hoping that you would add a lockable secure bit
that defaults this support to off, but can be used to turn it on with
or without a lock. That way, you can avoid disturbing the legacy
defaults (no surprises).

> In the mean time, I don't even believe that there's a legitimate use
> for any of the other secure bits (except keepcaps, and I don't know
> why that's a securebit in the first place).

Those bits currently make it possible to run a subsystem with no
[set]uid-0 support in its process tree.

> In the mean time, see CVE-2014-3215 for an example of why securebits
> are probably more trouble than they're worth.

I think it is safe to say that naive privilege inheritance has a fair
track record of being exploited orders of magnitude more frequently
than this. After all, these are the reasons LD_PRELOAD and shell
script setuid bits are suppressed.

Cheers

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-13 13:24         ` Andrew G. Morgan
  (?)
@ 2015-03-13 16:06         ` Christoph Lameter
  2015-03-13 17:58             ` Andy Lutomirski
  -1 siblings, 1 reply; 55+ messages in thread
From: Christoph Lameter @ 2015-03-13 16:06 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Andrew Lutomirski, Kees Cook, Serge Hallyn, Andy Lutomirski,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Fri, 13 Mar 2015, Andrew G. Morgan wrote:

> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
> > system("/bin/bash");
>
> Let's call the above two lines [a] and [b]. With this patch, you are
> encouraging folk to write programs that contain a line like [a]
> already. So, yes, I am saying that you are creating an exploitable
> path in these programs that says if someone can inject
> system("/bin/bash") into these programs they can get a new (because of
> this patch) privilege escalation.

Well this is what one naively expects capabilities to give you. An ability
to avoid full superuser binaries by segmenting off capabilities. Often
there is really no other choice. If you do not provide this mode then
the system will be even less secure since people run stuff as root.

This looks to many like the design of capabilties is inherent flawed since
it does not give you what you need. You experience a go around that leads
nowhere and just wastes your time.

> In the prevailing model, this kind of privilege escalation (resulting
> from naive inheritance) is designed out. I recognize that you want to
> add it back in, but I am concerned that you are not allowing for the
> possibility that some folk might still want to still be able to
> prevent it.

The functionalty here depends on CAP_SETPCAP. That was intended as some
point to be off by default? You can have distros/kernels with that being
off.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-13 13:24         ` Andrew G. Morgan
  (?)
  (?)
@ 2015-03-13 17:57         ` Andy Lutomirski
  2015-03-13 18:52             ` Kees Cook
  2015-03-14 21:09             ` Andrew G. Morgan
  -1 siblings, 2 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-13 17:57 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>
> > It's to preserve the invariant that pA is always a subset of pI.
>
> But since a user can always raise a bit in pI if it is present in pP,
> what does this invariant add to your model other than inconvenience?

The useful part is that dropping a bit from pI also drops it from pA.
To keep the model consistent, I also require that you add the bit to
pI before adding it to pA.

>
> >> I'm also unclear how you can turn off this new 'feature' for a process
> >> tree? As it is, the code creates an exploit path for a capable (pP !=
> >> 0) program with an exploitable flaw to create a privilege escalation
> >> for an arbitrary child program.
> >
> > Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
> > injects shellcode that does system("/bin/bash") into some pP != 0
> > program, they don't actually elevate their privileges.  On the other
> > hand, by the time an attacker injected shellcode for:
> >
> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
> > system("/bin/bash");
>
> Let's call the above two lines [a] and [b]. With this patch, you are
> encouraging folk to write programs that contain a line like [a]
> already. So, yes, I am saying that you are creating an exploitable
> path in these programs that says if someone can inject
> system("/bin/bash") into these programs they can get a new (because of
> this patch) privilege escalation.
>
> In the prevailing model, this kind of privilege escalation (resulting
> from naive inheritance) is designed out. I recognize that you want to
> add it back in, but I am concerned that you are not allowing for the
> possibility that some folk might still want to still be able to
> prevent it.

If you have a program that deliberately uses PR_CAP_AMBIENT, then
setting such a securebit will break the program, so it still doesn't
buy you anything.

>
> > into a target, they can already do whatever they want.
> >
> >> While I understand that everyone
> >> 'knows what they are doing' in implementing this change, I'm convinced
> >> that folk that are up to no good also do... Why not provide a lockable
> >> secure bit to selectively disable this support?
> >
> > Show me a legitimate use case and I'll gladly implement a secure bit.
>
> Thanks. I was kind of hoping that you would add a lockable secure bit
> that defaults this support to off, but can be used to turn it on with
> or without a lock. That way, you can avoid disturbing the legacy
> defaults (no surprises).

I think this thing needs to default on to be useful.

>
> > In the mean time, I don't even believe that there's a legitimate use
> > for any of the other secure bits (except keepcaps, and I don't know
> > why that's a securebit in the first place).
>
> Those bits currently make it possible to run a subsystem with no
> [set]uid-0 support in its process tree.

Not usefully, because even with all the securebits set to their
non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.

>
> > In the mean time, see CVE-2014-3215 for an example of why securebits
> > are probably more trouble than they're worth.
>
> I think it is safe to say that naive privilege inheritance has a fair
> track record of being exploited orders of magnitude more frequently
> than this. After all, these are the reasons LD_PRELOAD and shell
> script setuid bits are suppressed.

I don't know what you mean here by naive privilege inheritance.  The
examples you're taking about aren't inheritance at all; they're
exploring privilege *grants* during execve.  My patch deliberately
leaves grants like that alone.

--Andy

>
> Cheers
>
> Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 17:58             ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-13 17:58 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Fri, Mar 13, 2015 at 9:06 AM, Christoph Lameter <cl@linux.com> wrote:
> On Fri, 13 Mar 2015, Andrew G. Morgan wrote:
>
>> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
>> > system("/bin/bash");
>>
>> Let's call the above two lines [a] and [b]. With this patch, you are
>> encouraging folk to write programs that contain a line like [a]
>> already. So, yes, I am saying that you are creating an exploitable
>> path in these programs that says if someone can inject
>> system("/bin/bash") into these programs they can get a new (because of
>> this patch) privilege escalation.
>
> Well this is what one naively expects capabilities to give you. An ability
> to avoid full superuser binaries by segmenting off capabilities. Often
> there is really no other choice. If you do not provide this mode then
> the system will be even less secure since people run stuff as root.
>
> This looks to many like the design of capabilties is inherent flawed since
> it does not give you what you need. You experience a go around that leads
> nowhere and just wastes your time.
>
>> In the prevailing model, this kind of privilege escalation (resulting
>> from naive inheritance) is designed out. I recognize that you want to
>> add it back in, but I am concerned that you are not allowing for the
>> possibility that some folk might still want to still be able to
>> prevent it.
>
> The functionalty here depends on CAP_SETPCAP. That was intended as some
> point to be off by default? You can have distros/kernels with that being
> off.

Not in my version.  I don't want to further encourage people to hand
out CAP_SETPCAP.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 17:58             ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-13 17:58 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Fri, Mar 13, 2015 at 9:06 AM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 13 Mar 2015, Andrew G. Morgan wrote:
>
>> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
>> > system("/bin/bash");
>>
>> Let's call the above two lines [a] and [b]. With this patch, you are
>> encouraging folk to write programs that contain a line like [a]
>> already. So, yes, I am saying that you are creating an exploitable
>> path in these programs that says if someone can inject
>> system("/bin/bash") into these programs they can get a new (because of
>> this patch) privilege escalation.
>
> Well this is what one naively expects capabilities to give you. An ability
> to avoid full superuser binaries by segmenting off capabilities. Often
> there is really no other choice. If you do not provide this mode then
> the system will be even less secure since people run stuff as root.
>
> This looks to many like the design of capabilties is inherent flawed since
> it does not give you what you need. You experience a go around that leads
> nowhere and just wastes your time.
>
>> In the prevailing model, this kind of privilege escalation (resulting
>> from naive inheritance) is designed out. I recognize that you want to
>> add it back in, but I am concerned that you are not allowing for the
>> possibility that some folk might still want to still be able to
>> prevent it.
>
> The functionalty here depends on CAP_SETPCAP. That was intended as some
> point to be off by default? You can have distros/kernels with that being
> off.

Not in my version.  I don't want to further encourage people to hand
out CAP_SETPCAP.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 18:52               ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Fri, 13 Mar 2015, Andy Lutomirski wrote:

> > The functionalty here depends on CAP_SETPCAP. That was intended as some
> > point to be off by default? You can have distros/kernels with that being
> > off.
>
> Not in my version.  I don't want to further encourage people to hand
> out CAP_SETPCAP.

Owww... But if you leave it in then maybe Andrew will be satisfied with
this approach?


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 18:52               ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Andrew Lutomirski, Kees Cook, Serge Hallyn,
	Jonathan Corbet, Aaron Jones, Ted Ts'o,
	linux-security-module, LKML, Linux API, Andrew Morton,
	Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
	Michael Kerrisk

On Fri, 13 Mar 2015, Andy Lutomirski wrote:

> > The functionalty here depends on CAP_SETPCAP. That was intended as some
> > point to be off by default? You can have distros/kernels with that being
> > off.
>
> Not in my version.  I don't want to further encourage people to hand
> out CAP_SETPCAP.

Owww... But if you leave it in then maybe Andrew will be satisfied with
this approach?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 18:52             ` Kees Cook
  0 siblings, 0 replies; 55+ messages in thread
From: Kees Cook @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Christoph Lameter, LKML, Serge Hallyn,
	Markku Savela, Jonathan Corbet

On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>> I think it is safe to say that naive privilege inheritance has a fair
>> track record of being exploited orders of magnitude more frequently
>> than this. After all, these are the reasons LD_PRELOAD and shell
>> script setuid bits are suppressed.
>
> I don't know what you mean here by naive privilege inheritance.  The
> examples you're taking about aren't inheritance at all; they're
> exploring privilege *grants* during execve.  My patch deliberately
> leaves grants like that alone.

Just to clarify (at least for myself), I think the issue here is the
perception of privileges leaking into newly execed processes that may
have flaws that allow arbitrary execution control.

Case A: daemon is running (with whatever set of privileges), has a
flaw and an attacker now has those same privileges.

Case B: running as root, runs some child (which stays root) with a
flaw and an attacker now has root privileges.

Case C: running with pE=CAP_NET_ADMIN, runs some child (which loses
the cap) with a flaw and an attacker now has only user privileges.

Case D: running with pE=pI=CAP_NET_ADMIN, runs some child that has
fI=CAP_NET_ADMIN, but other children don't and lose the cap as in case
C. The privileged attack surface is reduced to only the child with fI
set.

Case E: running with CAP_NET_ADMIN in pA, runs some child (which gets
pE=pA=CAP_NET_ADMIN), and has an attack surface larger (all children)
than Case D, but with a scope smaller (only CAP_NET_ADMIN) than Case
B.

We don't need to talk about Case A, since we're not crossing an exec
boundary and the attacker already has execution control. We all lose.

Case B is the poster-child for "don't run daemons as root", since
privileges aren't dropped, and we're basically back to Case A again.

Case C is the classic "just give the daemon what caps it needs" case,
and if the daemon itself isn't flawed (Case A), then the privileges
don't leak out to any children (usually helpers of some kind) since
the cap doesn't cross the exec boundary. This lack of leaking is
really frustrating for daemons that need to give caps to helpers (e.g.
containers).

Case D is the standard solution to the frustrations in Case C, but
requires filesystem capabilities (and/or SELinux) to pass
capabilities, and comes with various limitations as described in
Andy's first email.

Case E would be possible with Andy's patch. It lacks the limitations
and frustrations of C and D, but opens us up to a limited version of
the risks in Case B. So, I don't think it's as "bad" as Case B since a
process must opt into it (by setting pA), and it passes only the
limited set of capabilities.

I think this boils down to accepting the elevated risk while
recognizing that it may be less than Case B but greater than Case C.

All this said, almost half of the capabilities, if passed to flawed
children with attacker controlled execution, can be elevated to full
root privileges pretty easily[1], so I think any documentation around
this feature should include some pretty dire warnings about using
this.

-Kees

[1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 18:52             ` Kees Cook
  0 siblings, 0 replies; 55+ messages in thread
From: Kees Cook @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Christoph Lameter, LKML, Serge Hallyn,
	Markku Savela, Jonathan Corbet

On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>> I think it is safe to say that naive privilege inheritance has a fair
>> track record of being exploited orders of magnitude more frequently
>> than this. After all, these are the reasons LD_PRELOAD and shell
>> script setuid bits are suppressed.
>
> I don't know what you mean here by naive privilege inheritance.  The
> examples you're taking about aren't inheritance at all; they're
> exploring privilege *grants* during execve.  My patch deliberately
> leaves grants like that alone.

Just to clarify (at least for myself), I think the issue here is the
perception of privileges leaking into newly execed processes that may
have flaws that allow arbitrary execution control.

Case A: daemon is running (with whatever set of privileges), has a
flaw and an attacker now has those same privileges.

Case B: running as root, runs some child (which stays root) with a
flaw and an attacker now has root privileges.

Case C: running with pE=CAP_NET_ADMIN, runs some child (which loses
the cap) with a flaw and an attacker now has only user privileges.

Case D: running with pE=pI=CAP_NET_ADMIN, runs some child that has
fI=CAP_NET_ADMIN, but other children don't and lose the cap as in case
C. The privileged attack surface is reduced to only the child with fI
set.

Case E: running with CAP_NET_ADMIN in pA, runs some child (which gets
pE=pA=CAP_NET_ADMIN), and has an attack surface larger (all children)
than Case D, but with a scope smaller (only CAP_NET_ADMIN) than Case
B.

We don't need to talk about Case A, since we're not crossing an exec
boundary and the attacker already has execution control. We all lose.

Case B is the poster-child for "don't run daemons as root", since
privileges aren't dropped, and we're basically back to Case A again.

Case C is the classic "just give the daemon what caps it needs" case,
and if the daemon itself isn't flawed (Case A), then the privileges
don't leak out to any children (usually helpers of some kind) since
the cap doesn't cross the exec boundary. This lack of leaking is
really frustrating for daemons that need to give caps to helpers (e.g.
containers).

Case D is the standard solution to the frustrations in Case C, but
requires filesystem capabilities (and/or SELinux) to pass
capabilities, and comes with various limitations as described in
Andy's first email.

Case E would be possible with Andy's patch. It lacks the limitations
and frustrations of C and D, but opens us up to a limited version of
the risks in Case B. So, I don't think it's as "bad" as Case B since a
process must opt into it (by setting pA), and it passes only the
limited set of capabilities.

I think this boils down to accepting the elevated risk while
recognizing that it may be less than Case B but greater than Case C.

All this said, almost half of the capabilities, if passed to flawed
children with attacker controlled execution, can be elevated to full
root privileges pretty easily[1], so I think any documentation around
this feature should include some pretty dire warnings about using
this.

-Kees

[1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-13 18:52             ` Kees Cook
  (?)
@ 2015-03-13 19:03             ` Andy Lutomirski
  2015-03-13 19:54                 ` Kees Cook
  -1 siblings, 1 reply; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-13 19:03 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Christoph Lameter, LKML, Serge Hallyn,
	Markku Savela, Jonathan Corbet

On Fri, Mar 13, 2015 at 11:52 AM, Kees Cook <keescook@chromium.org> wrote:
>
> All this said, almost half of the capabilities, if passed to flawed
> children with attacker controlled execution, can be elevated to full
> root privileges pretty easily[1], so I think any documentation around
> this feature should include some pretty dire warnings about using
> this.

That's a good point.  I'll make sure to document that.

It's worth noting that, for many applications, that list is
overstated.  For example, many of the suggested privilege escalations
don't work if you're in a sufficiently restrictive mount namespace.

For my own use, I plan on adding only CAP_NET_BIND_SERVICE and
CAP_NET_RAW to pA, and I'll be layering seccomp on top to the extent
possible.

--Andy

>
> -Kees
>
> [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522
>
> --
> Kees Cook
> Chrome OS Security



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 19:54                 ` Kees Cook
  0 siblings, 0 replies; 55+ messages in thread
From: Kees Cook @ 2015-03-13 19:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Christoph Lameter, LKML, Serge Hallyn,
	Markku Savela, Jonathan Corbet

On Fri, Mar 13, 2015 at 12:03 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, Mar 13, 2015 at 11:52 AM, Kees Cook <keescook@chromium.org> wrote:
>>
>> All this said, almost half of the capabilities, if passed to flawed
>> children with attacker controlled execution, can be elevated to full
>> root privileges pretty easily[1], so I think any documentation around
>> this feature should include some pretty dire warnings about using
>> this.
>
> That's a good point.  I'll make sure to document that.
>
> It's worth noting that, for many applications, that list is
> overstated.  For example, many of the suggested privilege escalations
> don't work if you're in a sufficiently restrictive mount namespace.
>
> For my own use, I plan on adding only CAP_NET_BIND_SERVICE and
> CAP_NET_RAW to pA, and I'll be layering seccomp on top to the extent
> possible.

Right, keeping software authors aware of the fact that their efforts
for attack surface reducing may need additional confinement beyond
just the capability reduction.

-Kees

>
> --Andy
>
>>
>> -Kees
>>
>> [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522
>>
>> --
>> Kees Cook
>> Chrome OS Security
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-13 19:54                 ` Kees Cook
  0 siblings, 0 replies; 55+ messages in thread
From: Kees Cook @ 2015-03-13 19:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew G. Morgan, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Christoph Lameter, LKML, Serge Hallyn,
	Markku Savela, Jonathan Corbet

On Fri, Mar 13, 2015 at 12:03 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Fri, Mar 13, 2015 at 11:52 AM, Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>>
>> All this said, almost half of the capabilities, if passed to flawed
>> children with attacker controlled execution, can be elevated to full
>> root privileges pretty easily[1], so I think any documentation around
>> this feature should include some pretty dire warnings about using
>> this.
>
> That's a good point.  I'll make sure to document that.
>
> It's worth noting that, for many applications, that list is
> overstated.  For example, many of the suggested privilege escalations
> don't work if you're in a sufficiently restrictive mount namespace.
>
> For my own use, I plan on adding only CAP_NET_BIND_SERVICE and
> CAP_NET_RAW to pA, and I'll be layering seccomp on top to the extent
> possible.

Right, keeping software authors aware of the fact that their efforts
for attack surface reducing may need additional confinement beyond
just the capability reduction.

-Kees

>
> --Andy
>
>>
>> -Kees
>>
>> [1] https://forums.grsecurity.net/viewtopic.php?f=7&t=2522
>>
>> --
>> Kees Cook
>> Chrome OS Security
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 21:09             ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-14 21:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>>
>> > It's to preserve the invariant that pA is always a subset of pI.
>>
>> But since a user can always raise a bit in pI if it is present in pP,
>> what does this invariant add to your model other than inconvenience?
>
> The useful part is that dropping a bit from pI also drops it from pA.
> To keep the model consistent, I also require that you add the bit to
> pI before adding it to pA.

So you are saying that pA is always a strict subset of pI (and pP)?
Then why not explicitly implement it as:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (fP & X) | (pI & [fI | (pA' & pP)] )

As it is you have so distributed these constraints that it is hard to
be sure it will remain that way.

>> >> I'm also unclear how you can turn off this new 'feature' for a process
>> >> tree? As it is, the code creates an exploit path for a capable (pP !=
>> >> 0) program with an exploitable flaw to create a privilege escalation
>> >> for an arbitrary child program.
>> >
>> > Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
>> > injects shellcode that does system("/bin/bash") into some pP != 0
>> > program, they don't actually elevate their privileges.  On the other
>> > hand, by the time an attacker injected shellcode for:
>> >
>> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
>> > system("/bin/bash");
>>
>> Let's call the above two lines [a] and [b]. With this patch, you are
>> encouraging folk to write programs that contain a line like [a]
>> already. So, yes, I am saying that you are creating an exploitable
>> path in these programs that says if someone can inject
>> system("/bin/bash") into these programs they can get a new (because of
>> this patch) privilege escalation.
>>
>> In the prevailing model, this kind of privilege escalation (resulting
>> from naive inheritance) is designed out. I recognize that you want to
>> add it back in, but I am concerned that you are not allowing for the
>> possibility that some folk might still want to still be able to
>> prevent it.
>
> If you have a program that deliberately uses PR_CAP_AMBIENT, then
> setting such a securebit will break the program, so it still doesn't
> buy you anything.

Not if you make the bit lockable (like the other bits). If you want to
run with your model in effect, you lock the enable bit on.

>> > into a target, they can already do whatever they want.
>> >
>> >> While I understand that everyone
>> >> 'knows what they are doing' in implementing this change, I'm convinced
>> >> that folk that are up to no good also do... Why not provide a lockable
>> >> secure bit to selectively disable this support?
>> >
>> > Show me a legitimate use case and I'll gladly implement a secure bit.
>>
>> Thanks. I was kind of hoping that you would add a lockable secure bit
>> that defaults this support to off, but can be used to turn it on with
>> or without a lock. That way, you can avoid disturbing the legacy
>> defaults (no surprises).
>
> I think this thing needs to default on to be useful.
>
>>
>> > In the mean time, I don't even believe that there's a legitimate use
>> > for any of the other secure bits (except keepcaps, and I don't know
>> > why that's a securebit in the first place).
>>
>> Those bits currently make it possible to run a subsystem with no
>> [set]uid-0 support in its process tree.
>
> Not usefully, because even with all the securebits set to their
> non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.

Not sure I follow. They work for a definition if inheritable that you
seem to refuse to accept.

>> > In the mean time, see CVE-2014-3215 for an example of why securebits
>> > are probably more trouble than they're worth.
>>
>> I think it is safe to say that naive privilege inheritance has a fair
>> track record of being exploited orders of magnitude more frequently
>> than this. After all, these are the reasons LD_PRELOAD and shell
>> script setuid bits are suppressed.
>
> I don't know what you mean here by naive privilege inheritance.  The
> examples you're taking about aren't inheritance at all; they're
> exploring privilege *grants* during execve.  My patch deliberately
> leaves grants like that alone.

The pI set is inherited through this exec unmolested. You are adding a
pA set that is passed through exec somewhat unmolested but allows all
descendants to have privilege.

- Naive inheritance is 'if you had a privilege pre-exec any descendant
gets it post-exec'. This is what your pA achieves.

- Inheritance, as it is currently implemented, is implemented as a
handshake between a privileged ancestor (who added a pI bit) and a
descendant that is ready for it (has a fI bit to match).
  - the un-naive aspect (if you prefer 'selective' inheritance) of
this is that only binaries with file capabilities can ever wield
privilege, and while it can be suppressed by an intermediate
descendant the grant of inherited privilege can skip generations.

I think we can agree that the only place that inheritance can occur is
through an exec. So 'grants' through exec is sort of a semantic bit of
smoke masquerading as an argument. Even in the naive inheritance case
you have to explicitly grant some privilege - you are just choosing
different rules.

My Nack remains that you are eliminating the explicit enforcement of
selective inheritance. A lockable secure bit protecting access to your
prctl() function would address this concern.

Thanks

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 21:09             ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-14 21:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>
>> > It's to preserve the invariant that pA is always a subset of pI.
>>
>> But since a user can always raise a bit in pI if it is present in pP,
>> what does this invariant add to your model other than inconvenience?
>
> The useful part is that dropping a bit from pI also drops it from pA.
> To keep the model consistent, I also require that you add the bit to
> pI before adding it to pA.

So you are saying that pA is always a strict subset of pI (and pP)?
Then why not explicitly implement it as:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (fP & X) | (pI & [fI | (pA' & pP)] )

As it is you have so distributed these constraints that it is hard to
be sure it will remain that way.

>> >> I'm also unclear how you can turn off this new 'feature' for a process
>> >> tree? As it is, the code creates an exploit path for a capable (pP !=
>> >> 0) program with an exploitable flaw to create a privilege escalation
>> >> for an arbitrary child program.
>> >
>> > Huh?  If you exploit the parent, you already win.  Yes, if a kiddie
>> > injects shellcode that does system("/bin/bash") into some pP != 0
>> > program, they don't actually elevate their privileges.  On the other
>> > hand, by the time an attacker injected shellcode for:
>> >
>> > prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, CAP_SYS_ADMIN);
>> > system("/bin/bash");
>>
>> Let's call the above two lines [a] and [b]. With this patch, you are
>> encouraging folk to write programs that contain a line like [a]
>> already. So, yes, I am saying that you are creating an exploitable
>> path in these programs that says if someone can inject
>> system("/bin/bash") into these programs they can get a new (because of
>> this patch) privilege escalation.
>>
>> In the prevailing model, this kind of privilege escalation (resulting
>> from naive inheritance) is designed out. I recognize that you want to
>> add it back in, but I am concerned that you are not allowing for the
>> possibility that some folk might still want to still be able to
>> prevent it.
>
> If you have a program that deliberately uses PR_CAP_AMBIENT, then
> setting such a securebit will break the program, so it still doesn't
> buy you anything.

Not if you make the bit lockable (like the other bits). If you want to
run with your model in effect, you lock the enable bit on.

>> > into a target, they can already do whatever they want.
>> >
>> >> While I understand that everyone
>> >> 'knows what they are doing' in implementing this change, I'm convinced
>> >> that folk that are up to no good also do... Why not provide a lockable
>> >> secure bit to selectively disable this support?
>> >
>> > Show me a legitimate use case and I'll gladly implement a secure bit.
>>
>> Thanks. I was kind of hoping that you would add a lockable secure bit
>> that defaults this support to off, but can be used to turn it on with
>> or without a lock. That way, you can avoid disturbing the legacy
>> defaults (no surprises).
>
> I think this thing needs to default on to be useful.
>
>>
>> > In the mean time, I don't even believe that there's a legitimate use
>> > for any of the other secure bits (except keepcaps, and I don't know
>> > why that's a securebit in the first place).
>>
>> Those bits currently make it possible to run a subsystem with no
>> [set]uid-0 support in its process tree.
>
> Not usefully, because even with all the securebits set to their
> non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.

Not sure I follow. They work for a definition if inheritable that you
seem to refuse to accept.

>> > In the mean time, see CVE-2014-3215 for an example of why securebits
>> > are probably more trouble than they're worth.
>>
>> I think it is safe to say that naive privilege inheritance has a fair
>> track record of being exploited orders of magnitude more frequently
>> than this. After all, these are the reasons LD_PRELOAD and shell
>> script setuid bits are suppressed.
>
> I don't know what you mean here by naive privilege inheritance.  The
> examples you're taking about aren't inheritance at all; they're
> exploring privilege *grants* during execve.  My patch deliberately
> leaves grants like that alone.

The pI set is inherited through this exec unmolested. You are adding a
pA set that is passed through exec somewhat unmolested but allows all
descendants to have privilege.

- Naive inheritance is 'if you had a privilege pre-exec any descendant
gets it post-exec'. This is what your pA achieves.

- Inheritance, as it is currently implemented, is implemented as a
handshake between a privileged ancestor (who added a pI bit) and a
descendant that is ready for it (has a fI bit to match).
  - the un-naive aspect (if you prefer 'selective' inheritance) of
this is that only binaries with file capabilities can ever wield
privilege, and while it can be suppressed by an intermediate
descendant the grant of inherited privilege can skip generations.

I think we can agree that the only place that inheritance can occur is
through an exec. So 'grants' through exec is sort of a semantic bit of
smoke masquerading as an argument. Even in the naive inheritance case
you have to explicitly grant some privilege - you are just choosing
different rules.

My Nack remains that you are eliminating the explicit enforcement of
selective inheritance. A lockable secure bit protecting access to your
prctl() function would address this concern.

Thanks

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-14 21:09             ` Andrew G. Morgan
  (?)
@ 2015-03-14 21:45             ` Andy Lutomirski
  2015-03-14 22:17                 ` Andrew G. Morgan
  -1 siblings, 1 reply; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-14 21:45 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>>>
>>> > It's to preserve the invariant that pA is always a subset of pI.
>>>
>>> But since a user can always raise a bit in pI if it is present in pP,
>>> what does this invariant add to your model other than inconvenience?
>>
>> The useful part is that dropping a bit from pI also drops it from pA.
>> To keep the model consistent, I also require that you add the bit to
>> pI before adding it to pA.
>
> So you are saying that pA is always a strict subset of pI (and pP)?
> Then why not explicitly implement it as:
>
>   pA' = (file caps or setuid or setgid ? 0 : pA)
>   pP' = (fP & X) | (pI & [fI | (pA' & pP)] )
>
> As it is you have so distributed these constraints that it is hard to
> be sure it will remain that way.

That would be insecure.  If an attacker had pA = CAP_SYS_ADMIN, pI =
0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's
there's some other protection implemented, they could run some setuid
program, and that program could switch back to non-root, set pI = 0,
and call execve.  Unexpectedly, CAP_SYS_ADMIN would be inherited.

So I made the invariant explicit and added an assertion.

>>
>> If you have a program that deliberately uses PR_CAP_AMBIENT, then
>> setting such a securebit will break the program, so it still doesn't
>> buy you anything.
>
> Not if you make the bit lockable (like the other bits). If you want to
> run with your model in effect, you lock the enable bit on.

I don't see the point.  Again, this should be the default.


>>
>>>
>>> > In the mean time, I don't even believe that there's a legitimate use
>>> > for any of the other secure bits (except keepcaps, and I don't know
>>> > why that's a securebit in the first place).
>>>
>>> Those bits currently make it possible to run a subsystem with no
>>> [set]uid-0 support in its process tree.
>>
>> Not usefully, because even with all the securebits set to their
>> non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.
>
> Not sure I follow. They work for a definition if inheritable that you
> seem to refuse to accept.

I, and everyone I know who's tried to use inheritable capabilities,
has run into the near-complete uselessness of the current model.  I
understand that a defunct POSIX draft specified it, but it's still
nearly useless.

You've objected to changing it, but you've never directly addressed
any of the reasons why Christoph, Google, and I all believe that we
can't usefully use it.

>>> I think it is safe to say that naive privilege inheritance has a fair
>>> track record of being exploited orders of magnitude more frequently
>>> than this. After all, these are the reasons LD_PRELOAD and shell
>>> script setuid bits are suppressed.
>>
>> I don't know what you mean here by naive privilege inheritance.  The
>> examples you're taking about aren't inheritance at all; they're
>> exploring privilege *grants* during execve.  My patch deliberately
>> leaves grants like that alone.
>
> The pI set is inherited through this exec unmolested.

This is flat-out useless.  Having pI = CAP_NET_BIND_SERVICE doesn't
let me bind low-numbered ports, full stop.

> My Nack remains that you are eliminating the explicit enforcement of
> selective inheritance. A lockable secure bit protecting access to your
> prctl() function would address this concern.

Would a sysctl or securebit that *optionally* allows pA to be disabled
satisfy you?

I don't understand why lockable is at all useful.  You'd need
CAP_SETPCAP to flip it regardless.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:17                 ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-14 22:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
>> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>>>>
>>>> > It's to preserve the invariant that pA is always a subset of pI.
>>>>
>>>> But since a user can always raise a bit in pI if it is present in pP,
>>>> what does this invariant add to your model other than inconvenience?
>>>
>>> The useful part is that dropping a bit from pI also drops it from pA.
>>> To keep the model consistent, I also require that you add the bit to
>>> pI before adding it to pA.
>>
>> So you are saying that pA is always a strict subset of pI (and pP)?
>> Then why not explicitly implement it as:
>>
>>   pA' = (file caps or setuid or setgid ? 0 : pA)
>>   pP' = (fP & X) | (pI & [fI | (pA' & pP)] )
>>
>> As it is you have so distributed these constraints that it is hard to
>> be sure it will remain that way.
>
> That would be insecure.  If an attacker had pA = CAP_SYS_ADMIN, pI =
> 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's
> there's some other protection implemented, they could run some setuid
> program, and that program could switch back to non-root, set pI = 0,
> and call execve.  Unexpectedly, CAP_SYS_ADMIN would be inherited.

Forgive me, but which bit of

   pI & [fI | (pA' & pP)]

with pI = 0 makes that so?

> So I made the invariant explicit and added an assertion.
>
>>>
>>> If you have a program that deliberately uses PR_CAP_AMBIENT, then
>>> setting such a securebit will break the program, so it still doesn't
>>> buy you anything.
>>
>> Not if you make the bit lockable (like the other bits). If you want to
>> run with your model in effect, you lock the enable bit on.
>
> I don't see the point.  Again, this should be the default.
>
>
>>>
>>>>
>>>> > In the mean time, I don't even believe that there's a legitimate use
>>>> > for any of the other secure bits (except keepcaps, and I don't know
>>>> > why that's a securebit in the first place).
>>>>
>>>> Those bits currently make it possible to run a subsystem with no
>>>> [set]uid-0 support in its process tree.
>>>
>>> Not usefully, because even with all the securebits set to their
>>> non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.
>>
>> Not sure I follow. They work for a definition if inheritable that you
>> seem to refuse to accept.
>
> I, and everyone I know who's tried to use inheritable capabilities,
> has run into the near-complete uselessness of the current model.  I
> understand that a defunct POSIX draft specified it, but it's still
> nearly useless.
>
> You've objected to changing it, but you've never directly addressed

I've repeatedly said I am not a fan of naive inheritance. I've not
objected to changing it, I've objected to mandating it be changed. I
have repeatedly suggested ways to conditionalize this feature
addition.

> any of the reasons why Christoph, Google, and I all believe that we
> can't usefully use it.

Working for Google, myself, I sort of find that a curious generalization.

>>>> I think it is safe to say that naive privilege inheritance has a fair
>>>> track record of being exploited orders of magnitude more frequently
>>>> than this. After all, these are the reasons LD_PRELOAD and shell
>>>> script setuid bits are suppressed.
>>>
>>> I don't know what you mean here by naive privilege inheritance.  The
>>> examples you're taking about aren't inheritance at all; they're
>>> exploring privilege *grants* during execve.  My patch deliberately
>>> leaves grants like that alone.
>>
>> The pI set is inherited through this exec unmolested.
>
> This is flat-out useless.  Having pI = CAP_NET_BIND_SERVICE doesn't
> let me bind low-numbered ports, full stop.

Even in your proposed model, neither pI nor pA does this. It is what
is in pE that counts.

>> My Nack remains that you are eliminating the explicit enforcement of
>> selective inheritance. A lockable secure bit protecting access to your
>> prctl() function would address this concern.
>
> Would a sysctl or securebit that *optionally* allows pA to be disabled
> satisfy you?
>
> I don't understand why lockable is at all useful.  You'd need
> CAP_SETPCAP to flip it regardless.

Because it means one can create process trees in which this behavior
is guaranteed to be allowed and/or disallowed.

Cheers

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:17                 ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-14 22:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>>>
>>>> > It's to preserve the invariant that pA is always a subset of pI.
>>>>
>>>> But since a user can always raise a bit in pI if it is present in pP,
>>>> what does this invariant add to your model other than inconvenience?
>>>
>>> The useful part is that dropping a bit from pI also drops it from pA.
>>> To keep the model consistent, I also require that you add the bit to
>>> pI before adding it to pA.
>>
>> So you are saying that pA is always a strict subset of pI (and pP)?
>> Then why not explicitly implement it as:
>>
>>   pA' = (file caps or setuid or setgid ? 0 : pA)
>>   pP' = (fP & X) | (pI & [fI | (pA' & pP)] )
>>
>> As it is you have so distributed these constraints that it is hard to
>> be sure it will remain that way.
>
> That would be insecure.  If an attacker had pA = CAP_SYS_ADMIN, pI =
> 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's
> there's some other protection implemented, they could run some setuid
> program, and that program could switch back to non-root, set pI = 0,
> and call execve.  Unexpectedly, CAP_SYS_ADMIN would be inherited.

Forgive me, but which bit of

   pI & [fI | (pA' & pP)]

with pI = 0 makes that so?

> So I made the invariant explicit and added an assertion.
>
>>>
>>> If you have a program that deliberately uses PR_CAP_AMBIENT, then
>>> setting such a securebit will break the program, so it still doesn't
>>> buy you anything.
>>
>> Not if you make the bit lockable (like the other bits). If you want to
>> run with your model in effect, you lock the enable bit on.
>
> I don't see the point.  Again, this should be the default.
>
>
>>>
>>>>
>>>> > In the mean time, I don't even believe that there's a legitimate use
>>>> > for any of the other secure bits (except keepcaps, and I don't know
>>>> > why that's a securebit in the first place).
>>>>
>>>> Those bits currently make it possible to run a subsystem with no
>>>> [set]uid-0 support in its process tree.
>>>
>>> Not usefully, because even with all the securebits set to their
>>> non-legacy modes, caps don't inherit, so it doesn't work.  I've tried.
>>
>> Not sure I follow. They work for a definition if inheritable that you
>> seem to refuse to accept.
>
> I, and everyone I know who's tried to use inheritable capabilities,
> has run into the near-complete uselessness of the current model.  I
> understand that a defunct POSIX draft specified it, but it's still
> nearly useless.
>
> You've objected to changing it, but you've never directly addressed

I've repeatedly said I am not a fan of naive inheritance. I've not
objected to changing it, I've objected to mandating it be changed. I
have repeatedly suggested ways to conditionalize this feature
addition.

> any of the reasons why Christoph, Google, and I all believe that we
> can't usefully use it.

Working for Google, myself, I sort of find that a curious generalization.

>>>> I think it is safe to say that naive privilege inheritance has a fair
>>>> track record of being exploited orders of magnitude more frequently
>>>> than this. After all, these are the reasons LD_PRELOAD and shell
>>>> script setuid bits are suppressed.
>>>
>>> I don't know what you mean here by naive privilege inheritance.  The
>>> examples you're taking about aren't inheritance at all; they're
>>> exploring privilege *grants* during execve.  My patch deliberately
>>> leaves grants like that alone.
>>
>> The pI set is inherited through this exec unmolested.
>
> This is flat-out useless.  Having pI = CAP_NET_BIND_SERVICE doesn't
> let me bind low-numbered ports, full stop.

Even in your proposed model, neither pI nor pA does this. It is what
is in pE that counts.

>> My Nack remains that you are eliminating the explicit enforcement of
>> selective inheritance. A lockable secure bit protecting access to your
>> prctl() function would address this concern.
>
> Would a sysctl or securebit that *optionally* allows pA to be disabled
> satisfy you?
>
> I don't understand why lockable is at all useful.  You'd need
> CAP_SETPCAP to flip it regardless.

Because it means one can create process trees in which this behavior
is guaranteed to be allowed and/or disallowed.

Cheers

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:53                   ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-14 22:53 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
>>> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan@kernel.org> wrote:
>>>>>
>>>>> > It's to preserve the invariant that pA is always a subset of pI.
>>>>>
>>>>> But since a user can always raise a bit in pI if it is present in pP,
>>>>> what does this invariant add to your model other than inconvenience?
>>>>
>>>> The useful part is that dropping a bit from pI also drops it from pA.
>>>> To keep the model consistent, I also require that you add the bit to
>>>> pI before adding it to pA.
>>>
>>> So you are saying that pA is always a strict subset of pI (and pP)?
>>> Then why not explicitly implement it as:
>>>
>>>   pA' = (file caps or setuid or setgid ? 0 : pA)
>>>   pP' = (fP & X) | (pI & [fI | (pA' & pP)] )
>>>
>>> As it is you have so distributed these constraints that it is hard to
>>> be sure it will remain that way.
>>
>> That would be insecure.  If an attacker had pA = CAP_SYS_ADMIN, pI =
>> 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's
>> there's some other protection implemented, they could run some setuid
>> program, and that program could switch back to non-root, set pI = 0,
>> and call execve.  Unexpectedly, CAP_SYS_ADMIN would be inherited.
>
> Forgive me, but which bit of
>
>    pI & [fI | (pA' & pP)]
>
> with pI = 0 makes that so?

Oh, I misread that.

I think it's already unnecessarily confusing that you can inherit
nonzero pI without inheriting the corresponding bits in pP, and I
don't want to add yet more degrees of freedom in non-permitted caps.

>>>>
>>>> I don't know what you mean here by naive privilege inheritance.  The
>>>> examples you're taking about aren't inheritance at all; they're
>>>> exploring privilege *grants* during execve.  My patch deliberately
>>>> leaves grants like that alone.
>>>
>>> The pI set is inherited through this exec unmolested.
>>
>> This is flat-out useless.  Having pI = CAP_NET_BIND_SERVICE doesn't
>> let me bind low-numbered ports, full stop.
>
> Even in your proposed model, neither pI nor pA does this. It is what
> is in pE that counts.

Let me state it more precisely.  If your parent puts
CAP_NET_BIND_SERVICE in pI and execs you, you can't bind low-numbered
ports.  If your parent puts CAP_NET_BIND_SERVICE in pA, you can.

>
>>> My Nack remains that you are eliminating the explicit enforcement of
>>> selective inheritance. A lockable secure bit protecting access to your
>>> prctl() function would address this concern.
>>
>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>> satisfy you?
>>
>> I don't understand why lockable is at all useful.  You'd need
>> CAP_SETPCAP to flip it regardless.
>
> Because it means one can create process trees in which this behavior
> is guaranteed to be allowed and/or disallowed.
>

I don't see why guaranteeing that it's allowed is particularly useful.
I also don't see why any of the securebits lock bits are useful.  I
don't specifically object; I just don't see the point.  If you give a
subtree CAP_SETPCAP but you don't trust them enough to leave
securebits unlocked, then I think your threat model is confused.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:53                   ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-14 22:53 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>> On Fri, Mar 13, 2015 at 10:57 AM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>>> On Mar 13, 2015 6:24 AM, "Andrew G. Morgan" <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>>>>
>>>>> > It's to preserve the invariant that pA is always a subset of pI.
>>>>>
>>>>> But since a user can always raise a bit in pI if it is present in pP,
>>>>> what does this invariant add to your model other than inconvenience?
>>>>
>>>> The useful part is that dropping a bit from pI also drops it from pA.
>>>> To keep the model consistent, I also require that you add the bit to
>>>> pI before adding it to pA.
>>>
>>> So you are saying that pA is always a strict subset of pI (and pP)?
>>> Then why not explicitly implement it as:
>>>
>>>   pA' = (file caps or setuid or setgid ? 0 : pA)
>>>   pP' = (fP & X) | (pI & [fI | (pA' & pP)] )
>>>
>>> As it is you have so distributed these constraints that it is hard to
>>> be sure it will remain that way.
>>
>> That would be insecure.  If an attacker had pA = CAP_SYS_ADMIN, pI =
>> 0, pP = 0 (i.e. no privs but pA is set somehow) then, unless that's
>> there's some other protection implemented, they could run some setuid
>> program, and that program could switch back to non-root, set pI = 0,
>> and call execve.  Unexpectedly, CAP_SYS_ADMIN would be inherited.
>
> Forgive me, but which bit of
>
>    pI & [fI | (pA' & pP)]
>
> with pI = 0 makes that so?

Oh, I misread that.

I think it's already unnecessarily confusing that you can inherit
nonzero pI without inheriting the corresponding bits in pP, and I
don't want to add yet more degrees of freedom in non-permitted caps.

>>>>
>>>> I don't know what you mean here by naive privilege inheritance.  The
>>>> examples you're taking about aren't inheritance at all; they're
>>>> exploring privilege *grants* during execve.  My patch deliberately
>>>> leaves grants like that alone.
>>>
>>> The pI set is inherited through this exec unmolested.
>>
>> This is flat-out useless.  Having pI = CAP_NET_BIND_SERVICE doesn't
>> let me bind low-numbered ports, full stop.
>
> Even in your proposed model, neither pI nor pA does this. It is what
> is in pE that counts.

Let me state it more precisely.  If your parent puts
CAP_NET_BIND_SERVICE in pI and execs you, you can't bind low-numbered
ports.  If your parent puts CAP_NET_BIND_SERVICE in pA, you can.

>
>>> My Nack remains that you are eliminating the explicit enforcement of
>>> selective inheritance. A lockable secure bit protecting access to your
>>> prctl() function would address this concern.
>>
>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>> satisfy you?
>>
>> I don't understand why lockable is at all useful.  You'd need
>> CAP_SETPCAP to flip it regardless.
>
> Because it means one can create process trees in which this behavior
> is guaranteed to be allowed and/or disallowed.
>

I don't see why guaranteeing that it's allowed is particularly useful.
I also don't see why any of the securebits lock bits are useful.  I
don't specifically object; I just don't see the point.  If you give a
subtree CAP_SETPCAP but you don't trust them enough to leave
securebits unlocked, then I think your threat model is confused.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:55                   ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-14 22:55 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

It occurs to me that my previous reply was unnecessarily long and
missed the point.  Trying again:

On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
>>> My Nack remains that you are eliminating the explicit enforcement of
>>> selective inheritance. A lockable secure bit protecting access to your
>>> prctl() function would address this concern.
>>
>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>> satisfy you?

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It would be kind of nice to remove your nack.  I think that the above
is the relevant question.  Could you answer it?

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-14 22:55                   ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-14 22:55 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

It occurs to me that my previous reply was unnecessarily long and
missed the point.  Trying again:

On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>> My Nack remains that you are eliminating the explicit enforcement of
>>> selective inheritance. A lockable secure bit protecting access to your
>>> prctl() function would address this concern.
>>
>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>> satisfy you?

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It would be kind of nice to remove your nack.  I think that the above
is the relevant question.  Could you answer it?

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-15  0:04                     ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-15  0:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 3:55 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> It occurs to me that my previous reply was unnecessarily long and
> missed the point.  Trying again:
>
> On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
>> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan@kernel.org> wrote:
>>>> My Nack remains that you are eliminating the explicit enforcement of
>>>> selective inheritance. A lockable secure bit protecting access to your
>>>> prctl() function would address this concern.
>>>
>>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>>> satisfy you?
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> It would be kind of nice to remove your nack.  I think that the above
> is the relevant question.  Could you answer it?

I thought I did. Please implement a lockable secure bit and I will
remove my Nack.

I don't understand your confusion. Perhaps you are defining
*optionally* in a way I don't follow? Perhaps you are saying that you
want the default behavior of kernels to change to include your new
feature, and that you want the secure bit, when set, would put it into
pA suppressed mode?

Other folk are probably better at anticipating the ramifications of
changing default behavior. I'd prefer you default to pA being off, but
I shall remove my Nack either way if you implement a lockable secure
bit.

Clear enough?

Thanks

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-15  0:04                     ` Andrew G. Morgan
  0 siblings, 0 replies; 55+ messages in thread
From: Andrew G. Morgan @ 2015-03-15  0:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew Lutomirski, Andrew Morton,
	Michael Kerrisk, Mimi Zohar, Linux API, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Christoph Lameter, LKML,
	Serge Hallyn, Markku Savela, Kees Cook, Jonathan Corbet

On Sat, Mar 14, 2015 at 3:55 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> It occurs to me that my previous reply was unnecessarily long and
> missed the point.  Trying again:
>
> On Sat, Mar 14, 2015 at 3:17 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>> On Sat, Mar 14, 2015 at 2:45 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>> On Sat, Mar 14, 2015 at 2:09 PM, Andrew G. Morgan <morgan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>>>> My Nack remains that you are eliminating the explicit enforcement of
>>>> selective inheritance. A lockable secure bit protecting access to your
>>>> prctl() function would address this concern.
>>>
>>> Would a sysctl or securebit that *optionally* allows pA to be disabled
>>> satisfy you?
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> It would be kind of nice to remove your nack.  I think that the above
> is the relevant question.  Could you answer it?

I thought I did. Please implement a lockable secure bit and I will
remove my Nack.

I don't understand your confusion. Perhaps you are defining
*optionally* in a way I don't follow? Perhaps you are saying that you
want the default behavior of kernels to change to include your new
feature, and that you want the secure bit, when set, would put it into
pA suppressed mode?

Other folk are probably better at anticipating the ramifications of
changing default behavior. I'd prefer you default to pA being off, but
I shall remove my Nack either way if you implement a lockable secure
bit.

Clear enough?

Thanks

Andrew

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-03-15  0:04                     ` Andrew G. Morgan
  (?)
@ 2015-03-30 12:55                     ` Christoph Lameter
  2015-03-30 14:31                         ` Andy Lutomirski
  -1 siblings, 1 reply; 55+ messages in thread
From: Christoph Lameter @ 2015-03-30 12:55 UTC (permalink / raw)
  To: Andrew G. Morgan
  Cc: Andy Lutomirski, Jarkko Sakkinen, Ted Ts'o,
	Andrew Lutomirski, Andrew Morton, Michael Kerrisk, Mimi Zohar,
	Linux API, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, LKML, Serge Hallyn, Markku Savela, Kees Cook,
	Jonathan Corbet

On Sat, 14 Mar 2015, Andrew G. Morgan wrote:

>
> I thought I did. Please implement a lockable secure bit and I will

Would this suffice? It puts the CAP_SETPCAP limitation back to how it
was in my earlier patch.



Subject: ambient caps: Allow disabling with SETPCAP

Do not allow setting ambient caps if CAP_SETPCAP is not set.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/security/commoncap.c
===================================================================
--- linux.orig/security/commoncap.c
+++ linux/security/commoncap.c
@@ -962,6 +962,9 @@ int cap_task_prctl(int option, unsigned
 		if (((!cap_valid(arg3)) | arg4 | arg5))
 			return -EINVAL;

+		if (!ns_capable(current_user_ns(), CAP_SETPCAP))
+			return -EPERM;
+
 		if (arg2 == PR_CAP_AMBIENT_GET) {
 			return !!cap_raised(current_cred()->cap_ambient, arg3);
 		} else if (arg2 != PR_CAP_AMBIENT_RAISE &&

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-30 14:31                         ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-30 14:31 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mar 30, 2015 7:55 AM, "Christoph Lameter" <cl@linux.com> wrote:
>
> On Sat, 14 Mar 2015, Andrew G. Morgan wrote:
>
> >
> > I thought I did. Please implement a lockable secure bit and I will
>
> Would this suffice? It puts the CAP_SETPCAP limitation back to how it
> was in my earlier patch.
>

I really don't like that variant.  CAP_SETPCAP is dangerous and so
absurdly powerful that people really shouldn't hand it out.

I'll submit a new version this week with the securebits.  Sorry for the delay.

--Andy

>
>
> Subject: ambient caps: Allow disabling with SETPCAP
>
> Do not allow setting ambient caps if CAP_SETPCAP is not set.
>
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
> Index: linux/security/commoncap.c
> ===================================================================
> --- linux.orig/security/commoncap.c
> +++ linux/security/commoncap.c
> @@ -962,6 +962,9 @@ int cap_task_prctl(int option, unsigned
>                 if (((!cap_valid(arg3)) | arg4 | arg5))
>                         return -EINVAL;
>
> +               if (!ns_capable(current_user_ns(), CAP_SETPCAP))
> +                       return -EPERM;
> +
>                 if (arg2 == PR_CAP_AMBIENT_GET) {
>                         return !!cap_raised(current_cred()->cap_ambient, arg3);
>                 } else if (arg2 != PR_CAP_AMBIENT_RAISE &&

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-30 14:31                         ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-03-30 14:31 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mar 30, 2015 7:55 AM, "Christoph Lameter" <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
>
> On Sat, 14 Mar 2015, Andrew G. Morgan wrote:
>
> >
> > I thought I did. Please implement a lockable secure bit and I will
>
> Would this suffice? It puts the CAP_SETPCAP limitation back to how it
> was in my earlier patch.
>

I really don't like that variant.  CAP_SETPCAP is dangerous and so
absurdly powerful that people really shouldn't hand it out.

I'll submit a new version this week with the securebits.  Sorry for the delay.

--Andy

>
>
> Subject: ambient caps: Allow disabling with SETPCAP
>
> Do not allow setting ambient caps if CAP_SETPCAP is not set.
>
> Signed-off-by: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
>
> Index: linux/security/commoncap.c
> ===================================================================
> --- linux.orig/security/commoncap.c
> +++ linux/security/commoncap.c
> @@ -962,6 +962,9 @@ int cap_task_prctl(int option, unsigned
>                 if (((!cap_valid(arg3)) | arg4 | arg5))
>                         return -EINVAL;
>
> +               if (!ns_capable(current_user_ns(), CAP_SETPCAP))
> +                       return -EPERM;
> +
>                 if (arg2 == PR_CAP_AMBIENT_GET) {
>                         return !!cap_raised(current_cred()->cap_ambient, arg3);
>                 } else if (arg2 != PR_CAP_AMBIENT_RAISE &&

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-30 15:05                           ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-03-30 15:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mon, 30 Mar 2015, Andy Lutomirski wrote:

> > Would this suffice? It puts the CAP_SETPCAP limitation back to how it
> > was in my earlier patch.
> I really don't like that variant.  CAP_SETPCAP is dangerous and so
> absurdly powerful that people really shouldn't hand it out.

According to

	man 7 capabilities

CAP_SETPCAP is required to setup securebits.

This hides the functionality behind yet another stage of security and
obscures this ability somewhat more?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-03-30 15:05                           ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-03-30 15:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mon, 30 Mar 2015, Andy Lutomirski wrote:

> > Would this suffice? It puts the CAP_SETPCAP limitation back to how it
> > was in my earlier patch.
> I really don't like that variant.  CAP_SETPCAP is dangerous and so
> absurdly powerful that people really shouldn't hand it out.

According to

	man 7 capabilities

CAP_SETPCAP is required to setup securebits.

This hides the functionality behind yet another stage of security and
obscures this ability somewhat more?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-09 15:25                           ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-09 15:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mon, 30 Mar 2015, Andy Lutomirski wrote:

> I'll submit a new version this week with the securebits.  Sorry for the delay.

Are we going to get a new version?


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-09 15:25                           ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-09 15:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Mon, 30 Mar 2015, Andy Lutomirski wrote:

> I'll submit a new version this week with the securebits.  Sorry for the delay.

Are we going to get a new version?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-23 14:01                             ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-23 14:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Thu, 9 Apr 2015, Christoph Lameter wrote:

> > I'll submit a new version this week with the securebits.  Sorry for the delay.
 > Are we going to get a new version?

Replying to my own here. Cant we simply use the SETPCAP approach as per
the patch I posted?


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-23 14:01                             ` Christoph Lameter
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-23 14:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o, Andrew Morton,
	Andrew G. Morgan, Linux API, Mimi Zohar, Michael Kerrisk,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Serge Hallyn, LKML, Markku Savela, Kees Cook, Jonathan Corbet

On Thu, 9 Apr 2015, Christoph Lameter wrote:

> > I'll submit a new version this week with the securebits.  Sorry for the delay.
 > Are we going to get a new version?

Replying to my own here. Cant we simply use the SETPCAP approach as per
the patch I posted?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 17:53                               ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-24 17:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andy Lutomirski, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

Quoting Christoph Lameter (cl@linux.com):
> On Thu, 9 Apr 2015, Christoph Lameter wrote:
> 
> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
>  > Are we going to get a new version?
> 
> Replying to my own here. Cant we simply use the SETPCAP approach as per
> the patch I posted?

Andy had objections to that, but it seems ok to me.

-serge

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 17:53                               ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-24 17:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andy Lutomirski, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

Quoting Christoph Lameter (cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org):
> On Thu, 9 Apr 2015, Christoph Lameter wrote:
> 
> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
>  > Are we going to get a new version?
> 
> Replying to my own here. Cant we simply use the SETPCAP approach as per
> the patch I posted?

Andy had objections to that, but it seems ok to me.

-serge

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 17:53                               ` Serge Hallyn
  (?)
@ 2015-04-24 18:41                               ` Andy Lutomirski
  2015-04-24 19:09                                   ` Serge Hallyn
  -1 siblings, 1 reply; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 18:41 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> Quoting Christoph Lameter (cl@linux.com):
>> On Thu, 9 Apr 2015, Christoph Lameter wrote:
>>
>> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
>>  > Are we going to get a new version?
>>
>> Replying to my own here. Cant we simply use the SETPCAP approach as per
>> the patch I posted?
>
> Andy had objections to that, but it seems ok to me.
>

I object because CAP_SETPCAP is very powerful whereas
CAP_NET_BIND_SERVICE, for example, isn't.  I'm fine with having a
switch to turn off ambient caps, but requiring the "on" state to give
processes superpowers seems unfortunate.

Sorry for the huge delay.  I got caught up with travel and the merge
window.  Here's a sneak peek:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=cap_ambient

I need to write the user code to go with it and test it a bit before
sending it out for real.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 19:09                                   ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-24 19:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

Quoting Andy Lutomirski (luto@amacapital.net):
> On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> > Quoting Christoph Lameter (cl@linux.com):
> >> On Thu, 9 Apr 2015, Christoph Lameter wrote:
> >>
> >> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
> >>  > Are we going to get a new version?
> >>
> >> Replying to my own here. Cant we simply use the SETPCAP approach as per
> >> the patch I posted?
> >
> > Andy had objections to that, but it seems ok to me.
> >
> 
> I object because CAP_SETPCAP is very powerful whereas
> CAP_NET_BIND_SERVICE, for example, isn't.  I'm fine with having a
> switch to turn off ambient caps, but requiring the "on" state to give

Would only really be needed for the initial 'enable ambient caps for this
process tree', though.  Once that was set, add/remove'ing caps from the
ambient set wouldn't need to be required.

> processes superpowers seems unfortunate.
> 
> Sorry for the huge delay.  I got caught up with travel and the merge
> window.  Here's a sneak peek:
> 
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=cap_ambient
> 
> I need to write the user code to go with it and test it a bit before
> sending it out for real.

Ok, thanks

-serge

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 19:09                                   ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-24 19:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org):
> On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote:
> > Quoting Christoph Lameter (cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org):
> >> On Thu, 9 Apr 2015, Christoph Lameter wrote:
> >>
> >> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
> >>  > Are we going to get a new version?
> >>
> >> Replying to my own here. Cant we simply use the SETPCAP approach as per
> >> the patch I posted?
> >
> > Andy had objections to that, but it seems ok to me.
> >
> 
> I object because CAP_SETPCAP is very powerful whereas
> CAP_NET_BIND_SERVICE, for example, isn't.  I'm fine with having a
> switch to turn off ambient caps, but requiring the "on" state to give

Would only really be needed for the initial 'enable ambient caps for this
process tree', though.  Once that was set, add/remove'ing caps from the
ambient set wouldn't need to be required.

> processes superpowers seems unfortunate.
> 
> Sorry for the huge delay.  I got caught up with travel and the merge
> window.  Here's a sneak peek:
> 
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=cap_ambient
> 
> I need to write the user code to go with it and test it a bit before
> sending it out for real.

Ok, thanks

-serge

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 19:09                                   ` Serge Hallyn
  (?)
@ 2015-04-24 19:25                                   ` Christoph Lameter
  -1 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-24 19:25 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Andy Lutomirski, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

On Fri, 24 Apr 2015, Serge Hallyn wrote:

> > I object because CAP_SETPCAP is very powerful whereas
> > CAP_NET_BIND_SERVICE, for example, isn't.  I'm fine with having a
> > switch to turn off ambient caps, but requiring the "on" state to give
>
> Would only really be needed for the initial 'enable ambient caps for this
> process tree', though.  Once that was set, add/remove'ing caps from the
> ambient set wouldn't need to be required.

Exactly. Its much simpler than alternate approaches. And its convoluted
enough.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 19:09                                   ` Serge Hallyn
  (?)
  (?)
@ 2015-04-24 19:53                                   ` Andy Lutomirski
  2015-04-24 20:13                                     ` Christoph Lameter
  -1 siblings, 1 reply; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 19:53 UTC (permalink / raw)
  To: Serge Hallyn
  Cc: Christoph Lameter, Jarkko Sakkinen, Andrew Lutomirski,
	Ted Ts'o, Andrew Morton, Andrew G. Morgan, Linux API,
	Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

On Fri, Apr 24, 2015 at 12:09 PM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
> Quoting Andy Lutomirski (luto@amacapital.net):
>> On Fri, Apr 24, 2015 at 10:53 AM, Serge Hallyn <serge.hallyn@ubuntu.com> wrote:
>> > Quoting Christoph Lameter (cl@linux.com):
>> >> On Thu, 9 Apr 2015, Christoph Lameter wrote:
>> >>
>> >> > > I'll submit a new version this week with the securebits.  Sorry for the delay.
>> >>  > Are we going to get a new version?
>> >>
>> >> Replying to my own here. Cant we simply use the SETPCAP approach as per
>> >> the patch I posted?
>> >
>> > Andy had objections to that, but it seems ok to me.
>> >
>>
>> I object because CAP_SETPCAP is very powerful whereas
>> CAP_NET_BIND_SERVICE, for example, isn't.  I'm fine with having a
>> switch to turn off ambient caps, but requiring the "on" state to give
>
> Would only really be needed for the initial 'enable ambient caps for this
> process tree', though.  Once that was set, add/remove'ing caps from the
> ambient set wouldn't need to be required.

That's sort of what my patch does -- you need CAP_SETPCAP to switch
the securebit.

But Christoph's patch required it to add caps to the ambient set, right?

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 19:53                                   ` Andy Lutomirski
@ 2015-04-24 20:13                                     ` Christoph Lameter
  2015-04-24 20:18                                         ` Andy Lutomirski
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Lameter @ 2015-04-24 20:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o,
	Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar,
	Michael Kerrisk, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook,
	Jonathan Corbet

On Fri, 24 Apr 2015, Andy Lutomirski wrote:

> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> the securebit.
>
> But Christoph's patch required it to add caps to the ambient set, right?

Yes but you seem to be just adding one additional step without too much of
a benefit because you still need CAP_SETPCAP.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 20:18                                         ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 20:18 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o,
	Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar,
	Michael Kerrisk, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook,
	Jonathan Corbet

On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl@linux.com> wrote:
> On Fri, 24 Apr 2015, Andy Lutomirski wrote:
>
>> That's sort of what my patch does -- you need CAP_SETPCAP to switch
>> the securebit.
>>
>> But Christoph's patch required it to add caps to the ambient set, right?
>
> Yes but you seem to be just adding one additional step without too much of
> a benefit because you still need CAP_SETPCAP.
>

No, because I set the default to on :)

Also, in my model you can do:

$ sudo capset cap_whatever=eip something
$ ./something

and the program can make its cap be ambient and run a helper.  In the
CAP_SETPCAP model, that doesn't work.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 20:18                                         ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 20:18 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o,
	Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar,
	Michael Kerrisk, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook,
	Jonathan Corbet

On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 24 Apr 2015, Andy Lutomirski wrote:
>
>> That's sort of what my patch does -- you need CAP_SETPCAP to switch
>> the securebit.
>>
>> But Christoph's patch required it to add caps to the ambient set, right?
>
> Yes but you seem to be just adding one additional step without too much of
> a benefit because you still need CAP_SETPCAP.
>

No, because I set the default to on :)

Also, in my model you can do:

$ sudo capset cap_whatever=eip something
$ ./something

and the program can make its cap be ambient and run a helper.  In the
CAP_SETPCAP model, that doesn't work.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 20:18                                         ` Andy Lutomirski
  (?)
@ 2015-04-24 21:15                                         ` Serge E. Hallyn
  2015-04-24 22:55                                             ` Andy Lutomirski
  -1 siblings, 1 reply; 55+ messages in thread
From: Serge E. Hallyn @ 2015-04-24 21:15 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Christoph Lameter, Serge Hallyn, Jarkko Sakkinen,
	Andrew Lutomirski, Ted Ts'o, Andrew Morton, Andrew G. Morgan,
	Linux API, Mimi Zohar, Michael Kerrisk, Austin S Hemmelgarn,
	linux-security-module, Aaron Jones, Serge Hallyn, LKML,
	Markku Savela, Kees Cook, Jonathan Corbet

On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote:
> On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl@linux.com> wrote:
> > On Fri, 24 Apr 2015, Andy Lutomirski wrote:
> >
> >> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> >> the securebit.
> >>
> >> But Christoph's patch required it to add caps to the ambient set, right?
> >
> > Yes but you seem to be just adding one additional step without too much of
> > a benefit because you still need CAP_SETPCAP.
> >
> 
> No, because I set the default to on :)

Right - I definately prefer

. default off
. CAP_SETPCAP required to turn it on (for self and children)
. once on, anyone can copy bits from (whatever we decided) to pA.

> Also, in my model you can do:
> 
> $ sudo capset cap_whatever=eip something
> $ ./something
> 
> and the program can make its cap be ambient and run a helper.  In the
> CAP_SETPCAP model, that doesn't work.
> 
> --Andy
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 22:55                                             ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 22:55 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan, Andrew Morton,
	Serge Hallyn, Michael Kerrisk, Mimi Zohar, Linux API,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook,
	Jonathan Corbet

On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge@hallyn.com> wrote:
>
> On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote:
> > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl@linux.com> wrote:
> > > On Fri, 24 Apr 2015, Andy Lutomirski wrote:
> > >
> > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> > >> the securebit.
> > >>
> > >> But Christoph's patch required it to add caps to the ambient set, right?
> > >
> > > Yes but you seem to be just adding one additional step without too much of
> > > a benefit because you still need CAP_SETPCAP.
> > >
> >
> > No, because I set the default to on :)
>
> Right - I definately prefer
>
> . default off
> . CAP_SETPCAP required to turn it on (for self and children)
> . once on, anyone can copy bits from (whatever we decided) to pA.
>

Why default off?  If there's some weird reason that switching it on
could cause a security problem, then I'd agree, but I haven't spotted
a reason yet.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-24 22:55                                             ` Andy Lutomirski
  0 siblings, 0 replies; 55+ messages in thread
From: Andy Lutomirski @ 2015-04-24 22:55 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan, Andrew Morton,
	Serge Hallyn, Michael Kerrisk, Mimi Zohar, Linux API,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook,
	Jonathan Corbet

On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
>
> On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote:
> > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> > > On Fri, 24 Apr 2015, Andy Lutomirski wrote:
> > >
> > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> > >> the securebit.
> > >>
> > >> But Christoph's patch required it to add caps to the ambient set, right?
> > >
> > > Yes but you seem to be just adding one additional step without too much of
> > > a benefit because you still need CAP_SETPCAP.
> > >
> >
> > No, because I set the default to on :)
>
> Right - I definately prefer
>
> . default off
> . CAP_SETPCAP required to turn it on (for self and children)
> . once on, anyone can copy bits from (whatever we decided) to pA.
>

Why default off?  If there's some weird reason that switching it on
could cause a security problem, then I'd agree, but I haven't spotted
a reason yet.

--Andy

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-25  2:45                                               ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-25  2:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Serge E. Hallyn, Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan,
	Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook,
	Jonathan Corbet

Quoting Andy Lutomirski (luto@amacapital.net):
> On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge@hallyn.com> wrote:
> >
> > On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote:
> > > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl@linux.com> wrote:
> > > > On Fri, 24 Apr 2015, Andy Lutomirski wrote:
> > > >
> > > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> > > >> the securebit.
> > > >>
> > > >> But Christoph's patch required it to add caps to the ambient set, right?
> > > >
> > > > Yes but you seem to be just adding one additional step without too much of
> > > > a benefit because you still need CAP_SETPCAP.
> > > >
> > >
> > > No, because I set the default to on :)
> >
> > Right - I definately prefer
> >
> > . default off
> > . CAP_SETPCAP required to turn it on (for self and children)
> > . once on, anyone can copy bits from (whatever we decided) to pA.
> >
> 
> Why default off?  If there's some weird reason that switching it on
> could cause a security problem, then I'd agree, but I haven't spotted
> a reason yet.

Cause it's less scary?

I'll just wait for the new patchset :)

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
@ 2015-04-25  2:45                                               ` Serge Hallyn
  0 siblings, 0 replies; 55+ messages in thread
From: Serge Hallyn @ 2015-04-25  2:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Serge E. Hallyn, Jarkko Sakkinen, Ted Ts'o, Andrew G. Morgan,
	Andrew Morton, Michael Kerrisk, Mimi Zohar, Linux API,
	Austin S Hemmelgarn, linux-security-module, Aaron Jones,
	Christoph Lameter, LKML, Serge Hallyn, Markku Savela, Kees Cook,
	Jonathan Corbet

Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org):
> On Apr 24, 2015 2:15 PM, "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> >
> > On Fri, Apr 24, 2015 at 01:18:44PM -0700, Andy Lutomirski wrote:
> > > On Fri, Apr 24, 2015 at 1:13 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> > > > On Fri, 24 Apr 2015, Andy Lutomirski wrote:
> > > >
> > > >> That's sort of what my patch does -- you need CAP_SETPCAP to switch
> > > >> the securebit.
> > > >>
> > > >> But Christoph's patch required it to add caps to the ambient set, right?
> > > >
> > > > Yes but you seem to be just adding one additional step without too much of
> > > > a benefit because you still need CAP_SETPCAP.
> > > >
> > >
> > > No, because I set the default to on :)
> >
> > Right - I definately prefer
> >
> > . default off
> > . CAP_SETPCAP required to turn it on (for self and children)
> > . once on, anyone can copy bits from (whatever we decided) to pA.
> >
> 
> Why default off?  If there's some weird reason that switching it on
> could cause a security problem, then I'd agree, but I haven't spotted
> a reason yet.

Cause it's less scary?

I'll just wait for the new patchset :)

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [RFC] capabilities: Ambient capabilities
  2015-04-24 20:18                                         ` Andy Lutomirski
  (?)
  (?)
@ 2015-04-27 14:55                                         ` Christoph Lameter
  -1 siblings, 0 replies; 55+ messages in thread
From: Christoph Lameter @ 2015-04-27 14:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Serge Hallyn, Jarkko Sakkinen, Andrew Lutomirski, Ted Ts'o,
	Andrew Morton, Andrew G. Morgan, Linux API, Mimi Zohar,
	Michael Kerrisk, Austin S Hemmelgarn, linux-security-module,
	Aaron Jones, Serge Hallyn, LKML, Markku Savela, Kees Cook,
	Jonathan Corbet

On Fri, 24 Apr 2015, Andy Lutomirski wrote:

> Also, in my model you can do:
>
> $ sudo capset cap_whatever=eip something
> $ ./something
>
> and the program can make its cap be ambient and run a helper.  In the
> CAP_SETPCAP model, that doesn't work.

Dont see too much difference in setting caps and CAP_SETPCAP on
"./something" to allow it to set the ambient caps.

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2015-04-27 14:56 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-12 18:08 [RFC] capabilities: Ambient capabilities Andy Lutomirski
2015-03-12 18:08 ` Andy Lutomirski
2015-03-12 21:49 ` Kees Cook
2015-03-12 22:10   ` Andrew G. Morgan
2015-03-12 22:27     ` Andrew Lutomirski
2015-03-12 22:27       ` Andrew Lutomirski
2015-03-13 13:24       ` Andrew G. Morgan
2015-03-13 13:24         ` Andrew G. Morgan
2015-03-13 16:06         ` Christoph Lameter
2015-03-13 17:58           ` Andy Lutomirski
2015-03-13 17:58             ` Andy Lutomirski
2015-03-13 18:52             ` Christoph Lameter
2015-03-13 18:52               ` Christoph Lameter
2015-03-13 17:57         ` Andy Lutomirski
2015-03-13 18:52           ` Kees Cook
2015-03-13 18:52             ` Kees Cook
2015-03-13 19:03             ` Andy Lutomirski
2015-03-13 19:54               ` Kees Cook
2015-03-13 19:54                 ` Kees Cook
2015-03-14 21:09           ` Andrew G. Morgan
2015-03-14 21:09             ` Andrew G. Morgan
2015-03-14 21:45             ` Andy Lutomirski
2015-03-14 22:17               ` Andrew G. Morgan
2015-03-14 22:17                 ` Andrew G. Morgan
2015-03-14 22:53                 ` Andy Lutomirski
2015-03-14 22:53                   ` Andy Lutomirski
2015-03-14 22:55                 ` Andy Lutomirski
2015-03-14 22:55                   ` Andy Lutomirski
2015-03-15  0:04                   ` Andrew G. Morgan
2015-03-15  0:04                     ` Andrew G. Morgan
2015-03-30 12:55                     ` Christoph Lameter
2015-03-30 14:31                       ` Andy Lutomirski
2015-03-30 14:31                         ` Andy Lutomirski
2015-03-30 15:05                         ` Christoph Lameter
2015-03-30 15:05                           ` Christoph Lameter
2015-04-09 15:25                         ` Christoph Lameter
2015-04-09 15:25                           ` Christoph Lameter
2015-04-23 14:01                           ` Christoph Lameter
2015-04-23 14:01                             ` Christoph Lameter
2015-04-24 17:53                             ` Serge Hallyn
2015-04-24 17:53                               ` Serge Hallyn
2015-04-24 18:41                               ` Andy Lutomirski
2015-04-24 19:09                                 ` Serge Hallyn
2015-04-24 19:09                                   ` Serge Hallyn
2015-04-24 19:25                                   ` Christoph Lameter
2015-04-24 19:53                                   ` Andy Lutomirski
2015-04-24 20:13                                     ` Christoph Lameter
2015-04-24 20:18                                       ` Andy Lutomirski
2015-04-24 20:18                                         ` Andy Lutomirski
2015-04-24 21:15                                         ` Serge E. Hallyn
2015-04-24 22:55                                           ` Andy Lutomirski
2015-04-24 22:55                                             ` Andy Lutomirski
2015-04-25  2:45                                             ` Serge Hallyn
2015-04-25  2:45                                               ` Serge Hallyn
2015-04-27 14:55                                         ` Christoph Lameter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.