linux-security-module.vger.kernel.org archive mirror
* [PATCH] LSM: add SafeSetID module that gates setid calls
@ 2018-10-31 15:28 mortonm
  2018-10-31 21:02 ` Serge E. Hallyn
  2018-11-02 18:07 ` [PATCH] " Stephen Smalley
  0 siblings, 2 replies; 88+ messages in thread
From: mortonm @ 2018-10-31 15:28 UTC (permalink / raw)
  To: jmorris, serge, keescook, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---

NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
code that likely needs improvement before being an acceptable approach.
I'm specifically interested to see if there are better ideas for how
this could be done.
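
For readers skimming the patch, the whitelist semantics described above
can be modeled in a few lines of userspace C. This is only a sketch, not
the kernel code: the names policy_entry, uid_is_restricted, and
transition_allowed are hypothetical, and a flat array stands in for the
RCU-protected hash table used in lsm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One whitelist rule: 'parent' may change its UID to 'child'. (Hypothetical type.) */
struct policy_entry {
	uint32_t parent;
	uint32_t child;
};

/* True if any rule names 'uid' as a parent, i.e. the UID is restricted. */
static bool uid_is_restricted(const struct policy_entry *policy, size_t n,
			      uint32_t uid)
{
	assert(policy != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (policy[i].parent == uid)
			return true;
	return false;
}

/* Unrestricted UIDs pass through; restricted ones need an exact rule match. */
static bool transition_allowed(const struct policy_entry *policy, size_t n,
			       uint32_t from, uint32_t to)
{
	if (!uid_is_restricted(policy, n, from))
		return true;	/* No policy for this UID: nothing to enforce. */
	for (size_t i = 0; i < n; i++)
		if (policy[i].parent == from && policy[i].child == to)
			return true;
	return false;
}
```

With a single rule {123, 456}, UID 123 may become 456 but nothing else,
while a UID with no rules is left alone. The real hook additionally lets
a restricted UID move among its current real, effective, and saved UIDs;
that part is omitted from this model.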

 Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 arch/Kconfig                                |   5 +
 arch/arm/Kconfig                            |   1 +
 arch/arm64/Kconfig                          |   1 +
 arch/x86/Kconfig                            |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  13 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 ++
 security/safesetid/securityfs.c             | 189 +++++++++++
 13 files changed, 679 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c
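
One way to picture the arch-specific split raised in the NOTE above: each
architecture supplies a table of its set*uid syscall numbers and the LSM
only performs a lookup. The sketch below is userspace C with hypothetical
names (demo_setuid_nrs, is_setuid_syscall); the numbers are sample data
standing in for the __NR_* constants, and this is not what the patch
implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sample data: the x86_64 numbers for setuid, setreuid, setresuid, setfsuid. */
static const int demo_setuid_nrs[] = { 105, 113, 117, 122 };

/* Generic lookup; only the table would differ per architecture. */
static bool is_setuid_syscall(const int *table, size_t n, int nr)
{
	assert(table != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (table[i] == nr)
			return true;
	return false;
}
```

Keeping the numbers in data rather than in nested #ifdef blocks would let
the generic code stay free of per-architecture conditionals, at the cost
of each architecture having to provide its own table.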

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..e7d072124424
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,94 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, since using file capabilities to run a program with elevated
+privileges opens up possible security holes: any user with access to the file
+can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index c980dfe9abf1..a0c387649e12 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/arch/Kconfig b/arch/Kconfig
index 1aa59063f1fd..c87070807ba2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -381,6 +381,11 @@ config ARCH_WANT_OLD_COMPAT_IPC
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	bool
 
+config HAVE_SAFESETID
+	bool
+	help
+	  Selected by architectures that support the SafeSetID LSM.
+
 config HAVE_ARCH_SECCOMP_FILTER
 	bool
 	help
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 843edfd000be..35b1a772c971 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -92,6 +92,7 @@ config ARM
 	select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UID16
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..2c6f5ec3a55e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -127,6 +127,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 887d3a7bb646..a6527d6c0426 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86_64
 	select ARCH_SUPPORTS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
+	select HAVE_SAFESETID
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select SWIOTLB
diff --git a/security/Kconfig b/security/Kconfig
index c4302067a3ad..7d9008ad5903 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -237,6 +237,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/safesetid/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..88209d827832 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)	+= safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)	+= safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..4ff82c7ed273
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,13 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        depends on HAVE_SAFESETID
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..e30ff06d8e07
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+/*
+ * TODO: Figuring out whether the current syscall number (saved on the kernel
+ * stack) is one of the set*uid syscalls is an operation that requires checking
+ * the number against arch-specific constants as seen below. The need for this
+ * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
+ * implement an arch-specific function that gets called from this file and
+ * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
+ * selected for architectures that implement the function? Any other ideas?
+ */
+static bool setuid_syscall(int num)
+{
+#ifdef CONFIG_X86_64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_ia32_setreuid32 ||
+	      num == __NR_ia32_setuid ||
+	      num == __NR_ia32_setresuid ||
+	      num == __NR_ia32_setresuid32 ||
+	      num == __NR_ia32_setuid32))
+		return false;
+#else
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setreuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setfsuid32 ||
+	      num == __NR_compat_setuid ||
+	      num == __NR_compat_setreuid ||
+	      num == __NR_compat_setfsuid ||
+	      num == __NR_compat_setresuid ||
+	      num == __NR_compat_setreuid32 ||
+	      num == __NR_compat_setresuid32 ||
+	      num == __NR_compat_setuid32 ||
+	      num == __NR_compat_setfsuid32))
+		return false;
+#else
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM
+	if (!(num == __NR_setreuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setfsuid32))
+		return false;
+#else
+	BUILD_BUG();
+#endif
+	return true;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      int audit)
+{
+	/* The current->mm check will fail if this is a kernel thread. */
+	if (cap == CAP_SETUID &&
+	    current->mm &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		/*
+		 * syscall_get_nr can theoretically return 0 or -1, but that
+		 * would signify that the syscall is being aborted due to a
+		 * signal, so we don't need to check for this case here.
+		 */
+		if (!(setuid_syscall(syscall_get_nr(current,
+						    current_pt_regs()))))
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			return -EPERM;
+	}
+	return 0;
+}
+
+static void setuid_policy_warning(kuid_t parent, kuid_t child)
+{
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	setuid_policy_warning(parent, child);
+	return -EPERM;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		spin_lock(&safesetid_whitelist_hashtable_spinlock);
+		hash_del_rcu(&entry->next);
+		spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist)
+		kfree(entry);
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+security_initcall(safesetid_security_init);
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..ff5fcf2c1b37
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NUL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
+		flush_safesetid_whitelist_entries();
+		return len;
+	}
+
+	/*
+	 * If we get to here, must be the case that file_entry->type equals
+	 * SAFESETID_WHITELIST_ADD
+	 */
+	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+							 &child);
+	if (ret)
+		return ret;
+
+	ret = add_safesetid_whitelist_entry(parent, child);
+	if (ret)
+		return ret;
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.19.1.568.g152ad8e336-goog

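To make the '<UID>:<UID>' policy format from the documentation concrete,
here is a userspace sketch that builds and parses such a string. The
names format_policy, parse_uid, and parse_policy are hypothetical, and
the parser is deliberately simple compared with the kstrtol()-based
kernel one: decimal only, no sign, no trailing characters, and no
user-namespace mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write "<parent>:<child>" into buf; returns 0 on success, -1 if buf is too small. */
static int format_policy(char *buf, size_t size, uint32_t parent, uint32_t child)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, size, "%u:%u", (unsigned)parent, (unsigned)child);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Parse one decimal UID terminated by 'end_char'; returns a pointer past it, or NULL. */
static const char *parse_uid(const char *s, char end_char, uint32_t *out)
{
	char *end;
	unsigned long long v;

	if (*s < '0' || *s > '9')	/* rejects empty input, signs, whitespace */
		return NULL;
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno || v > UINT32_MAX || *end != end_char)
		return NULL;
	*out = (uint32_t)v;
	return end_char ? end + 1 : end;
}

/* Parse "<UID>:<UID>"; returns 0 on success, -1 on malformed input. */
static int parse_policy(const char *s, uint32_t *parent, uint32_t *child)
{
	s = parse_uid(s, ':', parent);
	if (!s)
		return -1;
	return parse_uid(s, '\0', child) ? 0 : -1;
}
```

A string produced by format_policy() is what would be written to
safesetid/add_whitelist_policy under the securityfs mount point
(commonly /sys/kernel/security, which is an assumption about the
system's mount layout rather than something the patch fixes).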


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 15:28 [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
@ 2018-10-31 21:02 ` Serge E. Hallyn
  2018-10-31 21:57   ` Kees Cook
  2018-11-01  6:07   ` Serge E. Hallyn
  2018-11-02 18:07 ` [PATCH] " Stephen Smalley
  1 sibling, 2 replies; 88+ messages in thread
From: Serge E. Hallyn @ 2018-10-31 21:02 UTC (permalink / raw)
  To: mortonm; +Cc: jmorris, serge, keescook, linux-security-module

Quoting mortonm@chromium.org (mortonm@chromium.org):
> From: Micah Morton <mortonm@chromium.org>
> 
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> 
> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> code that likely needs improvement before being an acceptable approach.
> I'm specifically interested to see if there are better ideas for how
> this could be done.
> 
>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
>  Documentation/admin-guide/LSM/index.rst     |   1 +
>  arch/Kconfig                                |   5 +
>  arch/arm/Kconfig                            |   1 +
>  arch/arm64/Kconfig                          |   1 +
>  arch/x86/Kconfig                            |   1 +
>  security/Kconfig                            |   1 +
>  security/Makefile                           |   2 +
>  security/safesetid/Kconfig                  |  13 +
>  security/safesetid/Makefile                 |   7 +
>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
>  security/safesetid/lsm.h                    |  30 ++
>  security/safesetid/securityfs.c             | 189 +++++++++++
>  13 files changed, 679 insertions(+)
>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>  create mode 100644 security/safesetid/Kconfig
>  create mode 100644 security/safesetid/Makefile
>  create mode 100644 security/safesetid/lsm.c
>  create mode 100644 security/safesetid/lsm.h
>  create mode 100644 security/safesetid/securityfs.c
> 
> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> new file mode 100644
> index 000000000000..e7d072124424
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> @@ -0,0 +1,94 @@
> +=========
> +SafeSetID
> +=========
> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> +UID/GID transitions from a given UID/GID to only those approved by a
> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> +allowing a user to set up user namespace UID mappings.
> +
> +
> +Background
> +==========
> +In absence of file capabilities, processes spawned on a Linux system that need
> +to switch to a different user must be spawned with CAP_SETUID privileges.
> +CAP_SETUID is granted to programs running as root or those running as a non-root
> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> +often preferable to use Linux runtime capabilities rather than file
> +capabilities, since using file capabilities to run a program with elevated
> +privileges opens up possible security holes since any user with access to the
> +file can exec() that program to gain the elevated privileges.

Not true, see inheritable capabilities.  You also might look at ambient
capabilities.

Just to be sure - your end-goal is to have a set of tasks which have
some privileges, including CAP_SETUID, but which cannot transition to
certain uids, perhaps including root?


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 21:02 ` Serge E. Hallyn
@ 2018-10-31 21:57   ` Kees Cook
  2018-10-31 22:37     ` Casey Schaufler
  2018-11-01  6:07   ` Serge E. Hallyn
  1 sibling, 1 reply; 88+ messages in thread
From: Kees Cook @ 2018-10-31 21:57 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Micah Morton, James Morris, linux-security-module

On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> Just to be sure - your end-goal is to have a set of tasks which have
> some privileges, including CAP_SETUID, but which cannot transition to
> certain uids, perhaps including root?

AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
_without_ CAP_SETUID and still allow whitelisted uid transitions.

-Kees

-- 
Kees Cook


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 21:57   ` Kees Cook
@ 2018-10-31 22:37     ` Casey Schaufler
  2018-11-01  1:12       ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2018-10-31 22:37 UTC (permalink / raw)
  To: Kees Cook, Serge E. Hallyn
  Cc: Micah Morton, James Morris, linux-security-module

On 10/31/2018 2:57 PM, Kees Cook wrote:
> On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
>> Just to be sure - your end-goal is to have a set of tasks which have
>> some privileges, including CAP_SETUID, but which cannot transition to
>> certain uids, perhaps including root?
> AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
> _without_ CAP_SETUID and still allow whitelisted uid transitions.

I don't like that thought at all at all. You need CAP_SETUID for
some transitions but not all. I can call setreuid() and restore
the saved UID to the effective UID. If this LSM works correctly
(I haven't examined it carefully yet) it should prevent restoring
the effective UID if there isn't an appropriate whitelist entry.

It also violates the "additional restriction" model of LSMs.

That has the potential to introduce a failure when a process tries
to give up privilege. If 0:1000 isn't on the whitelist but 1000:0
is, Bad Things can happen. A SUID root program would be unable to
give up its privilege by going back to the real UID in this case.



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 22:37     ` Casey Schaufler
@ 2018-11-01  1:12       ` Micah Morton
  2018-11-01  6:13         ` Serge E. Hallyn
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2018-11-01  1:12 UTC (permalink / raw)
  To: casey; +Cc: Kees Cook, serge, jmorris, linux-security-module

On Wed, Oct 31, 2018 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 10/31/2018 2:57 PM, Kees Cook wrote:
> > On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> >> Just to be sure - your end-goal is to have a set of tasks which have
> >> some privileges, including CAP_SETUID, but which cannot transition to
> >> certain uids, perhaps including root?

Correct, only whitelisted uids can be switched to. This only pertains
to CAP_SETUID, other capabilities are not affected.

> > AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
> > _without_ CAP_SETUID and still allow whitelisted uid transitions.

Kees is right that this LSM only pertains to a single capability:
CAP_SETUID (future work could tackle CAP_SETGID in the same fashion)
-- although the idea here is to put in per-user limitations on what a
process running as that user can do even when it _has_ CAP_SETUID. So
it doesn't grant any extra privileges to processes that don't have
CAP_SETUID, only restricts processes that _do_ have CAP_SETUID if the
user they are running under is restricted.

>
> I don't like that thought at all at all. You need CAP_SETUID for
> some transitions but not all. I can call setreuid() and restore
> the saved UID to the effective UID. If this LSM works correctly
> (I haven't examined it carefully yet) it should prevent restoring
> the effective UID if there isn't an appropriate whitelist entry.

Yep, that's how it works. The idea here is that you still need
CAP_SETUID for all transitions, regardless of whether whitelist
policies exist or not.

>
> It also violates the "additional restriction" model of LSMs.
>
> That has the potential to introduce a failure when a process tries
> to give up privilege. If 0:1000 isn't on the whitelist but 1000:0

As above, if a process drops CAP_SETUID it wouldn't be able to do any
transitions (if this is what you mean by give up privilege). The
whitelist is a one-way policy so if one wanted to restrict user 123
but let it switch to 456 and back, 2 policies would need to be added:
123 -> 456 and 456 -> 123.
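
As a rough illustration of the one-way semantics, here is a minimal
userspace sketch. It is not the code from lsm.c: `struct transition`,
`policy[]` and `transition_allowed()` are hypothetical names, and the
rule that a UID with no entries is left unrestricted is an assumption
based on the description above (only restricted users are affected).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy entry: 'parent' may transition to 'child'. */
struct transition {
	unsigned int parent;
	unsigned int child;
};

/* The two entries needed to let 123 switch to 456 and back. */
static const struct transition policy[] = {
	{ 123, 456 },
	{ 456, 123 },
};

/*
 * Returns true if a task running as 'from' may switch to 'to'.
 * Assumption: a UID that never appears as a parent is unrestricted,
 * so only the ordinary CAP_SETUID check would apply to it.
 */
static bool transition_allowed(unsigned int from, unsigned int to)
{
	bool restricted = false;
	size_t i;

	for (i = 0; i < sizeof(policy) / sizeof(policy[0]); i++) {
		if (policy[i].parent != from)
			continue;
		restricted = true;
		if (policy[i].child == to)
			return true;
	}
	return !restricted;
}
```

With only the first entry present, 123 could reach 456 but 456 could
not get back, which is the asymmetry being discussed in this thread.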

> is Bad Things can happen. A SUID root program would be unable to
> give up its privilege by going back to the real UID in this case.
>


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 21:02 ` Serge E. Hallyn
  2018-10-31 21:57   ` Kees Cook
@ 2018-11-01  6:07   ` Serge E. Hallyn
  2018-11-01 16:11     ` Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: Serge E. Hallyn @ 2018-11-01  6:07 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: mortonm, jmorris, keescook, linux-security-module

On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> Quoting mortonm@chromium.org (mortonm@chromium.org):
> > From: Micah Morton <mortonm@chromium.org>
> > 
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> > 
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> > 
> > NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> > code that likely needs improvement before being an acceptable approach.
> > I'm specifically interested to see if there are better ideas for how
> > this could be done.
> > 
> >  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >  Documentation/admin-guide/LSM/index.rst     |   1 +
> >  arch/Kconfig                                |   5 +
> >  arch/arm/Kconfig                            |   1 +
> >  arch/arm64/Kconfig                          |   1 +
> >  arch/x86/Kconfig                            |   1 +
> >  security/Kconfig                            |   1 +
> >  security/Makefile                           |   2 +
> >  security/safesetid/Kconfig                  |  13 +
> >  security/safesetid/Makefile                 |   7 +
> >  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >  security/safesetid/lsm.h                    |  30 ++
> >  security/safesetid/securityfs.c             | 189 +++++++++++
> >  13 files changed, 679 insertions(+)
> >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >  create mode 100644 security/safesetid/Kconfig
> >  create mode 100644 security/safesetid/Makefile
> >  create mode 100644 security/safesetid/lsm.c
> >  create mode 100644 security/safesetid/lsm.h
> >  create mode 100644 security/safesetid/securityfs.c
> > 
> > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > new file mode 100644
> > index 000000000000..e7d072124424
> > --- /dev/null
> > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > @@ -0,0 +1,94 @@
> > +=========
> > +SafeSetID
> > +=========
> > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > +UID/GID transitions from a given UID/GID to only those approved by a
> > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > +allowing a user to set up user namespace UID mappings.
> > +
> > +
> > +Background
> > +==========
> > +In absence of file capabilities, processes spawned on a Linux system that need
> > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > +often preferable to use Linux runtime capabilities rather than file
> > +capabilities, since using file capabilities to run a program with elevated
> > +privileges opens up possible security holes since any user with access to the
> > +file can exec() that program to gain the elevated privileges.
> 
> Not true, see inheritable capabilities.  You also might look at ambient
> capabilities.

So for example with pam_cap.so you could have your N uids each be given
the desired pI, and assign the corresponding fIs to the files they should
be able to exec with privilege.  No other uids will run those files with
privilege.  *1
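
For reference, the pI/fI interaction follows the exec transformation
rules documented in capabilities(7). The sketch below just evaluates
those rules on bitmasks. The struct and function names are made up for
illustration, the special handling of uid 0 and set-user-ID binaries is
ignored, and a file is treated as "privileged" only if it carries file
capabilities.

```c
#include <stdbool.h>
#include <stdint.h>

/* CAP_SETUID is capability number 7; local name to avoid header clashes. */
#define EX_CAP_SETUID (UINT64_C(1) << 7)

struct task_caps {
	uint64_t inheritable, permitted, effective, bounding, ambient;
};

struct file_caps {
	uint64_t inheritable, permitted;
	bool effective;	/* the file effective flag is a single bit */
};

/* Capability sets of a task after execve() of a file with caps 'f'. */
static struct task_caps caps_after_exec(struct task_caps p, struct file_caps f)
{
	struct task_caps n;
	bool file_privileged = f.inheritable || f.permitted || f.effective;

	n.ambient = file_privileged ? 0 : p.ambient;
	n.permitted = (p.inheritable & f.inheritable) |
		      (f.permitted & p.bounding) | n.ambient;
	n.effective = f.effective ? n.permitted : n.ambient;
	n.inheritable = p.inheritable;
	n.bounding = p.bounding;
	return n;
}
```

A file carrying only fI (plus the effective flag) for CAP_SETUID yields
that capability only for tasks that already hold it in pI. A task with
an empty pI gets nothing from exec'ing the same file.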

Can you give some more details about exactly how you see SafeSetID being
used?

I'm still not quite clear on whether you want N completely unprivileged
uids to be used by some user (i.e. uid 1000), or whether one or more of
those should also have some privilege, or whether one of the uids might
or might not be uid 0.  Years ago I used to use N separate uids to
somewhat segregate workloads on my laptop, and I'd like my browser to
do something like that.  Is that the kind of uid switching you have
in mind?

-serge

*1 And maybe with one of the p9auth/factotum proposals out there you
could have a userspace daemon hand out the tokens for setuid, but that's
getting "out there" and probably derailing this conversation :)


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01  1:12       ` Micah Morton
@ 2018-11-01  6:13         ` Serge E. Hallyn
  2018-11-01 15:39           ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Serge E. Hallyn @ 2018-11-01  6:13 UTC (permalink / raw)
  To: Micah Morton; +Cc: casey, Kees Cook, serge, jmorris, linux-security-module

On Wed, Oct 31, 2018 at 06:12:46PM -0700, Micah Morton wrote:
> On Wed, Oct 31, 2018 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >
> > On 10/31/2018 2:57 PM, Kees Cook wrote:
> > > On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> > >> Just to be sure - your end-goal is to have a set of tasks which have
> > >> some privileges, including CAP_SETUID, but which cannot transition to
> > >> certain uids, perhaps including root?
> 
> Correct, only whitelisted uids can be switched to. This only pertains
> to CAP_SETUID, other capabilities are not affected.
> 
> > > AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
> > > _without_ CAP_SETUID and still allow whitelisted uid transitions.
> 
> Kees is right that this LSM only pertains to a single capability:
> CAP_SETUID (future work could tackle CAP_SETGID in the same fashion)
> -- although the idea here is to put in per-user limitations on what a
> process running as that user can do even when it _has_ CAP_SETUID. So
> it doesn't grant any extra privileges to processes that don't have
> CAP_SETUID, only restricts processes that _do_ have CAP_SETUID if the
> user they are running under is restricted.
> 
> >
> > I don't like that thought at all at all. You need CAP_SETUID for
> > some transitions but not all. I can call setreuid() and restore
> > the saved UID to the effective UID. If this LSM works correctly
> > (I haven't examined it carefully yet) it should prevent restoring
> > the effective UID if there isn't an appropriate whitelist entry.
> 
> Yep, that's how it works. The idea here is that you still need
> CAP_SETUID for all transitions, regardless of whether whitelist
> policies exist or not.
> 
> >
> > It also violates the "additional restriction" model of LSMs.

Does it, or does the fact that CAP_SETUID is still required in order
to change uids address that?

> > That has the potential to introduce a failure when a process tries
> > to give up privilege. If 0:1000 isn't on the whitelist but 1000:0
> 
> As above, if a process drops CAP_SETUID it wouldn't be able to do any
> transitions (if this is what you mean by give up privilege). The
> whitelist is a one-way policy so if one wanted to restrict user 123
> but let it switch to 456 and back, 2 policies would need to be added:
> 123 -> 456 and 456 -> 123.
> 
> > is Bad Things can happen. A SUID root program would be unable to
> > give up its privilege by going back to the real UID in this case.

Yes, this was the root cause of the "sendmail capabilities bug" - a
privileged daemon which could be made to run with slightly less
privilege in such a way that it failed to drop privilege, then continued
to run with some privilege.

But the key trigger there was that an unprivileged task could prevent
the more privileged task from dropping its privilege.

Is that the case here?  It might be...  If one of the uid-restricted
tasks running with CAP_SETUID runs a filter over some malicious data
which forces it to run a program which intends to change its uid and
fails to detect that that failed.  It's not quite as cut-and-dried
though, and if we simply do not allow uid 0 to be in the set of uids,
that may prevent any such cases.

-serge


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01  6:13         ` Serge E. Hallyn
@ 2018-11-01 15:39           ` Casey Schaufler
  2018-11-01 15:56             ` Serge E. Hallyn
  2018-11-01 16:18             ` Micah Morton
  0 siblings, 2 replies; 88+ messages in thread
From: Casey Schaufler @ 2018-11-01 15:39 UTC (permalink / raw)
  To: Serge E. Hallyn, Micah Morton; +Cc: Kees Cook, jmorris, linux-security-module

On 10/31/2018 11:13 PM, Serge E. Hallyn wrote:
> On Wed, Oct 31, 2018 at 06:12:46PM -0700, Micah Morton wrote:
>> On Wed, Oct 31, 2018 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> On 10/31/2018 2:57 PM, Kees Cook wrote:
>>>> On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
>>>>> Just to be sure - your end-goal is to have a set of tasks which have
>>>>> some privileges, including CAP_SETUID, but which cannot transition to
>>>>> certain uids, perhaps including root?
>> Correct, only whitelisted uids can be switched to. This only pertains
>> to CAP_SETUID, other capabilities are not affected.
>>
>>>> AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
>>>> _without_ CAP_SETUID and still allow whitelisted uid transitions.
>> Kees is right that this LSM only pertains to a single capability:
>> CAP_SETUID (future work could tackle CAP_SETGID in the same fashion)
>> -- although the idea here is to put in per-user limitations on what a
>> process running as that user can do even when it _has_ CAP_SETUID. So
>> it doesn't grant any extra privileges to processes that don't have
>> CAP_SETUID, only restricts processes that _do_ have CAP_SETUID if the
>> user they are running under is restricted.
>>
>>> I don't like that thought at all at all. You need CAP_SETUID for
>>> some transitions but not all. I can call setreuid() and restore
>>> the saved UID to the effective UID. If this LSM works correctly
>>> (I haven't examined it carefully yet) it should prevent restoring
>>> the effective UID if there isn't an appropriate whitelist entry.
>> Yep, that's how it works. The idea here is that you still need
>> CAP_SETUID for all transitions, regardless of whether whitelist
>> policies exist or not.
>>
>>> It also violates the "additional restriction" model of LSMs.
> Does it, or does the fact that CAP_SETUID is still required in order
> to change uids address that?

Yes, it does. Reading Kees' response had me a little concerned.

>>> That has the potential to introduce a failure when a process tries
>>> to give up privilege. If 0:1000 isn't on the whitelist but 1000:0
>> As above, if a process drops CAP_SETUID it wouldn't be able to do any
>> transitions (if this is what you mean by give up privilege). The
>> whitelist is a one-way policy so if one wanted to restrict user 123
>> but let it switch to 456 and back, 2 policies would need to be added:
>> 123 -> 456 and 456 -> 123.
>>
>>> is Bad Things can happen. A SUID root program would be unable to
>>> give up its privilege by going back to the real UID in this case.
> Yes, this was the root cause of the "sendmail capabilities bug"

I'm very familiar with that particular bug, as Bob Mende's
work to convert sendmail to using capabilities was done for
a project I owned. The blowback against all things security
was pretty intense.

>  - a
> privileged daemon which could be made to run with slightly less
> privilege in such a way that it failed to drop privilege, then continued
> to run with some privilege.
>
> But the key trigger there was that an unprivileged task could prevent
> the more privileged task from dropping its privilege.
>
> Is that the case here?

I think it is reasonably safe to assume that there
are many instances of programs that don't handle errors
from setreuid() in the reset case. Without privilege
setreuid() can be used to swap effective and real UIDs.

>   It might be...  If one of the uid-restricted
> tasks running with CAP_SETUID runs a filter over some malicious data
> which forces it to run a program which intends to change its uid and
> fails to detect that that failed.  It's not quite as cut-and-dried
> though, and if we simply do not allow uid 0 to be in the set of uids,
> that may prevent any such cases.

Alas, UID 0 is not the only case we have to worry about.
If I run a program owned by tss (Trousers) with the setuid
bit set it will change the effective UID to tss. If this
program expects to switch effective UID back to me and
the SafeSetID whitelist prevents it, Bad Things may happen
even though no capabilities or root privilege were ever
involved.

It would be easy for an inexperienced or malicious admin to
include cschaufler:tss in the whitelist but miss on adding
tss:cschaufler.
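
For comparison, a program that handles the reset case defensively would
look something like the sketch below. `drop_to_real_uid()` is a
hypothetical helper, not taken from any existing program. It checks
both the return value and the resulting effective UID, so a denial by
an LSM is noticed instead of silently ignored.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Switch the effective UID back to the real UID and verify the result.
 * Returns 0 on success, -1 if the kernel (or an LSM) refused.
 */
static int drop_to_real_uid(void)
{
	uid_t ruid = getuid();

	/* -1 leaves the real UID unchanged; only the effective UID is set. */
	if (setreuid((uid_t)-1, ruid) != 0) {
		perror("setreuid");
		return -1;
	}
	/* Confirm the change actually took effect before continuing. */
	if (geteuid() != ruid)
		return -1;
	return 0;
}
```

A caller that gets -1 back should stop rather than carry on with the
effective UID it was trying to shed.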



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 15:39           ` Casey Schaufler
@ 2018-11-01 15:56             ` Serge E. Hallyn
  2018-11-01 16:18             ` Micah Morton
  1 sibling, 0 replies; 88+ messages in thread
From: Serge E. Hallyn @ 2018-11-01 15:56 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Serge E. Hallyn, Micah Morton, Kees Cook, jmorris, linux-security-module

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 10/31/2018 11:13 PM, Serge E. Hallyn wrote:
> > On Wed, Oct 31, 2018 at 06:12:46PM -0700, Micah Morton wrote:
> >> On Wed, Oct 31, 2018 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>> On 10/31/2018 2:57 PM, Kees Cook wrote:
> >>>> On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> >>>>> Just to be sure - your end-goal is to have a set of tasks which have
> >>>>> some privileges, including CAP_SETUID, but which cannot transition to
> >>>>> certain uids, perhaps including root?
> >> Correct, only whitelisted uids can be switched to. This only pertains
> >> to CAP_SETUID, other capabilities are not affected.
> >>
> >>>> AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
> >>>> _without_ CAP_SETUID and still allow whitelisted uid transitions.
> >> Kees is right that this LSM only pertains to a single capability:
> >> CAP_SETUID (future work could tackle CAP_SETGID in the same fashion)
> >> -- although the idea here is to put in per-user limitations on what a
> >> process running as that user can do even when it _has_ CAP_SETUID. So
> >> it doesn't grant any extra privileges to processes that don't have
> >> CAP_SETUID, only restricts processes that _do_ have CAP_SETUID if the
> >> user they are running under is restricted.
> >>
> >>> I don't like that thought at all at all. You need CAP_SETUID for
> >>> some transitions but not all. I can call setreuid() and restore
> >>> the saved UID to the effective UID. If this LSM works correctly
> >>> (I haven't examined it carefully yet) it should prevent restoring
> >>> the effective UID if there isn't an appropriate whitelist entry.
> >> Yep, that's how it works. The idea here is that you still need
> >> CAP_SETUID for all transitions, regardless of whether whitelist
> >> policies exist or not.
> >>
> >>> It also violates the "additional restriction" model of LSMs.
> > Does it, or does the fact that CAP_SETUID is still required in order
> > to change uids address that?
> 
> Yes, it does. Reading Kees' response had me a little concerned.
> 
> >>> That has the potential to introduce a failure when a process tries
> >>> to give up privilege. If 0:1000 isn't on the whitelist but 1000:0
> >> As above, if a process drops CAP_SETUID it wouldn't be able to do any
> >> transitions (if this is what you mean by give up privilege). The
> >> whitelist is a one-way policy so if one wanted to restrict user 123
> >> but let it switch to 456 and back, 2 policies would need to be added:
> >> 123 -> 456 and 456 -> 123.
> >>
> >>> is Bad Things can happen. A SUID root program would be unable to
> >>> give up its privilege by going back to the real UID in this case.
> > Yes, this was the root cause of the "sendmail capabilities bug"
> 
> I'm very familiar with that particular bug, as Bob Mende's
> work to convert sendmail to using capabilities was done for
> a project I owned. The blowback against all things security
> was pretty intense.
> 
> >  - a
> > privileged daemon which could be made to run with slightly less
> > privilege in such a way that it failed to drop privilege, then continued
> > to run with some privilege.
> >
> > But the key trigger there was that an unprivileged task could prevent
> > the more privileged task from dropping its privilege.
> >
> > Is that the case here?
> 
> I think it is reasonably safe to assume that there
> are many instances of programs that don't handle errors
> from setreuid() in the reset case. Without privilege
> setreuid() can be used to swap effective and real UIDs.
> 
> >   It might be...  If one of the uid-restricted
> > tasks running with CAP_SETUID runs a filter over some malicious data
> > which forces it to run a program which intends to change its uid and
> > fails to detect that that failed.  It's not quite as cut-and-dried
> > though, and if we simply do not allow uid 0 to be in the set of uids,
> > that may prevent any such cases.
> 
> Alas, UID 0 is not the only case we have to worry about.
> If I run a program owned by tss (Trousers) with the setuid
> bit set it will change the effective UID to tss. If this
> program expects to switch effective UID back to me and
> the SafeSetID whitelist prevents it, Bad Things may happen
> even though no capabilities or root privilege were ever
> involved.

Yes, but I don't think an unprivileged user can make that happen.
If you look at the patch, you require cap_sys_admin against your
user namespace in order to limit the uid range.  So either you
were privileged to begin with, or you create a new user namespace.
If you create a new userns, you can only map uids which are delegated
to you - presumably not tss - into that namespace.

> It would be easy for an inexperienced or malicious admin to
> include cschaufler:tss in the whitelist but miss on adding
> tss:cschaufler.

Well, it's also pretty easy for an admin to add 0 or tss into
serge's delegated mappings in /etc/subuid, I suppose...

Now I hadn't noticed the one-way directional nature of these
whitelist entries.  I'd been assuming there was just a set of
ids it was allowed to transition to.  Not sure which is
better, I can see pros/cons to both.


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01  6:07   ` Serge E. Hallyn
@ 2018-11-01 16:11     ` Micah Morton
  2018-11-01 16:22       ` Micah Morton
                         ` (3 more replies)
  0 siblings, 4 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-01 16:11 UTC (permalink / raw)
  To: serge; +Cc: jmorris, Kees Cook, linux-security-module

On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>
> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> > Quoting mortonm@chromium.org (mortonm@chromium.org):
> > > From: Micah Morton <mortonm@chromium.org>
> > >
> > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > transitions from a given UID/GID to only those approved by a
> > > system-wide whitelist. These restrictions also prohibit the given
> > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > mappings. For now, only gating the set*uid family of syscalls is
> > > supported, with support for set*gid coming in a future patch set.
> > >
> > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > ---
> > >
> > > NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> > > code that likely needs improvement before being an acceptable approach.
> > > I'm specifically interested to see if there are better ideas for how
> > > this could be done.
> > >
> > >  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> > >  Documentation/admin-guide/LSM/index.rst     |   1 +
> > >  arch/Kconfig                                |   5 +
> > >  arch/arm/Kconfig                            |   1 +
> > >  arch/arm64/Kconfig                          |   1 +
> > >  arch/x86/Kconfig                            |   1 +
> > >  security/Kconfig                            |   1 +
> > >  security/Makefile                           |   2 +
> > >  security/safesetid/Kconfig                  |  13 +
> > >  security/safesetid/Makefile                 |   7 +
> > >  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> > >  security/safesetid/lsm.h                    |  30 ++
> > >  security/safesetid/securityfs.c             | 189 +++++++++++
> > >  13 files changed, 679 insertions(+)
> > >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> > >  create mode 100644 security/safesetid/Kconfig
> > >  create mode 100644 security/safesetid/Makefile
> > >  create mode 100644 security/safesetid/lsm.c
> > >  create mode 100644 security/safesetid/lsm.h
> > >  create mode 100644 security/safesetid/securityfs.c
> > >
> > > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > new file mode 100644
> > > index 000000000000..e7d072124424
> > > --- /dev/null
> > > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > @@ -0,0 +1,94 @@
> > > +=========
> > > +SafeSetID
> > > +=========
> > > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > > +UID/GID transitions from a given UID/GID to only those approved by a
> > > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > > +allowing a user to set up user namespace UID mappings.
> > > +
> > > +
> > > +Background
> > > +==========
> > > +In absence of file capabilities, processes spawned on a Linux system that need
> > > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > > +often preferable to use Linux runtime capabilities rather than file
> > > +capabilities, since using file capabilities to run a program with elevated
> > > +privileges opens up possible security holes since any user with access to the
> > > +file can exec() that program to gain the elevated privileges.
> >
> > Not true, see inheritable capabilities.  You also might look at ambient
> > capabilities.
>
> So for example with pam_cap.so you could have your N uids each be given
> the desired pI, and assign the corresponding fIs to the files they should
> be able to exec with privilege.  No other uids will run those files with
> privilege.  *1

Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?

>
> Can you give some more details about exactly how you see SafeSetID being
> used?

Sure. The main use case for this LSM is to allow a non-root program to
transition to other untrusted uids without full blown CAP_SETUID
capabilities. The non-root program would still need CAP_SETUID to do
any kind of transition, but the additional restrictions imposed by
this LSM would mean it is a "safer" version of CAP_SETUID since the
non-root program cannot take advantage of CAP_SETUID to do any
unapproved actions (i.e. setuid to uid 0 or create/enter new user
namespace). The higher level goal is to allow for uid-based sandboxing
of system services without having to give out CAP_SETUID all over the
place just so that non-root programs can drop to
even-further-non-privileged uids. This is especially relevant when one
non-root daemon on the system should be allowed to spawn other
processes as different uids, but it's undesirable to give the daemon a
basically-root-equivalent CAP_SETUID.

>
> I'm still not quite clear on whether you want N completely unprivileged
> uids to be used by some user (i.e. uid 1000), or whether one or more of
> those should also have some privilege, or whether one of the uids might
> or might not be uid 0.  Years ago I used to use N separate uids to
> somewhat segregate workloads on my laptop, and I'd like my browser to
> do something like that.  Is that the kind of uid switching you have
> in mind?

"N completely unprivileged uids to be used by some user (i.e. uid
1000)" is the closest description of what this LSM has in mind. For
example, uid 123 is some system service that needs runtime
capabilities X, Y and Z and a bunch of DBus permissions associated
with uid 123, but also wants to spawn another program without any of
these capabilities/permissions. In this case we would like to avoid
giving the system service CAP_SETUID.
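
To make the spawning pattern concrete, here is a minimal sketch of a
service starting a helper under another uid. `spawn_as()` is a
hypothetical name. Group handling is left out, and setreuid() is used
only because it needs no feature-test macros. Whether the uid change
is permitted depends on the caller's privileges and on whatever policy
is loaded.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] as 'uid' and return its exit status, or -1 on error.
 * Group IDs and supplementary groups are deliberately left alone here;
 * a real service would normally change those first.
 */
static int spawn_as(uid_t uid, char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Setting real and effective together also resets the saved UID. */
		if (setreuid(uid, uid) != 0)
			_exit(126);
		if (getuid() != uid || geteuid() != uid)
			_exit(126);
		execv(argv[0], argv);
		_exit(127);	/* exec failed */
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the scenario described above, the service running as uid 123 would
call this with the uid of its sandboxed helper, and the intent is that
only whitelisted targets succeed.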

>
> -serge
>
> *1 And maybe with one of the p9auth/factotum proposals out there you
> could have a userspace daemon hand out the tokens for setuid, but that's
> getting "out there" and probably derailing this conversation :)


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 15:39           ` Casey Schaufler
  2018-11-01 15:56             ` Serge E. Hallyn
@ 2018-11-01 16:18             ` Micah Morton
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-01 16:18 UTC (permalink / raw)
  To: casey; +Cc: serge, Kees Cook, jmorris, linux-security-module

On Thu, Nov 1, 2018 at 8:39 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 10/31/2018 11:13 PM, Serge E. Hallyn wrote:
> > On Wed, Oct 31, 2018 at 06:12:46PM -0700, Micah Morton wrote:
> >> On Wed, Oct 31, 2018 at 3:37 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>> On 10/31/2018 2:57 PM, Kees Cook wrote:
> >>>> On Wed, Oct 31, 2018 at 2:02 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> >>>>> Just to be sure - your end-goal is to have a set of tasks which have
> >>>>> some privileges, including CAP_SETUID, but which cannot transition to
> >>>>> certain uids, perhaps including root?
> >> Correct, only whitelisted uids can be switched to. This only pertains
> >> to CAP_SETUID, other capabilities are not affected.
> >>
> >>>> AIUI, the issue is that CAP_SETUID is TOO permissive. Instead, run
> >>>> _without_ CAP_SETUID and still allow whitelisted uid transitions.
> >> Kees is right that this LSM only pertains to a single capability:
> >> CAP_SETUID (future work could tackle CAP_SETGID in the same fashion)
> >> -- although the idea here is to put in per-user limitations on what a
> >> process running as that user can do even when it _has_ CAP_SETUID. So
> >> it doesn't grant any extra privileges to processes that don't have
> >> CAP_SETUID, only restricts processes that _do_ have CAP_SETUID if the
> >> user they are running under is restricted.
> >>
> >>> I don't like that thought at all at all. You need CAP_SETUID for
> >>> some transitions but not all. I can call setreuid() and restore
> >>> the saved UID to the effective UID. If this LSM works correctly
> >>> (I haven't examined it carefully yet) it should prevent restoring
> >>> the effective UID if there isn't an appropriate whitelist entry.
> >> Yep, that's how it works. The idea here is that you still need
> >> CAP_SETUID for all transitions, regardless of whether whitelist
> >> policies exist or not.
> >>
> >>> It also violates the "additional restriction" model of LSMs.
> > Does it, or does the fact that CAP_SETUID is still required in order
> > to change uids address that?
>
> Yes, it does. Reading Kees' response had me a little concerned.
>
> >>> That has the potential to introduce a failure when a process tries
> >>> to give up privilege. If 0:1000 isn't on the whitelist but 1000:0
> >> As above, if a process drops CAP_SETUID it wouldn't be able to do any
> >> transitions (if this is what you mean by give up privilege). The
> >> whitelist is a one-way policy so if one wanted to restrict user 123
> >> but let it switch to 456 and back, 2 policies would need to be added:
> >> 123 -> 456 and 456 -> 123.
> >>
> >>> is Bad Things can happen. A SUID root program would be unable to
> >>> give up its privilege by going back to the real UID in this case.
> > Yes, this was the root cause of the "sendmail capabilities bug"
>
> I'm very familiar with that particular bug, as Bob Mende's
> work to convert sendmail to using capabilities was done for
> a project I owned. The blowback against all things security
> was pretty intense.
>
> >  - a
> > privileged daemon which could be made to run with slightly less
> > privilege in such a way that it failed to drop privilege, then continued
> > to run with some privilege.
> >
> > But the key trigger there was that an unprivileged task could prevent
> > the more privileged task from dropping its privilege.
> >
> > Is that the case here?
>
> I think it is reasonably safe to assume that there
> are many instances of programs that don't handle errors
> from setreuid() in the reset case. Without privilege
> setreuid() can be used to swap effective and real UIDs.

This LSM won't interfere with any of the one-off transitions allowed
by the set*uid family of syscalls that don't require CAP_SETUID. See
safesetid_task_fix_setuid in lsm.c.
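
As a rough illustration of which transitions fall outside this LSM's
scope, here is a small Python model of the setreuid(2) rule for callers
without CAP_SETUID. It is a simplified sketch of the documented syscall
semantics, not the code in lsm.c; the Cred type and the function name
are made up for the example.

```python
from typing import NamedTuple

KEEP = -1  # setreuid(2) treats -1 as "leave this id unchanged"


class Cred(NamedTuple):
    ruid: int  # real uid
    euid: int  # effective uid
    suid: int  # saved set-user-ID


def setreuid_needs_cap_setuid(cred: Cred, new_ruid: int, new_euid: int) -> bool:
    """Return True if setreuid(new_ruid, new_euid) requires CAP_SETUID.

    Per setreuid(2), an unprivileged caller may set the real uid only to
    the current real or effective uid, and the effective uid only to the
    current real, effective or saved uid.
    """
    if new_ruid != KEEP and new_ruid not in (cred.ruid, cred.euid):
        return True
    if new_euid != KEEP and new_euid not in (cred.ruid, cred.euid, cred.suid):
        return True
    return False


# A setuid-bit program: real uid 1000, effective and saved uid 59.
cred = Cred(ruid=1000, euid=59, suid=59)
swap_needs_cap = setreuid_needs_cap_setuid(cred, 59, 1000)  # swap real/effective
root_needs_cap = setreuid_needs_cap_setuid(cred, KEEP, 0)   # jump to uid 0
```

In this model only the second call needs CAP_SETUID, so only that kind
of transition would ever be checked against the whitelist.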

>
> >   It might be...  If one of the uid-restricted
> > tasks running with CAP_SETUID runs a filter over some malicious data
> > which forces it to run a program which intends to change its uid and
> > fails to detect that that failed.  It's not quite as cut-and-dried
> > though, and if we simply do not allow uid 0 to be in the set of uids,
> > that may prevent any such cases.
>
> Alas, UID 0 is not the only case we have to worry about.
> If I run a program owned by tss (Trousers) with the setuid
> bit set it will change the effective UID to tss. If this
> program expects to switch effective UID back to me and
> the SafeSetID whitelist prevents it, Bad Things may happen
> even though no capabilities or root privilege were ever
> involved.
>
> It would be easy for an inexperienced or malicious admin to
> include cschaufler:tss in the whitelist but miss on adding
> tss:cschaufler.
>

Same as above, this LSM will only affect transitions that would need
CAP_SETUID. AFAICT switching the effective UID back after that
setuid-bit scenario is not something that requires CAP_SETUID, and
thus would continue to work as it always has in Linux.
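
To make the setuid-bit scenario concrete, here is a hedged Python
sketch of the decision as described in this thread, for a uid that is
restricted by policy. The uids (1000 for the user, 59 for tss), the
Cred type, the function name, and keying the whitelist lookup on the
real uid are all assumptions for illustration; this is not the logic in
lsm.c.

```python
from typing import NamedTuple, Set, Tuple


class Cred(NamedTuple):
    ruid: int  # real uid
    euid: int  # effective uid
    suid: int  # saved set-user-ID


# One-way (from_uid, to_uid) entries; the reverse direction is NOT implied.
Whitelist = Set[Tuple[int, int]]


def seteuid_allowed(cred: Cred, new_euid: int, has_cap_setuid: bool,
                    whitelist: Whitelist) -> bool:
    """Model of seteuid() for a uid that has whitelist restrictions."""
    # Transitions the kernel permits without CAP_SETUID never reach
    # the whitelist check.
    if new_euid in (cred.ruid, cred.euid, cred.suid):
        return True
    # Everything else still requires CAP_SETUID ...
    if not has_cap_setuid:
        return False
    # ... and, for a restricted uid, a matching one-way whitelist entry.
    return (cred.ruid, new_euid) in whitelist


# Whitelist has user -> tss but the admin forgot tss -> user.
whitelist = {(1000, 59)}

# User 1000 runs a setuid-tss binary, which then switches back to the user.
back_to_user = seteuid_allowed(Cred(1000, 59, 59), 1000, False, whitelist)
```

Here back_to_user is True even though (59, 1000) is missing from the
whitelist, because the switch back never needed CAP_SETUID in the first
place.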

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 16:11     ` Micah Morton
@ 2018-11-01 16:22       ` Micah Morton
  2018-11-01 16:41       ` Micah Morton
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-01 16:22 UTC (permalink / raw)
  To: serge; +Cc: jmorris, Kees Cook, linux-security-module

On Thu, Nov 1, 2018 at 9:11 AM Micah Morton <mortonm@chromium.org> wrote:
>
> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >
> > On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> > > Quoting mortonm@chromium.org (mortonm@chromium.org):
> > > > From: Micah Morton <mortonm@chromium.org>
> > > >
> > > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > > transitions from a given UID/GID to only those approved by a
> > > > system-wide whitelist. These restrictions also prohibit the given
> > > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > > mappings. For now, only gating the set*uid family of syscalls is
> > > > supported, with support for set*gid coming in a future patch set.
> > > >
> > > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > > ---
> > > >
> > > > NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> > > > code that likely needs improvement before being an acceptable approach.
> > > > I'm specifically interested to see if there are better ideas for how
> > > > this could be done.
> > > >
> > > >  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> > > >  Documentation/admin-guide/LSM/index.rst     |   1 +
> > > >  arch/Kconfig                                |   5 +
> > > >  arch/arm/Kconfig                            |   1 +
> > > >  arch/arm64/Kconfig                          |   1 +
> > > >  arch/x86/Kconfig                            |   1 +
> > > >  security/Kconfig                            |   1 +
> > > >  security/Makefile                           |   2 +
> > > >  security/safesetid/Kconfig                  |  13 +
> > > >  security/safesetid/Makefile                 |   7 +
> > > >  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> > > >  security/safesetid/lsm.h                    |  30 ++
> > > >  security/safesetid/securityfs.c             | 189 +++++++++++
> > > >  13 files changed, 679 insertions(+)
> > > >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> > > >  create mode 100644 security/safesetid/Kconfig
> > > >  create mode 100644 security/safesetid/Makefile
> > > >  create mode 100644 security/safesetid/lsm.c
> > > >  create mode 100644 security/safesetid/lsm.h
> > > >  create mode 100644 security/safesetid/securityfs.c
> > > >
> > > > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > new file mode 100644
> > > > index 000000000000..e7d072124424
> > > > --- /dev/null
> > > > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > @@ -0,0 +1,94 @@
> > > > +=========
> > > > +SafeSetID
> > > > +=========
> > > > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > > > +UID/GID transitions from a given UID/GID to only those approved by a
> > > > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > > > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > > > +allowing a user to set up user namespace UID mappings.
> > > > +
> > > > +
> > > > +Background
> > > > +==========
> > > > +In absence of file capabilities, processes spawned on a Linux system that need
> > > > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > > > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > > > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > > > +often preferable to use Linux runtime capabilities rather than file
> > > > +capabilities, since using file capabilities to run a program with elevated
> > > > +privileges opens up possible security holes since any user with access to the
> > > > +file can exec() that program to gain the elevated privileges.
> > >
> > > Not true, see inheritable capabilities.  You also might look at ambient
> > > capabilities.
> >
> > So for example with pam_cap.so you could have your N uids each be given
> > the desired pI, and assign the corresponding fIs to the files they should
> > be able to exec with privilege.  No other uids will run those files with
> > privilege.  *1
>
> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
>
> >
> > Can you give some more details about exactly how you see SafeSetID being
> > used?
>
> Sure. The main use case for this LSM is to allow a non-root program to
> transition to other untrusted uids without full blown CAP_SETUID
> capabilities. The non-root program would still need CAP_SETUID to do
> any kind of transition, but the additional restrictions imposed by
> this LSM would mean it is a "safer" version of CAP_SETUID since the
> non-root program cannot take advantage of CAP_SETUID to do any
> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> namespace). The higher level goal is to allow for uid-based sandboxing
> of system services without having to give out CAP_SETUID all over the
> place just so that non-root programs can drop to
> even-further-non-privileged uids. This is especially relevant when one
> non-root daemon on the system should be allowed to spawn other
> processes as different uids, but it's undesirable to give the daemon a
> basically-root-equivalent CAP_SETUID.
>
> >
> > I'm still not quite clear on whether you want N completely unprivileged
> > uids to be used by some user (i.e. uid 1000), or whether one or more of
> > those should also have some privilege, or whether one of the uids might
> > or might not be uid 0.  Years ago I used to use N separate uids to
> > somewhat segregate workloads on my laptop, and I'd like my browser to
> > do something like that.  Is that the kind of uid switching you have
> > in mind?
>
> "N completely unprivileged uids to be used by some user (i.e. uid
> 1000)" is the closest description of what this LSM has in mind. For
> example, uid 123 is some system service that needs runtime
> capabilities X, Y and Z and a bunch of DBus permissions associated
> with uid 123, but also wants to spawn another program without any of
> these capabilities/permissions. In this case we would like to avoid
> giving the system service CAP_SETUID.

To clarify: "spawn another program *as a different uid* without any..."

>
> >
> > -serge
> >
> > *1 And maybe with one of the p9auth/factotum proposals out there you
> > could have a userspace daemon hand out the tokens for setuid, but that's
> > getting "out there" and probably derailing this conversation :)

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 16:11     ` Micah Morton
  2018-11-01 16:22       ` Micah Morton
@ 2018-11-01 16:41       ` Micah Morton
  2018-11-01 17:08       ` Casey Schaufler
  2018-11-06 20:59       ` [PATCH] " James Morris
  3 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-01 16:41 UTC (permalink / raw)
  To: serge; +Cc: jmorris, Kees Cook, linux-security-module

On Thu, Nov 1, 2018 at 9:11 AM Micah Morton <mortonm@chromium.org> wrote:
>
> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >
> > On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> > > Quoting mortonm@chromium.org (mortonm@chromium.org):
> > > > From: Micah Morton <mortonm@chromium.org>
> > > >
> > > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > > transitions from a given UID/GID to only those approved by a
> > > > system-wide whitelist. These restrictions also prohibit the given
> > > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > > mappings. For now, only gating the set*uid family of syscalls is
> > > > supported, with support for set*gid coming in a future patch set.
> > > >
> > > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > > ---
> > > >
> > > > NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> > > > code that likely needs improvement before being an acceptable approach.
> > > > I'm specifically interested to see if there are better ideas for how
> > > > this could be done.
> > > >
> > > >  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> > > >  Documentation/admin-guide/LSM/index.rst     |   1 +
> > > >  arch/Kconfig                                |   5 +
> > > >  arch/arm/Kconfig                            |   1 +
> > > >  arch/arm64/Kconfig                          |   1 +
> > > >  arch/x86/Kconfig                            |   1 +
> > > >  security/Kconfig                            |   1 +
> > > >  security/Makefile                           |   2 +
> > > >  security/safesetid/Kconfig                  |  13 +
> > > >  security/safesetid/Makefile                 |   7 +
> > > >  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> > > >  security/safesetid/lsm.h                    |  30 ++
> > > >  security/safesetid/securityfs.c             | 189 +++++++++++
> > > >  13 files changed, 679 insertions(+)
> > > >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> > > >  create mode 100644 security/safesetid/Kconfig
> > > >  create mode 100644 security/safesetid/Makefile
> > > >  create mode 100644 security/safesetid/lsm.c
> > > >  create mode 100644 security/safesetid/lsm.h
> > > >  create mode 100644 security/safesetid/securityfs.c
> > > >
> > > > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > new file mode 100644
> > > > index 000000000000..e7d072124424
> > > > --- /dev/null
> > > > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > @@ -0,0 +1,94 @@
> > > > +=========
> > > > +SafeSetID
> > > > +=========
> > > > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > > > +UID/GID transitions from a given UID/GID to only those approved by a
> > > > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > > > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > > > +allowing a user to set up user namespace UID mappings.
> > > > +
> > > > +
> > > > +Background
> > > > +==========
> > > > +In absence of file capabilities, processes spawned on a Linux system that need
> > > > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > > > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > > > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > > > +often preferable to use Linux runtime capabilities rather than file
> > > > +capabilities, since using file capabilities to run a program with elevated
> > > > +privileges opens up possible security holes since any user with access to the
> > > > +file can exec() that program to gain the elevated privileges.
> > >
> > > Not true, see inheritable capabilities.  You also might look at ambient
> > > capabilities.
> >
> > So for example with pam_cap.so you could have your N uids each be given
> > the desired pI, and assign the corresponding fIs to the files they should
> > be able to exec with privilege.  No other uids will run those files with
> > privilege.  *1
>
> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
>
> >
> > Can you give some more details about exactly how you see SafeSetID being
> > used?
>
> Sure. The main use case for this LSM is to allow a non-root program to
> transition to other untrusted uids without full blown CAP_SETUID
> capabilities. The non-root program would still need CAP_SETUID to do
> any kind of transition, but the additional restrictions imposed by
> this LSM would mean it is a "safer" version of CAP_SETUID since the
> non-root program cannot take advantage of CAP_SETUID to do any
> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> namespace). The higher level goal is to allow for uid-based sandboxing
> of system services without having to give out CAP_SETUID all over the
> place just so that non-root programs can drop to
> even-further-non-privileged uids. This is especially relevant when one
> non-root daemon on the system should be allowed to spawn other
> processes as different uids, but it's undesirable to give the daemon a
> basically-root-equivalent CAP_SETUID.
>
> >
> > I'm still not quite clear on whether you want N completely unprivileged
> > uids to be used by some user (i.e. uid 1000), or whether one or more of
> > those should also have some privilege, or whether one of the uids might
> > or might not be uid 0.  Years ago I used to use N separate uids to
> > somewhat segregate workloads on my laptop, and I'd like my browser to
> > do something like that.  Is that the kind of uid switching you have
> > in mind?
>
> "N completely unprivileged uids to be used by some user (i.e. uid
> 1000)" is the closest description of what this LSM has in mind. For
> example, uid 123 is some system service that needs runtime
> capabilities X, Y and Z and a bunch of DBus permissions associated
> with uid 123, but also wants to spawn another program without any of
> these capabilities/permissions. In this case we would like to avoid
> giving the system service CAP_SETUID.

Another clarification: "avoid giving the system service
_full/unrestricted_ CAP_SETUID capabilities"

>
> >
> > -serge
> >
> > *1 And maybe with one of the p9auth/factotum proposals out there you
> > could have a userspace daemon hand out the tokens for setuid, but that's
> > getting "out there" and probably derailing this conversation :)

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 16:11     ` Micah Morton
  2018-11-01 16:22       ` Micah Morton
  2018-11-01 16:41       ` Micah Morton
@ 2018-11-01 17:08       ` Casey Schaufler
  2018-11-01 19:52         ` Micah Morton
  2018-11-06 20:59       ` [PATCH] " James Morris
  3 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2018-11-01 17:08 UTC (permalink / raw)
  To: Micah Morton, serge; +Cc: jmorris, Kees Cook, linux-security-module

On 11/1/2018 9:11 AM, Micah Morton wrote:
> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
>>>> From: Micah Morton <mortonm@chromium.org>
>>>>
>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
>>>> transitions from a given UID/GID to only those approved by a
>>>> system-wide whitelist. These restrictions also prohibit the given
>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
>>>> mappings. For now, only gating the set*uid family of syscalls is
>>>> supported, with support for set*gid coming in a future patch set.
>>>>
>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>>>> ---
>>>>
>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
>>>> code that likely needs improvement before being an acceptable approach.
>>>> I'm specifically interested to see if there are better ideas for how
>>>> this could be done.
>>>>
>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
>>>>  arch/Kconfig                                |   5 +
>>>>  arch/arm/Kconfig                            |   1 +
>>>>  arch/arm64/Kconfig                          |   1 +
>>>>  arch/x86/Kconfig                            |   1 +
>>>>  security/Kconfig                            |   1 +
>>>>  security/Makefile                           |   2 +
>>>>  security/safesetid/Kconfig                  |  13 +
>>>>  security/safesetid/Makefile                 |   7 +
>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
>>>>  security/safesetid/lsm.h                    |  30 ++
>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
>>>>  13 files changed, 679 insertions(+)
>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>>>>  create mode 100644 security/safesetid/Kconfig
>>>>  create mode 100644 security/safesetid/Makefile
>>>>  create mode 100644 security/safesetid/lsm.c
>>>>  create mode 100644 security/safesetid/lsm.h
>>>>  create mode 100644 security/safesetid/securityfs.c
>>>>
>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>> new file mode 100644
>>>> index 000000000000..e7d072124424
>>>> --- /dev/null
>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>> @@ -0,0 +1,94 @@
>>>> +=========
>>>> +SafeSetID
>>>> +=========
>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
>>>> +UID/GID transitions from a given UID/GID to only those approved by a
>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
>>>> +allowing a user to set up user namespace UID mappings.
>>>> +
>>>> +
>>>> +Background
>>>> +==========
>>>> +In absence of file capabilities, processes spawned on a Linux system that need
>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
>>>> +often preferable to use Linux runtime capabilities rather than file
>>>> +capabilities, since using file capabilities to run a program with elevated
>>>> +privileges opens up possible security holes since any user with access to the
>>>> +file can exec() that program to gain the elevated privileges.
>>> Not true, see inheritable capabilities.  You also might look at ambient
>>> capabilities.
>> So for example with pam_cap.so you could have your N uids each be given
>> the desired pI, and assign the corresponding fIs to the files they should
>> be able to exec with privilege.  No other uids will run those files with
>> privilege.  *1
> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
>
>> Can you give some more details about exactly how you see SafeSetID being
>> used?
> Sure. The main use case for this LSM is to allow a non-root program to
> transition to other untrusted uids without full blown CAP_SETUID
> capabilities. The non-root program would still need CAP_SETUID to do
> any kind of transition, but the additional restrictions imposed by
> this LSM would mean it is a "safer" version of CAP_SETUID since the
> non-root program cannot take advantage of CAP_SETUID to do any
> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> namespace). The higher level goal is to allow for uid-based sandboxing
> of system services without having to give out CAP_SETUID all over the
> place just so that non-root programs can drop to
> even-further-non-privileged uids. This is especially relevant when one
> non-root daemon on the system should be allowed to spawn other
> processes as different uids, but it's undesirable to give the daemon a
> basically-root-equivalent CAP_SETUID.

I don't want to sound stupid(er than usual), but it sounds like
you could do all this using setuid bits prudently. Based on this
description, I don't see that anything new is needed.

>> I'm still not quite clear on whether you want N completely unprivileged
>> uids to be used by some user (i.e. uid 1000), or whether one or more of
>> those should also have some privilege, or whether one of the uids might
>> or might not be uid 0.  Years ago I used to use N separate uids to
>> somewhat segregate workloads on my laptop, and I'd like my browser to
>> do something like that.  Is that the kind of uid switching you have
>> in mind?
> "N completely unprivileged uids to be used by some user (i.e. uid
> 1000)" is the closest description of what this LSM has in mind. For
> example, uid 123 is some system service that needs runtime
> capabilities X, Y and Z and a bunch of DBus permissions associated
> with uid 123, but also wants to spawn another program without any of
> these capabilities/permissions. In this case we would like to avoid
> giving the system service CAP_SETUID.
>
>> -serge
>>
>> *1 And maybe with one of the p9auth/factotum proposals out there you
>> could have a userspace daemon hand out the tokens for setuid, but that's
>> getting "out there" and probably derailing this conversation :)


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 17:08       ` Casey Schaufler
@ 2018-11-01 19:52         ` Micah Morton
  2018-11-02 16:05           ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2018-11-01 19:52 UTC (permalink / raw)
  To: casey; +Cc: serge, jmorris, Kees Cook, linux-security-module

On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 11/1/2018 9:11 AM, Micah Morton wrote:
> > On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> >>> Quoting mortonm@chromium.org (mortonm@chromium.org):
> >>>> From: Micah Morton <mortonm@chromium.org>
> >>>>
> >>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
> >>>> transitions from a given UID/GID to only those approved by a
> >>>> system-wide whitelist. These restrictions also prohibit the given
> >>>> UIDs/GIDs from obtaining auxiliary privileges associated with
> >>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> >>>> mappings. For now, only gating the set*uid family of syscalls is
> >>>> supported, with support for set*gid coming in a future patch set.
> >>>>
> >>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >>>> ---
> >>>>
> >>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> >>>> code that likely needs improvement before being an acceptable approach.
> >>>> I'm specifically interested to see if there are better ideas for how
> >>>> this could be done.
> >>>>
> >>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
> >>>>  arch/Kconfig                                |   5 +
> >>>>  arch/arm/Kconfig                            |   1 +
> >>>>  arch/arm64/Kconfig                          |   1 +
> >>>>  arch/x86/Kconfig                            |   1 +
> >>>>  security/Kconfig                            |   1 +
> >>>>  security/Makefile                           |   2 +
> >>>>  security/safesetid/Kconfig                  |  13 +
> >>>>  security/safesetid/Makefile                 |   7 +
> >>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >>>>  security/safesetid/lsm.h                    |  30 ++
> >>>>  security/safesetid/securityfs.c             | 189 +++++++++++
> >>>>  13 files changed, 679 insertions(+)
> >>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>  create mode 100644 security/safesetid/Kconfig
> >>>>  create mode 100644 security/safesetid/Makefile
> >>>>  create mode 100644 security/safesetid/lsm.c
> >>>>  create mode 100644 security/safesetid/lsm.h
> >>>>  create mode 100644 security/safesetid/securityfs.c
> >>>>
> >>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>> new file mode 100644
> >>>> index 000000000000..e7d072124424
> >>>> --- /dev/null
> >>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>> @@ -0,0 +1,94 @@
> >>>> +=========
> >>>> +SafeSetID
> >>>> +=========
> >>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> >>>> +UID/GID transitions from a given UID/GID to only those approved by a
> >>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> >>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> >>>> +allowing a user to set up user namespace UID mappings.
> >>>> +
> >>>> +
> >>>> +Background
> >>>> +==========
> >>>> +In absence of file capabilities, processes spawned on a Linux system that need
> >>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
> >>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
> >>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> >>>> +often preferable to use Linux runtime capabilities rather than file
> >>>> +capabilities, since using file capabilities to run a program with elevated
> >>>> +privileges opens up possible security holes since any user with access to the
> >>>> +file can exec() that program to gain the elevated privileges.
> >>> Not true, see inheritable capabilities.  You also might look at ambient
> >>> capabilities.
> >> So for example with pam_cap.so you could have your N uids each be given
> >> the desired pI, and assign the corresponding fIs to the files they should
> >> be able to exec with privilege.  No other uids will run those files with
> >> privilege.  *1
> > Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
> >
> >> Can you give some more details about exactly how you see SafeSetID being
> >> used?
> > Sure. The main use case for this LSM is to allow a non-root program to
> > transition to other untrusted uids without full blown CAP_SETUID
> > capabilities. The non-root program would still need CAP_SETUID to do
> > any kind of transition, but the additional restrictions imposed by
> > this LSM would mean it is a "safer" version of CAP_SETUID since the
> > non-root program cannot take advantage of CAP_SETUID to do any
> > unapproved actions (i.e. setuid to uid 0 or create/enter new user
> > namespace). The higher level goal is to allow for uid-based sandboxing
> > of system services without having to give out CAP_SETUID all over the
> > place just so that non-root programs can drop to
> > even-further-non-privileged uids. This is especially relevant when one
> > non-root daemon on the system should be allowed to spawn other
> > processes as different uids, but it's undesirable to give the daemon a
> > basically-root-equivalent CAP_SETUID.
>
> I don't want to sound stupid(er than usual), but it sounds like
> you could do all this using setuid bits prudently. Based on this
> description, I don't see that anything new is needed.

Setuid bits don't always get the job done: many programs just want to
call setuid as part of their execution (or fork + setuid without exec),
instead of fork/exec'ing a setuid binary. Take the following scenario for
example: init script (as root) spawns a network manager program as uid
1000 and then the network manager spawns OpenVPN. The common mode of
operation for OpenVPN is to start running as the uid it was spawned
with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
2000) after initialization/setup by calling setuid. This is something
setuid bits wouldn't help with, without refactoring OpenVPN. So one
option here is to give the network manager CAP_SETUID, which will be
inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
CAP_SETUID (would probably require patching OpenVPN for the capability
dropping). The problem here is that if the network manager itself is
untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
big security risk. Even just sticking with the network manager / VPN
example, strongSwan VPN also uses the same drop-to-user-through-setuid
setup, as do other Linux applications. Refactoring these applications
to fork/exec setuid binaries instead of simply calling setuid is often
infeasible. So a direct call to setuid is often necessary/expected,
and setuid bits don't help here.
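
For illustration, the drop-to-user pattern described above might look
roughly like the following. This is a minimal sketch and is not taken from
OpenVPN or strongSwan; the function name drop_to_uid() is hypothetical, and
group handling (setgid()/setgroups()) is left out to keep the focus on the
UID transition.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch the calling process to target_uid after
 * initialization is done. Returns 0 on success, -1 on failure.
 */
int drop_to_uid(uid_t target_uid)
{
	/*
	 * Without CAP_SETUID, setuid() only succeeds if target_uid is
	 * already the real or saved uid of the process.
	 */
	if (setuid(target_uid) != 0) {
		perror("setuid");
		return -1;
	}

	/* Check that the real and effective uids are what we asked for. */
	if (getuid() != target_uid || geteuid() != target_uid)
		return -1;

	/* A non-root target must not be able to switch back to uid 0. */
	if (target_uid != 0 && setuid(0) == 0)
		return -1;

	return 0;
}
```

In the scenario above, the daemon would call drop_to_uid(2000) once its
setup is finished, which is exactly the call that needs CAP_SETUID when
uid 2000 is not already one of the process's own uids.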

Also, use of setuid bits precludes the use of the no_new_privs bit,
which is usually at least a nice-to-have (if not need-to-have) for
sandboxed processes on the system.
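
As a point of reference, no_new_privs is set per thread through prctl(2).
Once set, it is inherited across fork and exec and cannot be cleared, and
execve() will no longer grant privileges through setuid bits or file
capabilities. A minimal sketch (the two wrapper names are hypothetical):

```c
#include <sys/prctl.h>

/* Set no_new_privs for the calling thread; 0 on success, -1 on error. */
int enable_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Query the bit: 1 if set, 0 if not set, -1 on error. */
int no_new_privs_enabled(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```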

>
> >> I'm still not quite clear on whether you want N completely unprivileged
> >> uids to be used by some user (i.e. uid 1000), or whether one or more of
> >> those should also have some privilege, or whether one of the uids might
> >> or might not be uid 0.  Years ago I used to use N separate uids to
> >> somewhat segregate workloads on my laptop, and I'd like my browser to
> >> do something like that.  Is that the kind of uid switching you have
> >> in mind?
> > "N completely unprivileged uids to be used by some user (i.e. uid
> > 1000)" is the closest description of what this LSM has in mind. For
> > example, uid 123 is some system service that needs runtime
> > capabilities X, Y and Z and a bunch of DBus permissions associated
> > with uid 123, but also wants to spawn another program without any of
> > these capabilities/permissions. In this case we would like to avoid
> > giving the system service CAP_SETUID.
> >
> >> -serge
> >>
> >> *1 And maybe with one of the p9auth/factotum proposals out there you
> >> could have a userspace daemon hand out the tokens for setuid, but that's
> >> getting "out there" and probably derailing this conversation :)
>


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 19:52         ` Micah Morton
@ 2018-11-02 16:05           ` Casey Schaufler
  2018-11-02 17:12             ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2018-11-02 16:05 UTC (permalink / raw)
  To: Micah Morton
  Cc: serge, jmorris, Kees Cook, linux-security-module, Casey Schaufler

On 11/1/2018 12:52 PM, Micah Morton wrote:
> On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 11/1/2018 9:11 AM, Micah Morton wrote:
>>> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>>>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
>>>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
>>>>>> From: Micah Morton <mortonm@chromium.org>
>>>>>>
>>>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
>>>>>> transitions from a given UID/GID to only those approved by a
>>>>>> system-wide whitelist. These restrictions also prohibit the given
>>>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
>>>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
>>>>>> mappings. For now, only gating the set*uid family of syscalls is
>>>>>> supported, with support for set*gid coming in a future patch set.
>>>>>>
>>>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>>>>>> ---
>>>>>>
>>>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
>>>>>> code that likely needs improvement before being an acceptable approach.
>>>>>> I'm specifically interested to see if there are better ideas for how
>>>>>> this could be done.
>>>>>>
>>>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
>>>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
>>>>>>  arch/Kconfig                                |   5 +
>>>>>>  arch/arm/Kconfig                            |   1 +
>>>>>>  arch/arm64/Kconfig                          |   1 +
>>>>>>  arch/x86/Kconfig                            |   1 +
>>>>>>  security/Kconfig                            |   1 +
>>>>>>  security/Makefile                           |   2 +
>>>>>>  security/safesetid/Kconfig                  |  13 +
>>>>>>  security/safesetid/Makefile                 |   7 +
>>>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
>>>>>>  security/safesetid/lsm.h                    |  30 ++
>>>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
>>>>>>  13 files changed, 679 insertions(+)
>>>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>>  create mode 100644 security/safesetid/Kconfig
>>>>>>  create mode 100644 security/safesetid/Makefile
>>>>>>  create mode 100644 security/safesetid/lsm.c
>>>>>>  create mode 100644 security/safesetid/lsm.h
>>>>>>  create mode 100644 security/safesetid/securityfs.c
>>>>>>
>>>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>> new file mode 100644
>>>>>> index 000000000000..e7d072124424
>>>>>> --- /dev/null
>>>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>> @@ -0,0 +1,94 @@
>>>>>> +=========
>>>>>> +SafeSetID
>>>>>> +=========
>>>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
>>>>>> +UID/GID transitions from a given UID/GID to only those approved by a
>>>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
>>>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
>>>>>> +allowing a user to set up user namespace UID mappings.
>>>>>> +
>>>>>> +
>>>>>> +Background
>>>>>> +==========
>>>>>> +In absence of file capabilities, processes spawned on a Linux system that need
>>>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
>>>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
>>>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
>>>>>> +often preferable to use Linux runtime capabilities rather than file
>>>>>> +capabilities, since using file capabilities to run a program with elevated
>>>>>> +privileges opens up possible security holes since any user with access to the
>>>>>> +file can exec() that program to gain the elevated privileges.
>>>>> Not true, see inheritable capabilities.  You also might look at ambient
>>>>> capabilities.
>>>> So for example with pam_cap.so you could have your N uids each be given
>>>> the desired pI, and assign the corresponding fIs to the files they should
>>>> be able to exec with privilege.  No other uids will run those files with
>>>> privilege.  *1
>>> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
>>>
>>>> Can you give some more details about exactly how you see SafeSetID being
>>>> used?
>>> Sure. The main use case for this LSM is to allow a non-root program to
>>> transition to other untrusted uids without full blown CAP_SETUID
>>> capabilities. The non-root program would still need CAP_SETUID to do
>>> any kind of transition, but the additional restrictions imposed by
>>> this LSM would mean it is a "safer" version of CAP_SETUID since the
>>> non-root program cannot take advantage of CAP_SETUID to do any
>>> unapproved actions (i.e. setuid to uid 0 or create/enter new user
>>> namespace). The higher level goal is to allow for uid-based sandboxing
>>> of system services without having to give out CAP_SETUID all over the
>>> place just so that non-root programs can drop to
>>> even-further-non-privileged uids. This is especially relevant when one
>>> non-root daemon on the system should be allowed to spawn other
>>> processes as different uids, but it's undesirable to give the daemon a
>>> basically-root-equivalent CAP_SETUID.
>> I don't want to sound stupid(er than usual), but it sounds like
>> you could do all this using setuid bits prudently. Based on this
>> description, I don't see that anything new is needed.
> There are situations where setuid bits don't get the job done, as
> there are many situations where a program just wants to call setuid as
> part of its execution (or fork + setuid without exec), instead of
> fork/exec'ing a setuid binary.

Yes, I understand that.

> Take the following scenario for
> example: init script (as root) spawns a network manager program as uid
> 1000

So far, so good.

> and then the network manager spawns OpenVPN. The common mode of
> operation for OpenVPN is to start running as the uid it was spawned
> with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
> 2000) after initialization/setup by calling setuid.

OK. That's an operation that does and ought to require privilege.

> This is something
> setuid bits wouldn't help with, without refactoring OpenVPN.

You're correct.

> So one
> option here is to give the network manager CAP_SETUID, which will be
> inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
> CAP_SETUID (would probably require patching OpenVPN for the capability
> dropping).

Or, you put CAP_SETUID on the file capabilities for OpenVPN,
which is the way the P1003.1e DRAFT specification would have
you accomplish this. Unfortunately, with all the changes made
to capabilities for namespaces and all I'm not 100% sure I
could say exactly how to set that.

> The problem here is that if the network manager itself is
> untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
> big security risk.

Right. That's why you set the file capabilities on OpenVPN.

> Even just sticking with the network manager / VPN
> example, strongSwan VPN also uses the same drop-to-user-through-setuid
> setup, as do other Linux applications.

Same solution.

> Refactoring these applications
> to fork/exec setuid binaries instead of simply calling setuid is often
> infeasible. So a direct call to setuid is often necessary/expected,
> and setuid bits don't help here.

What is it with kids these days, that they are so
afraid of fixing code that needs fixing? But that's
not necessary in this example.

> Also, use of setuid bits precludes the use of the no_new_privs bit,
> which is usually at least a nice-to-have (if not need-to-have) for
> sandboxed processes on the system.

But you've already said that you *want* to change the security state,
"drop to a lesser-privileged uid", so you're already mucking with the
sandbox. If you're going to say that changing UIDs doesn't count for
sandboxing I'll point out that you brought up the notion of a
lesser-privileged UID.



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 16:05           ` Casey Schaufler
@ 2018-11-02 17:12             ` Micah Morton
  2018-11-02 18:19               ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2018-11-02 17:12 UTC (permalink / raw)
  To: casey; +Cc: serge, jmorris, Kees Cook, linux-security-module

On Fri, Nov 2, 2018 at 9:05 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 11/1/2018 12:52 PM, Micah Morton wrote:
> > On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 11/1/2018 9:11 AM, Micah Morton wrote:
> >>> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >>>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> >>>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
> >>>>>> From: Micah Morton <mortonm@chromium.org>
> >>>>>>
> >>>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
> >>>>>> transitions from a given UID/GID to only those approved by a
> >>>>>> system-wide whitelist. These restrictions also prohibit the given
> >>>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
> >>>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> >>>>>> mappings. For now, only gating the set*uid family of syscalls is
> >>>>>> supported, with support for set*gid coming in a future patch set.
> >>>>>>
> >>>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >>>>>> ---
> >>>>>>
> >>>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> >>>>>> code that likely needs improvement before being an acceptable approach.
> >>>>>> I'm specifically interested to see if there are better ideas for how
> >>>>>> this could be done.
> >>>>>>
> >>>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >>>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
> >>>>>>  arch/Kconfig                                |   5 +
> >>>>>>  arch/arm/Kconfig                            |   1 +
> >>>>>>  arch/arm64/Kconfig                          |   1 +
> >>>>>>  arch/x86/Kconfig                            |   1 +
> >>>>>>  security/Kconfig                            |   1 +
> >>>>>>  security/Makefile                           |   2 +
> >>>>>>  security/safesetid/Kconfig                  |  13 +
> >>>>>>  security/safesetid/Makefile                 |   7 +
> >>>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >>>>>>  security/safesetid/lsm.h                    |  30 ++
> >>>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
> >>>>>>  13 files changed, 679 insertions(+)
> >>>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>  create mode 100644 security/safesetid/Kconfig
> >>>>>>  create mode 100644 security/safesetid/Makefile
> >>>>>>  create mode 100644 security/safesetid/lsm.c
> >>>>>>  create mode 100644 security/safesetid/lsm.h
> >>>>>>  create mode 100644 security/safesetid/securityfs.c
> >>>>>>
> >>>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>> new file mode 100644
> >>>>>> index 000000000000..e7d072124424
> >>>>>> --- /dev/null
> >>>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>> @@ -0,0 +1,94 @@
> >>>>>> +=========
> >>>>>> +SafeSetID
> >>>>>> +=========
> >>>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> >>>>>> +UID/GID transitions from a given UID/GID to only those approved by a
> >>>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> >>>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> >>>>>> +allowing a user to set up user namespace UID mappings.
> >>>>>> +
> >>>>>> +
> >>>>>> +Background
> >>>>>> +==========
> >>>>>> +In absence of file capabilities, processes spawned on a Linux system that need
> >>>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
> >>>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
> >>>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> >>>>>> +often preferable to use Linux runtime capabilities rather than file
> >>>>>> +capabilities, since using file capabilities to run a program with elevated
> >>>>>> +privileges opens up possible security holes since any user with access to the
> >>>>>> +file can exec() that program to gain the elevated privileges.
> >>>>> Not true, see inheritable capabilities.  You also might look at ambient
> >>>>> capabilities.
> >>>> So for example with pam_cap.so you could have your N uids each be given
> >>>> the desired pI, and assign the corresponding fIs to the files they should
> >>>> be able to exec with privilege.  No other uids will run those files with
> >>>> privilege.  *1
> >>> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
> >>>
> >>>> Can you give some more details about exactly how you see SafeSetID being
> >>>> used?
> >>> Sure. The main use case for this LSM is to allow a non-root program to
> >>> transition to other untrusted uids without full blown CAP_SETUID
> >>> capabilities. The non-root program would still need CAP_SETUID to do
> >>> any kind of transition, but the additional restrictions imposed by
> >>> this LSM would mean it is a "safer" version of CAP_SETUID since the
> >>> non-root program cannot take advantage of CAP_SETUID to do any
> >>> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> >>> namespace). The higher level goal is to allow for uid-based sandboxing
> >>> of system services without having to give out CAP_SETUID all over the
> >>> place just so that non-root programs can drop to
> >>> even-further-non-privileged uids. This is especially relevant when one
> >>> non-root daemon on the system should be allowed to spawn other
> >>> processes as different uids, but it's undesirable to give the daemon a
> >>> basically-root-equivalent CAP_SETUID.
> >> I don't want to sound stupid(er than usual), but it sounds like
> >> you could do all this using setuid bits prudently. Based on this
> >> description, I don't see that anything new is needed.
> > There are situations where setuid bits don't get the job done, as
> > there are many situations where a program just wants to call setuid as
> > part of its execution (or fork + setuid without exec), instead of
> > fork/exec'ing a setuid binary.
>
> Yes, I understand that.
>
> > Take the following scenario for
> > example: init script (as root) spawns a network manager program as uid
> > 1000
>
> So far, so good.
>
> > and then the network manager spawns OpenVPN. The common mode of
> > operation for OpenVPN is to start running as the uid it was spawned
> > with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
> > 2000) after initialization/setup by calling setuid.
>
> OK. That's an operation that does and ought to require privilege.

Sure, but the idea behind this LSM is that full CAP_SETUID
capabilities are a lot more privilege than is necessary in this
scenario.

>
> > This is something
> > setuid bits wouldn't help with, without refactoring OpenVPN.
>
> You're correct.
>
> > So one
> > option here is to give the network manager CAP_SETUID, which will be
> > inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
> > CAP_SETUID (would probably require patching OpenVPN for the capability
> > dropping).
>
> Or, you put CAP_SETUID on the file capabilities for OpenVPN,
> which is the way the P1003.1e DRAFT specification would have
> you accomplish this. Unfortunately, with all the changes made
> to capabilities for namespaces and all I'm not 100% sure I
> could say exactly how to set that.
>
> > The problem here is that if the network manager itself is
> > untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
> > big security risk.
>
> Right. That's why you set the file capabilities on OpenVPN.

So it seems like you're suggesting that any time a program needs to
switch user by calling setuid, it should get full CAP_SETUID
capabilities (whether that's through setting file capabilities on the
binary or inheriting CAP_SETUID from a parent process or otherwise).
But that brings us back to the basic problem this LSM is trying to
solve. Namely, we don't want to sprinkle unrestricted CAP_SETUID privs
all over the system for binaries that just want to switch to specific
uid[s] and don't need any of the root-equivalent privileges provided
by CAP_SETUID.
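
To make the "specific uid[s]" idea concrete, here is a small userspace
model of the whitelist check. It is only a sketch of the semantics as I
read them from the patch description (entries in the '<UID>:<UID>' format,
and a UID with no entries is left unrestricted by this check); it is not
the code in lsm.c, and every name in it is hypothetical.

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* One whitelist entry: processes running as 'from' may switch to 'to'. */
struct uid_transition {
	unsigned int from;
	unsigned int to;
};

/* Parse a run of decimal digits at *p into *uid and advance *p past it. */
static int parse_uid(const char **p, unsigned int *uid)
{
	unsigned long long v = 0;
	const char *s = *p;

	if (!isdigit((unsigned char)*s))
		return -1;
	while (isdigit((unsigned char)*s)) {
		v = v * 10 + (unsigned long long)(*s - '0');
		if (v > UINT_MAX)
			return -1;
		s++;
	}
	*uid = (unsigned int)v;
	*p = s;
	return 0;
}

/* Parse "<UID>:<UID>" (e.g. "123:456"); 0 on success, -1 if malformed. */
int parse_policy(const char *s, struct uid_transition *out)
{
	if (parse_uid(&s, &out->from) != 0 || *s != ':')
		return -1;
	s++;
	if (parse_uid(&s, &out->to) != 0 || *s != '\0')
		return -1;
	return 0;
}

/*
 * Model of the check: a UID with no entries is not restricted here (the
 * normal capability rules still apply); a UID with at least one entry may
 * only switch to the targets listed for it.
 */
bool transition_allowed(const struct uid_transition *wl, size_t n,
			unsigned int from, unsigned int to)
{
	bool restricted = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (wl[i].from != from)
			continue;
		if (wl[i].to == to)
			return true;
		restricted = true;
	}
	return !restricted;
}
```

With entries for 123:456 and 123:789, uid 123 could switch to 456 or 789
but not to 0, while a uid with no entries is unaffected by the check.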

>
> > Even just sticking with the network manager / VPN
> > example, strongSwan VPN also uses the same drop-to-user-through-setuid
> > setup, as do other Linux applications.
>
> Same solution.
>
> > Refactoring these applications
> > to fork/exec setuid binaries instead of simply calling setuid is often
> > infeasible. So a direct call to setuid is often necessary/expected,
> > and setuid bits don't help here.
>
> What is it with kids these days, that they are so
> afraid of fixing code that needs fixing? But that's
> not necessary in this example.
>
> > Also, use of setuid bits precludes the use of the no_new_privs bit,
> > which is usually at least a nice-to-have (if not need-to-have) for
> > sandboxed processes on the system.
>
> But you've already said that you *want* to change the security state,
> "drop to a lesser-privileged uid", so you're already mucking with the
> sandbox. If you're going to say that changing UIDs doesn't count for
> sandboxing I'll point out that you brought up the notion of a
> lesser-privileged UID.

There are plenty of ways that non-root processes further restrict
especially vulnerable parts of their code to even lesser-privileged
contexts. But it's often easier to reason about the security of such
applications if the no_new_privs bit is set and file capabilities are
avoided, so the application can have full control of which privileges
are given to spawned processes without having to worry about which
privileges are attached to which files. Granted, the no_new_privs
issue is less central to the LSM being proposed here compared to the
discussion above.

>


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-10-31 15:28 [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
  2018-10-31 21:02 ` Serge E. Hallyn
@ 2018-11-02 18:07 ` Stephen Smalley
  2018-11-02 19:13   ` Micah Morton
  2018-11-19 18:54   ` [PATCH] [PATCH] LSM: generalize flag passing to security_capable mortonm
  1 sibling, 2 replies; 88+ messages in thread
From: Stephen Smalley @ 2018-11-02 18:07 UTC (permalink / raw)
  To: mortonm, jmorris, serge, keescook, linux-security-module

On 10/31/18 11:28 AM, mortonm@chromium.org wrote:
> From: Micah Morton <mortonm@chromium.org>
> 
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> 
> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> code that likely needs improvement before being an acceptable approach.
> I'm specifically interested to see if there are better ideas for how
> this could be done.

If it were me, I'd modify the callers of ns_capable(..., CAP_SETUID) in 
some manner to let you distinguish rather than trying to test the 
current syscall within the capable hook.  Modify the set*id system calls 
to use a variant interface that passes flags or something; there is 
already precedent for the _noaudit case but it isn't general.  More 
generally, extending ns_capable() and friends to take a variety of 
additional inputs would be useful, e.g. to allow one to pass down the 
inode for CAP_DAC_OVERRIDE/READ_SEARCH checks so that one could 
authorize it for specific files rather than all or nothing. This is 
already partly done via capable_wrt_inode_uidgid() but the inode isn't 
propagated down to ns_capable() and thus cannot be passed down to the 
security hook currently.
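
A rough userspace model of the flag-passing idea, purely as a sketch: the
caller tells the hook whether the CAP_SETUID check originates from a set*id
syscall, so the hook no longer has to inspect the current syscall. None of
these names are existing kernel interfaces; the struct, the flag, and the
function are all made up for illustration (only the value 7 mirrors the
kernel's CAP_SETUID number).

```c
#include <stdbool.h>

#define MODEL_CAP_SETUID	7	/* same number as the kernel's CAP_SETUID */
#define MODEL_OPT_INSETID	0x1u	/* hypothetical flag: check comes from set*id */

/* Hypothetical stand-in for the caller's credentials. */
struct cred_model {
	unsigned int uid;
	bool restricted;	/* true if a whitelist policy exists for uid */
};

/*
 * Model of a capable() hook that receives flags from its caller.
 * Returns 0 to allow and -1 to deny (a kernel hook would return -EPERM).
 */
int capable_hook_model(const struct cred_model *cred, int cap,
		       unsigned int opts)
{
	/* Only CAP_SETUID checks for restricted UIDs are of interest. */
	if (cap != MODEL_CAP_SETUID || !cred->restricted)
		return 0;

	/*
	 * From a set*id syscall: allow here and leave the decision about
	 * the target UID to a setuid-specific hook.
	 */
	if (opts & MODEL_OPT_INSETID)
		return 0;

	/* Any other use of CAP_SETUID (e.g. writing UID mappings): deny. */
	return -1;
}
```

The point of the sketch is only that the distinction is made by the caller
passing a flag, rather than by the hook guessing which syscall is running.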

> 
>   Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
>   Documentation/admin-guide/LSM/index.rst     |   1 +
>   arch/Kconfig                                |   5 +
>   arch/arm/Kconfig                            |   1 +
>   arch/arm64/Kconfig                          |   1 +
>   arch/x86/Kconfig                            |   1 +
>   security/Kconfig                            |   1 +
>   security/Makefile                           |   2 +
>   security/safesetid/Kconfig                  |  13 +
>   security/safesetid/Makefile                 |   7 +
>   security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
>   security/safesetid/lsm.h                    |  30 ++
>   security/safesetid/securityfs.c             | 189 +++++++++++
>   13 files changed, 679 insertions(+)
>   create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>   create mode 100644 security/safesetid/Kconfig
>   create mode 100644 security/safesetid/Makefile
>   create mode 100644 security/safesetid/lsm.c
>   create mode 100644 security/safesetid/lsm.h
>   create mode 100644 security/safesetid/securityfs.c
> 
> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> new file mode 100644
> index 000000000000..e7d072124424
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> @@ -0,0 +1,94 @@
> +=========
> +SafeSetID
> +=========
> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> +UID/GID transitions from a given UID/GID to only those approved by a
> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> +allowing a user to set up user namespace UID mappings.
> +
> +
> +Background
> +==========
> +In absence of file capabilities, processes spawned on a Linux system that need
> +to switch to a different user must be spawned with CAP_SETUID privileges.
> +CAP_SETUID is granted to programs running as root or those running as a non-root
> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> +often preferable to use Linux runtime capabilities rather than file
> +capabilities, since using file capabilities to run a program with elevated
> +privileges opens up possible security holes since any user with access to the
> +file can exec() that program to gain the elevated privileges.
> +
> +While it is possible to implement a tree of processes by giving full
> +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> +tree of processes under non-root user(s) in the first place. Specifically,
> +since CAP_SETUID allows changing to any user on the system, including the root
> +user, it is an overpowered capability for what is needed in this scenario,
> +especially since programs often only call setuid() to drop privileges to a
> +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> +generally feasible way in Linux to restrict the potential UIDs that a user can
> +switch to through setuid() beyond allowing a switch to any user on the system.
> +This SafeSetID LSM seeks to provide a solution for restricting setid
> +capabilities in such a way.
> +
> +
> +Other Approaches Considered
> +===========================
> +
> +Solve this problem in userspace
> +-------------------------------
> +For candidate applications that would like to have restricted setid capabilities
> +as implemented in this LSM, an alternative option would be to simply take away
> +setid capabilities from the application completely and refactor the process
> +spawning semantics in the application (e.g. by using a privileged helper program
> +to do process spawning and UID/GID transitions). Unfortunately, there are a
> +number of semantics around process spawning that would be affected by this, such
> +as fork() calls where the program doesn’t immediately call exec() after the
> +fork(), parent processes specifying custom environment variables or command line
> +args for spawned child processes, or inheritance of file handles across a
> +fork()/exec(). Because of this, a solution that uses a privileged helper in
> +userspace would likely be less appealing to incorporate into existing projects
> +that rely on certain process-spawning semantics in Linux.
> +
> +Use user namespaces
> +-------------------
> +Another possible approach would be to run a given process tree in its own user
> +namespace and give programs in the tree setid capabilities. In this way,
> +programs in the tree could change to any desired UID/GID in the context of their
> +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> +initial system user namespace, effectively preventing privilege escalation.
> +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> +without pairing them with other namespace types, which is not always an option.
> +Linux checks for capabilities based off of the user namespace that “owns” some
> +entity. For example, Linux has the notion that network namespaces are owned by
> +the user namespace in which they were created. A consequence of this is that
> +capability checks for access to a given network namespace are done by checking
> +whether a task has the given capability in the context of the user namespace
> +that owns the network namespace -- not necessarily the user namespace under
> +which the given task runs. Therefore spawning a process in a new user namespace
> +effectively prevents it from accessing the network namespace owned by the
> +initial namespace. This is a deal-breaker for any application that expects to
> +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> +configurations. Using user namespaces in isolation causes problems regarding
> +other system interactions, including use of pid namespaces and device creation.
> +
> +Use an existing LSM
> +-------------------
> +None of the other in-tree LSMs have the capability to gate setid transitions, or
> +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> +"Since setuid only affects the current process, and since the SELinux controls
> +are not based on the Linux identity attributes, SELinux does not need to control
> +this operation."
> +
> +
> +Directions for use
> +==================
> +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> +applicable restriction policy is in place. Policies are configured through
> +securityfs by writing to the safesetid/add_whitelist_policy and
> +safesetid/flush_whitelist_policies files at the location where securityfs is
> +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> +numbers, such as '123:456'. To flush the policies, any write to the file is
> +sufficient. Again, configuring a policy for a UID will prevent that UID from
> +obtaining auxiliary setid privileges, such as allowing a user to set up user
> +namespace UID mappings.
> diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> index c980dfe9abf1..a0c387649e12 100644
> --- a/Documentation/admin-guide/LSM/index.rst
> +++ b/Documentation/admin-guide/LSM/index.rst
> @@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
>      Smack
>      tomoyo
>      Yama
> +   SafeSetID
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 1aa59063f1fd..c87070807ba2 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -381,6 +381,11 @@ config ARCH_WANT_OLD_COMPAT_IPC
>   	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
>   	bool
>   
> +config HAVE_SAFESETID
> +	bool
> +	help
> +	  This option enables the SafeSetID LSM.
> +
>   config HAVE_ARCH_SECCOMP_FILTER
>   	bool
>   	help
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 843edfd000be..35b1a772c971 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -92,6 +92,7 @@ config ARM
>   	select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
>   	select HAVE_REGS_AND_STACK_ACCESS_API
>   	select HAVE_RSEQ
> +	select HAVE_SAFESETID
>   	select HAVE_STACKPROTECTOR
>   	select HAVE_SYSCALL_TRACEPOINTS
>   	select HAVE_UID16
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 42c090cf0292..2c6f5ec3a55e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -127,6 +127,7 @@ config ARM64
>   	select HAVE_PERF_USER_STACK_DUMP
>   	select HAVE_REGS_AND_STACK_ACCESS_API
>   	select HAVE_RCU_TABLE_FREE
> +	select HAVE_SAFESETID
>   	select HAVE_STACKPROTECTOR
>   	select HAVE_SYSCALL_TRACEPOINTS
>   	select HAVE_KPROBES
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 887d3a7bb646..a6527d6c0426 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -27,6 +27,7 @@ config X86_64
>   	select ARCH_SUPPORTS_INT128
>   	select ARCH_USE_CMPXCHG_LOCKREF
>   	select HAVE_ARCH_SOFT_DIRTY
> +	select HAVE_SAFESETID
>   	select MODULES_USE_ELF_RELA
>   	select NEED_DMA_MAP_STATE
>   	select SWIOTLB
> diff --git a/security/Kconfig b/security/Kconfig
> index c4302067a3ad..7d9008ad5903 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -237,6 +237,7 @@ source security/tomoyo/Kconfig
>   source security/apparmor/Kconfig
>   source security/loadpin/Kconfig
>   source security/yama/Kconfig
> +source security/safesetid/Kconfig
>   
>   source security/integrity/Kconfig
>   
> diff --git a/security/Makefile b/security/Makefile
> index 4d2d3782ddef..88209d827832 100644
> --- a/security/Makefile
> +++ b/security/Makefile
> @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
>   subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
>   subdir-$(CONFIG_SECURITY_YAMA)		+= yama
>   subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
> +subdir-$(CONFIG_SECURITY_SAFESETID)	+= safesetid
>   
>   # always enable default capabilities
>   obj-y					+= commoncap.o
> @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
>   obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
>   obj-$(CONFIG_SECURITY_YAMA)		+= yama/
>   obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
> +obj-$(CONFIG_SECURITY_SAFESETID)	+= safesetid/
>   obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
>   
>   # Object integrity file lists
> diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> new file mode 100644
> index 000000000000..4ff82c7ed273
> --- /dev/null
> +++ b/security/safesetid/Kconfig
> @@ -0,0 +1,13 @@
> +config SECURITY_SAFESETID
> +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> +        depends on HAVE_SAFESETID
> +        default n
> +        help
> +          SafeSetID is an LSM module that gates the setid family of syscalls to
> +          restrict UID/GID transitions from a given UID/GID to only those
> +          approved by a system-wide whitelist. These restrictions also prohibit
> +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> +          UID mappings.
> +
> +          If you are unsure how to answer this question, answer N.
> diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> new file mode 100644
> index 000000000000..6b0660321164
> --- /dev/null
> +++ b/security/safesetid/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the safesetid LSM.
> +#
> +
> +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> +safesetid-y := lsm.o securityfs.o
> diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> new file mode 100644
> index 000000000000..e30ff06d8e07
> --- /dev/null
> +++ b/security/safesetid/lsm.c
> @@ -0,0 +1,334 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +
> +#define pr_fmt(fmt) "SafeSetID: " fmt
> +
> +#include <asm/syscall.h>
> +#include <linux/hashtable.h>
> +#include <linux/lsm_hooks.h>
> +#include <linux/module.h>
> +#include <linux/ptrace.h>
> +#include <linux/sched/task_stack.h>
> +#include <linux/security.h>
> +
> +#define NUM_BITS 8 /* 256 buckets in hash table */
> +
> +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> +
> +/*
> + * Hash table entry to store safesetid policy signifying that 'parent' user
> + * can setid to 'child' user.
> + */
> +struct entry {
> +	struct hlist_node next;
> +	struct hlist_node dlist; /* for deletion cleanup */
> +	uint64_t parent_kuid;
> +	uint64_t child_kuid;
> +};
> +
> +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> +
> +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> +{
> +	struct entry *entry;
> +
> +	rcu_read_lock();
> +	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +				   entry, next, __kuid_val(parent)) {
> +		if (entry->parent_kuid == __kuid_val(parent)) {
> +			rcu_read_unlock();
> +			return true;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> +
> +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> +						    kuid_t child)
> +{
> +	struct entry *entry;
> +
> +	rcu_read_lock();
> +	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +				   entry, next, __kuid_val(parent)) {
> +		if (entry->parent_kuid == __kuid_val(parent) &&
> +		    entry->child_kuid == __kuid_val(child)) {
> +			rcu_read_unlock();
> +			return true;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> +
> +/*
> + * TODO: Figuring out whether the current syscall number (saved on the kernel
> + * stack) is one of the set*uid syscalls is an operation that requires checking
> + * the number against arch-specific constants as seen below. The need for this
> + * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
> + * implement an arch-specific function that gets called from this file and
> + * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
> + * selected for architectures that implement the function? Any other ideas?
> + */
> +static bool setuid_syscall(int num)
> +{
> +#ifdef CONFIG_X86_64
> +#ifdef CONFIG_COMPAT
> +	if (!(num == __NR_setreuid ||
> +	      num == __NR_setuid ||
> +	      num == __NR_setresuid ||
> +	      num == __NR_setfsuid ||
> +	      num == __NR_ia32_setreuid32 ||
> +	      num == __NR_ia32_setuid ||
> +	      num == __NR_ia32_setresuid ||
> +	      num == __NR_ia32_setresuid32 ||
> +	      num == __NR_ia32_setuid32))
> +		return false;
> +#else
> +	if (!(num == __NR_setreuid ||
> +	      num == __NR_setuid ||
> +	      num == __NR_setresuid ||
> +	      num == __NR_setfsuid))
> +		return false;
> +#endif /* CONFIG_COMPAT */
> +#elif defined CONFIG_ARM64
> +#ifdef CONFIG_COMPAT
> +	if (!(num == __NR_setuid ||
> +	      num == __NR_setreuid ||
> +	      num == __NR_setfsuid ||
> +	      num == __NR_setresuid ||
> +	      num == __NR_setreuid32 ||
> +	      num == __NR_setresuid32 ||
> +	      num == __NR_setuid32 ||
> +	      num == __NR_setfsuid32 ||
> +	      num == __NR_compat_setuid ||
> +	      num == __NR_compat_setreuid ||
> +	      num == __NR_compat_setfsuid ||
> +	      num == __NR_compat_setresuid ||
> +	      num == __NR_compat_setreuid32 ||
> +	      num == __NR_compat_setresuid32 ||
> +	      num == __NR_compat_setuid32 ||
> +	      num == __NR_compat_setfsuid32))
> +		return false;
> +#else
> +	if (!(num == __NR_setuid ||
> +	      num == __NR_setreuid ||
> +	      num == __NR_setfsuid ||
> +	      num == __NR_setresuid))
> +		return false;
> +#endif /* CONFIG_COMPAT */
> +#elif defined CONFIG_ARM
> +	if (!(num == __NR_setreuid32 ||
> +	      num == __NR_setuid32 ||
> +	      num == __NR_setresuid32 ||
> +	      num == __NR_setfsuid32))
> +		return false;
> +#else
> +	BUILD_BUG();
> +#endif
> +	return true;
> +}
> +
> +static int safesetid_security_capable(const struct cred *cred,
> +				      struct user_namespace *ns,
> +				      int cap,
> +				      int audit)
> +{
> +	/* The current->mm check will fail if this is a kernel thread. */
> +	if (cap == CAP_SETUID &&
> +	    current->mm &&
> +	    check_setuid_policy_hashtable_key(cred->uid)) {
> +		/*
> +		 * syscall_get_nr can theoretically return 0 or -1, but that
> +		 * would signify that the syscall is being aborted due to a
> +		 * signal, so we don't need to check for this case here.
> +		 */
> +		if (!(setuid_syscall(syscall_get_nr(current,
> +						    current_pt_regs()))))
> +			/*
> +			 * Deny if we're not in a set*uid() syscall to avoid
> +			 * giving powers gated by CAP_SETUID that are related
> +			 * to functionality other than calling set*uid() (e.g.
> +			 * allowing user to set up userns uid mappings).
> +			 */
> +			return -EPERM;
> +	}
> +	return 0;
> +}
> +
> +static void setuid_policy_warning(kuid_t parent, kuid_t child)
> +{
> +	pr_warn("UID transition (%u -> %u) blocked\n",
> +		__kuid_val(parent),
> +		__kuid_val(child));
> +}
> +
> +static int check_uid_transition(kuid_t parent, kuid_t child)
> +{
> +	if (check_setuid_policy_hashtable_key_value(parent, child))
> +		return 0;
> +	setuid_policy_warning(parent, child);
> +	return -EPERM;
> +}
> +
> +/*
> + * Check whether there is either an exception for user under old cred struct to
> + * set*uid to user under new cred struct, or the UID transition is allowed (by
> + * Linux set*uid rules) even without CAP_SETUID.
> + */
> +static int safesetid_task_fix_setuid(struct cred *new,
> +				     const struct cred *old,
> +				     int flags)
> +{
> +
> +	/* Do nothing if there are no setuid restrictions for this UID. */
> +	if (!check_setuid_policy_hashtable_key(old->uid))
> +		return 0;
> +
> +	switch (flags) {
> +	case LSM_SETID_RE:
> +		/*
> +		 * Users for which setuid restrictions exist can only set the
> +		 * real UID to the real UID or the effective UID, unless an
> +		 * explicit whitelist policy allows the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->uid) &&
> +			!uid_eq(old->euid, new->uid)) {
> +			return check_uid_transition(old->uid, new->uid);
> +		}
> +		/*
> +		 * Users for which setuid restrictions exist can only set the
> +		 * effective UID to the real UID, the effective UID, or the
> +		 * saved set-UID, unless an explicit whitelist policy allows
> +		 * the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->euid) &&
> +			!uid_eq(old->euid, new->euid) &&
> +			!uid_eq(old->suid, new->euid)) {
> +			return check_uid_transition(old->euid, new->euid);
> +		}
> +		break;
> +	case LSM_SETID_ID:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * real UID or saved set-UID unless an explicit whitelist
> +		 * policy allows the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->uid))
> +			return check_uid_transition(old->uid, new->uid);
> +		if (!uid_eq(old->suid, new->suid))
> +			return check_uid_transition(old->suid, new->suid);
> +		break;
> +	case LSM_SETID_RES:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * real UID, effective UID, or saved set-UID to anything but
> +		 * one of: the current real UID, the current effective UID or
> +		 * the current saved set-user-ID unless an explicit whitelist
> +		 * policy allows the transition.
> +		 */
> +		if (!uid_eq(new->uid, old->uid) &&
> +			!uid_eq(new->uid, old->euid) &&
> +			!uid_eq(new->uid, old->suid)) {
> +			return check_uid_transition(old->uid, new->uid);
> +		}
> +		if (!uid_eq(new->euid, old->uid) &&
> +			!uid_eq(new->euid, old->euid) &&
> +			!uid_eq(new->euid, old->suid)) {
> +			return check_uid_transition(old->euid, new->euid);
> +		}
> +		if (!uid_eq(new->suid, old->uid) &&
> +			!uid_eq(new->suid, old->euid) &&
> +			!uid_eq(new->suid, old->suid)) {
> +			return check_uid_transition(old->suid, new->suid);
> +		}
> +		break;
> +	case LSM_SETID_FS:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * filesystem UID to anything but one of: the current real UID,
> +		 * the current effective UID or the current saved set-UID
> +		 * unless an explicit whitelist policy allows the transition.
> +		 */
> +		if (!uid_eq(new->fsuid, old->uid)  &&
> +			!uid_eq(new->fsuid, old->euid)  &&
> +			!uid_eq(new->fsuid, old->suid) &&
> +			!uid_eq(new->fsuid, old->fsuid)) {
> +			return check_uid_transition(old->fsuid, new->fsuid);
> +		}
> +		break;
> +	}
> +	return 0;
> +}
> +
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> +{
> +	struct entry *new;
> +
> +	/* Return if entry already exists */
> +	if (check_setuid_policy_hashtable_key_value(parent, child))
> +		return 0;
> +
> +	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> +	if (!new)
> +		return -ENOMEM;
> +	new->parent_kuid = __kuid_val(parent);
> +	new->child_kuid = __kuid_val(child);
> +	spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +	hash_add_rcu(safesetid_whitelist_hashtable,
> +		     &new->next,
> +		     __kuid_val(parent));
> +	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +	return 0;
> +}
> +
> +void flush_safesetid_whitelist_entries(void)
> +{
> +	struct entry *entry;
> +	struct hlist_node *hlist_node;
> +	unsigned int bkt_loop_cursor;
> +	HLIST_HEAD(free_list);
> +
> +	/*
> +	 * Could probably use hash_for_each_rcu here instead, but this should
> +	 * be fine as well.
> +	 */
> +	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> +			   hlist_node, entry, next) {
> +		spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +		hash_del_rcu(&entry->next);
> +		spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +		hlist_add_head(&entry->dlist, &free_list);
> +	}
> +	synchronize_rcu();
> +	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist)
> +		kfree(entry);
> +}
> +
> +static struct security_hook_list safesetid_security_hooks[] = {
> +	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> +	LSM_HOOK_INIT(capable, safesetid_security_capable)
> +};
> +
> +static int __init safesetid_security_init(void)
> +{
> +	security_add_hooks(safesetid_security_hooks,
> +			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> +
> +	return 0;
> +}
> +security_initcall(safesetid_security_init);
> diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> new file mode 100644
> index 000000000000..bf78af9bf314
> --- /dev/null
> +++ b/security/safesetid/lsm.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#ifndef _SAFESETID_H
> +#define _SAFESETID_H
> +
> +#include <linux/types.h>
> +
> +/* Function type. */
> +enum safesetid_whitelist_file_write_type {
> +	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> +	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> +};
> +
> +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> +
> +void flush_safesetid_whitelist_entries(void);
> +
> +#endif /* _SAFESETID_H */
> diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> new file mode 100644
> index 000000000000..ff5fcf2c1b37
> --- /dev/null
> +++ b/security/safesetid/securityfs.c
> @@ -0,0 +1,189 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include <linux/security.h>
> +#include <linux/cred.h>
> +
> +#include "lsm.h"
> +
> +static struct dentry *safesetid_policy_dir;
> +
> +struct safesetid_file_entry {
> +	const char *name;
> +	enum safesetid_whitelist_file_write_type type;
> +	struct dentry *dentry;
> +};
> +
> +static struct safesetid_file_entry safesetid_files[] = {
> +	{.name = "add_whitelist_policy",
> +	 .type = SAFESETID_WHITELIST_ADD},
> +	{.name = "flush_whitelist_policies",
> +	 .type = SAFESETID_WHITELIST_FLUSH},
> +};
> +
> +/*
> + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> + * variables pointed to by 'parent' and 'child' will get updated but this
> + * function will return an error.
> + */
> +static int parse_safesetid_whitelist_policy(const char __user *buf,
> +					    size_t len,
> +					    kuid_t *parent,
> +					    kuid_t *child)
> +{
> +	char *kern_buf;
> +	char *parent_buf;
> +	char *child_buf;
> +	const char separator[] = ":";
> +	int ret;
> +	size_t first_substring_length;
> +	long parsed_parent;
> +	long parsed_child;
> +
> +	/* Duplicate string from user memory and NUL-terminate */
> +	kern_buf = memdup_user_nul(buf, len);
> +	if (IS_ERR(kern_buf))
> +		return PTR_ERR(kern_buf);
> +
> +	/*
> +	 * Format of |buf| string should be <UID>:<UID>.
> +	 * Find location of ":" in kern_buf (copied from |buf|).
> +	 */
> +	first_substring_length = strcspn(kern_buf, separator);
> +	if (first_substring_length == 0 || first_substring_length == len) {
> +		ret = -EINVAL;
> +		goto free_kern;
> +	}
> +
> +	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> +	if (!parent_buf) {
> +		ret = -ENOMEM;
> +		goto free_kern;
> +	}
> +
> +	ret = kstrtol(parent_buf, 0, &parsed_parent);
> +	if (ret)
> +		goto free_both;
> +
> +	child_buf = kern_buf + first_substring_length + 1;
> +	ret = kstrtol(child_buf, 0, &parsed_child);
> +	if (ret)
> +		goto free_both;
> +
> +	*parent = make_kuid(current_user_ns(), parsed_parent);
> +	if (!uid_valid(*parent)) {
> +		ret = -EINVAL;
> +		goto free_both;
> +	}
> +
> +	*child = make_kuid(current_user_ns(), parsed_child);
> +	if (!uid_valid(*child)) {
> +		ret = -EINVAL;
> +		goto free_both;
> +	}
> +
> +free_both:
> +	kfree(parent_buf);
> +free_kern:
> +	kfree(kern_buf);
> +	return ret;
> +}
> +
> +static ssize_t safesetid_file_write(struct file *file,
> +				    const char __user *buf,
> +				    size_t len,
> +				    loff_t *ppos)
> +{
> +	struct safesetid_file_entry *file_entry =
> +		file->f_inode->i_private;
> +	kuid_t parent;
> +	kuid_t child;
> +	int ret;
> +
> +	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	if (*ppos != 0)
> +		return -EINVAL;
> +
> +	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
> +		flush_safesetid_whitelist_entries();
> +		return len;
> +	}
> +
> +	/*
> +	 * If we get to here, must be the case that file_entry->type equals
> +	 * SAFESETID_WHITELIST_ADD
> +	 */
> +	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> +							 &child);
> +	if (ret)
> +		return ret;
> +
> +	ret = add_safesetid_whitelist_entry(parent, child);
> +	if (ret)
> +		return ret;
> +
> +	/* Return len on success so caller won't keep trying to write */
> +	return len;
> +}
> +
> +static const struct file_operations safesetid_file_fops = {
> +	.write = safesetid_file_write,
> +};
> +
> +static void safesetid_shutdown_securityfs(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +		struct safesetid_file_entry *entry =
> +			&safesetid_files[i];
> +		securityfs_remove(entry->dentry);
> +		entry->dentry = NULL;
> +	}
> +
> +	securityfs_remove(safesetid_policy_dir);
> +	safesetid_policy_dir = NULL;
> +}
> +
> +static int __init safesetid_init_securityfs(void)
> +{
> +	int i;
> +	int ret;
> +
> +	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> +	if (IS_ERR(safesetid_policy_dir)) {
> +		ret = PTR_ERR(safesetid_policy_dir);
> +		goto error;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +		struct safesetid_file_entry *entry =
> +			&safesetid_files[i];
> +		entry->dentry = securityfs_create_file(
> +			entry->name, 0200, safesetid_policy_dir,
> +			entry, &safesetid_file_fops);
> +		if (IS_ERR(entry->dentry)) {
> +			ret = PTR_ERR(entry->dentry);
> +			goto error;
> +		}
> +	}
> +
> +	return 0;
> +
> +error:
> +	safesetid_shutdown_securityfs();
> +	return ret;
> +}
> +fs_initcall(safesetid_init_securityfs);
> 
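
For anyone who wants to poke at the '<UID>:<UID>' policy format outside the
kernel, here is a minimal userspace sketch of the parsing step. It is not the
patch's code: parse_policy() and parse_uid() are hypothetical names, strtoul()
stands in for kstrtol(), and both the rejection of 4294967295 and the
tolerance for one trailing newline are assumptions made here (the patch relies
on make_kuid() and uid_valid() instead).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one UID from the n bytes starting at s. */
static int parse_uid(const char *s, size_t n, uint32_t *out)
{
	char tmp[32];
	char *end;
	unsigned long v;

	if (n == 0 || n >= sizeof(tmp))
		return -EINVAL;
	memcpy(tmp, s, n);
	tmp[n] = '\0';

	/* strtoul() would accept a sign or leading blanks; refuse them. */
	if (tmp[0] < '0' || tmp[0] > '9')
		return -EINVAL;

	errno = 0;
	v = strtoul(tmp, &end, 0);	/* base 0, like the patch's kstrtol() call */
	if (errno != 0 || *end != '\0' || v >= UINT32_MAX)
		return -EINVAL;		/* assumption: (uid_t)-1 is never valid */

	*out = (uint32_t)v;
	return 0;
}

/* Hypothetical name; returns 0 and fills parent/child, or -EINVAL. */
int parse_policy(const char *buf, uint32_t *parent, uint32_t *child)
{
	size_t len, split, child_len;
	int ret;

	assert(buf && parent && child);

	len = strlen(buf);
	split = strcspn(buf, ":");	/* offset of the first ':' */
	if (split == 0 || split == len)
		return -EINVAL;		/* empty parent or no separator */

	child_len = len - split - 1;
	/* Assumption: tolerate the single trailing newline that echo adds. */
	if (child_len > 0 && buf[len - 1] == '\n')
		child_len--;

	ret = parse_uid(buf, split, parent);
	if (ret)
		return ret;
	return parse_uid(buf + split + 1, child_len, child);
}
```

As in the patch, *parent may already have been written when the child half
turns out to be invalid, so callers should only trust the outputs when the
return value is 0.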



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 17:12             ` Micah Morton
@ 2018-11-02 18:19               ` Casey Schaufler
  2018-11-02 18:30                 ` Serge E. Hallyn
                                   ` (2 more replies)
  0 siblings, 3 replies; 88+ messages in thread
From: Casey Schaufler @ 2018-11-02 18:19 UTC (permalink / raw)
  To: Micah Morton
  Cc: serge, jmorris, Kees Cook, linux-security-module, Casey Schaufler

On 11/2/2018 10:12 AM, Micah Morton wrote:
> On Fri, Nov 2, 2018 at 9:05 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 11/1/2018 12:52 PM, Micah Morton wrote:
>>> On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> On 11/1/2018 9:11 AM, Micah Morton wrote:
>>>>> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>>>>>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
>>>>>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
>>>>>>>> From: Micah Morton <mortonm@chromium.org>
>>>>>>>>
>>>>>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
>>>>>>>> transitions from a given UID/GID to only those approved by a
>>>>>>>> system-wide whitelist. These restrictions also prohibit the given
>>>>>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
>>>>>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
>>>>>>>> mappings. For now, only gating the set*uid family of syscalls is
>>>>>>>> supported, with support for set*gid coming in a future patch set.
>>>>>>>>
>>>>>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
>>>>>>>> code that likely needs improvement before being an acceptable approach.
>>>>>>>> I'm specifically interested to see if there are better ideas for how
>>>>>>>> this could be done.
>>>>>>>>
>>>>>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
>>>>>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
>>>>>>>>  arch/Kconfig                                |   5 +
>>>>>>>>  arch/arm/Kconfig                            |   1 +
>>>>>>>>  arch/arm64/Kconfig                          |   1 +
>>>>>>>>  arch/x86/Kconfig                            |   1 +
>>>>>>>>  security/Kconfig                            |   1 +
>>>>>>>>  security/Makefile                           |   2 +
>>>>>>>>  security/safesetid/Kconfig                  |  13 +
>>>>>>>>  security/safesetid/Makefile                 |   7 +
>>>>>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
>>>>>>>>  security/safesetid/lsm.h                    |  30 ++
>>>>>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
>>>>>>>>  13 files changed, 679 insertions(+)
>>>>>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>>>>  create mode 100644 security/safesetid/Kconfig
>>>>>>>>  create mode 100644 security/safesetid/Makefile
>>>>>>>>  create mode 100644 security/safesetid/lsm.c
>>>>>>>>  create mode 100644 security/safesetid/lsm.h
>>>>>>>>  create mode 100644 security/safesetid/securityfs.c
>>>>>>>>
>>>>>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>>>> new file mode 100644
>>>>>>>> index 000000000000..e7d072124424
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
>>>>>>>> @@ -0,0 +1,94 @@
>>>>>>>> +=========
>>>>>>>> +SafeSetID
>>>>>>>> +=========
>>>>>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
>>>>>>>> +UID/GID transitions from a given UID/GID to only those approved by a
>>>>>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
>>>>>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
>>>>>>>> +allowing a user to set up user namespace UID mappings.
>>>>>>>> +
>>>>>>>> +
>>>>>>>> +Background
>>>>>>>> +==========
>>>>>>>> +In the absence of file capabilities, processes spawned on a Linux system that need
>>>>>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
>>>>>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
>>>>>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
>>>>>>>> +often preferable to use Linux runtime capabilities rather than file
>>>>>>>> +capabilities, since using file capabilities to run a program with elevated
>>>>>>>> +privileges opens up possible security holes since any user with access to the
>>>>>>>> +file can exec() that program to gain the elevated privileges.
>>>>>>> Not true, see inheritable capabilities.  You also might look at ambient
>>>>>>> capabilities.
>>>>>> So for example with pam_cap.so you could have your N uids each be given
>>>>>> the desired pI, and assign the corresponding fIs to the files they should
>>>>>> be able to exec with privilege.  No other uids will run those files with
>>>>>> privilege.  *1
>>>>> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
>>>>>
>>>>>> Can you give some more details about exactly how you see SafeSetID being
>>>>>> used?
>>>>> Sure. The main use case for this LSM is to allow a non-root program to
>>>>> transition to other untrusted uids without full blown CAP_SETUID
>>>>> capabilities. The non-root program would still need CAP_SETUID to do
>>>>> any kind of transition, but the additional restrictions imposed by
>>>>> this LSM would mean it is a "safer" version of CAP_SETUID since the
>>>>> non-root program cannot take advantage of CAP_SETUID to do any
>>>>> unapproved actions (i.e. setuid to uid 0 or create/enter new user
>>>>> namespace). The higher level goal is to allow for uid-based sandboxing
>>>>> of system services without having to give out CAP_SETUID all over the
>>>>> place just so that non-root programs can drop to
>>>>> even-further-non-privileged uids. This is especially relevant when one
>>>>> non-root daemon on the system should be allowed to spawn other
>>>>> processes as different uids, but it's undesirable to give the daemon a
>>>>> basically-root-equivalent CAP_SETUID.
>>>> I don't want to sound stupid(er than usual), but it sounds like
>>>> you could do all this using setuid bits prudently. Based on this
>>>> description, I don't see that anything new is needed.
>>> There are situations where setuid bits don't get the job done, as
>>> there are many situations where a program just wants to call setuid as
>>> part of its execution (or fork + setuid without exec), instead of
>>> fork/exec'ing a setuid binary.
>> Yes, I understand that.
>>
>>> Take the following scenario for
>>> example: init script (as root) spawns a network manager program as uid
>>> 1000
>> So far, so good.
>>
>>> and then the network manager spawns OpenVPN. The common mode of
>>> operation for OpenVPN is to start running as the uid it was spawned
>>> with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
>>> 2000) after initialization/setup by calling setuid.
>> OK. That's an operation that does and ought to require privilege.
> Sure, but the idea behind this LSM is that full CAP_SETUID
> capabilities are a lot more privilege than is necessary in this
> scenario.

I'll start by pointing out that CAP_SETUID is about the finest grained
capability there is. It's very precise in what it allows. I think that
your concern is about the worst case scenario, which is setting the
effective UID to 0, and hence gaining all privilege.


>>> This is something
>>> setuid bits wouldn't help with, without refactoring OpenVPN.
>> You're correct.
>>
>>> So one
>>> option here is to give the network manager CAP_SETUID, which will be
>>> inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
>>> CAP_SETUID (would probably require patching OpenVPN for the capability
>>> dropping).
>> Or, you put CAP_SETUID on the file capabilities for OpenVPN,
>> which is the way the P1003.1e DRAFT specification would have
>> you accomplish this. Unfortunately, with all the changes made
>> to capabilities for namespaces and all I'm not 100% sure I
>> could say exactly how to set that.
>>
>>> The problem here is that if the network manager itself is
>>> untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
>>> big security risk.
>> Right. That's why you set the file capabilities on OpenVPN.
> So it seems like you're suggesting that any time a program needs to
> switch user by calling setuid,

... in a way that requires CAP_SETUID ...

> that it should get full CAP_SETUID
> capabilities (whether that's through setting file capabilities on the
> binary or inheriting CAP_SETUID from a parent process or otherwise).

Yup. That's correct. With all the duties and responsibilities associated
with the dangers of UID management. Changing UIDs shouldn't be done
lightly and needs to be done carefully.

> But that brings us back to the basic problem this LSM is trying to
> solve. Namely, we don't want to sprinkle unrestricted CAP_SETUID privs
> all over the system for binaries that just want to switch to specific
> uid[s] and don't need any of the root-equivalent privileges provided
> by CAP_SETUID.

I would see marking a program with a list of UIDs it can run with or
that its children can run with as a better solution. You get much
better locality of reference that way.

>>> Even just sticking with the network manager / VPN
>>> example, strongSwan VPN also uses the same drop-to-user-through-setuid
>>> setup, as do other Linux applications.
>> Same solution.
>>
>>> Refactoring these applications
>>> to fork/exec setuid binaries instead of simply calling setuid is often
>>> infeasible. So a direct call to setuid is often necessary/expected,
>>> and setuid bits don't help here.
>> What is it with kids these days, that they are so
>> afraid of fixing code that needs fixing? But that's
>> not necessary in this example.
>>
>>> Also, use of setuid bits precludes the use of the no_new_privs bit,
>>> which is usually at least a nice-to-have (if not need-to-have) for
>>> sandboxed processes on the system.
>> But you've already said that you *want* to change the security state,
>> "drop to a lesser-privileged uid", so you're already mucking with the
>> sandbox. If you're going to say that changing UIDs doesn't count for
>> sandboxing I'll point out that you brought up the notion of a
>> lesser-privileged UID.
> There are plenty of ways that non-root processes further restrict
> especially vulnerable parts of their code to even lesser-privileged
> contexts. But it's often easier to reason about the security of such
> applications if the no_new_privs bit is set and file capabilities are
> avoided, so the application can have full control of which privileges
> are given to spawned processes without having to worry about which
> privileges are attached to which files. Granted, the no_new_privs
> issue is less central to the LSM being proposed here compared to the
> discussion above.

Let me suggest a change to the way your LSM works
that would reduce my concerns. Rather than refusing to
make a UID change that isn't on your whitelist, kill a
process that makes a prohibited request. This mitigates
the problem where a process doesn't check for an error
return. Sure, your system will be harder to get running
until your whitelist is complete, but you'll avoid a
whole category of security bugs.
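
To make the unchecked-return problem concrete, here is a minimal caller-side
sketch of what a well-behaved program would do when it drops to another UID.
It is an illustration only, not code from the patch or from OpenVPN:
drop_to_uid() is a hypothetical name, and the sketch uses plain setuid(), so
it can only re-check the real and effective UIDs (the saved UID is not
visible through getuid()/geteuid()).

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch to 'target' and confirm the switch took
 * effect. Returns 0 on success, -1 with errno set on failure.
 */
int drop_to_uid(uid_t target)
{
	/* (uid_t)-1 is the "invalid UID" sentinel and never a real target. */
	assert(target != (uid_t)-1);

	/* The step that buggy callers skip: check setuid()'s return value. */
	if (setuid(target) != 0)
		return -1;	/* errno was set by setuid() */

	/* Re-read the IDs instead of trusting the return value alone. */
	if (getuid() != target || geteuid() != target) {
		errno = EPERM;
		return -1;
	}
	return 0;
}
```

A daemon would treat a non-zero return as fatal and exit before doing any
further work. A program that ignores the return value keeps running with its
old UID after a denied transition, which is the case that killing the process
would guard against.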



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 18:19               ` Casey Schaufler
@ 2018-11-02 18:30                 ` Serge E. Hallyn
  2018-11-02 19:02                   ` Casey Schaufler
  2018-11-02 19:28                 ` [PATCH] " Micah Morton
  2018-11-06 19:09                 ` [PATCH v2] " mortonm
  2 siblings, 1 reply; 88+ messages in thread
From: Serge E. Hallyn @ 2018-11-02 18:30 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Micah Morton, serge, jmorris, Kees Cook, linux-security-module

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 11/2/2018 10:12 AM, Micah Morton wrote:
> > On Fri, Nov 2, 2018 at 9:05 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 11/1/2018 12:52 PM, Micah Morton wrote:
> >>> On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 11/1/2018 9:11 AM, Micah Morton wrote:
> >>>>> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >>>>>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> >>>>>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
> >>>>>>>> From: Micah Morton <mortonm@chromium.org>
> >>>>>>>>
> >>>>>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
> >>>>>>>> transitions from a given UID/GID to only those approved by a
> >>>>>>>> system-wide whitelist. These restrictions also prohibit the given
> >>>>>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
> >>>>>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> >>>>>>>> mappings. For now, only gating the set*uid family of syscalls is
> >>>>>>>> supported, with support for set*gid coming in a future patch set.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> >>>>>>>> code that likely needs improvement before being an acceptable approach.
> >>>>>>>> I'm specifically interested to see if there are better ideas for how
> >>>>>>>> this could be done.
> >>>>>>>>
> >>>>>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >>>>>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
> >>>>>>>>  arch/Kconfig                                |   5 +
> >>>>>>>>  arch/arm/Kconfig                            |   1 +
> >>>>>>>>  arch/arm64/Kconfig                          |   1 +
> >>>>>>>>  arch/x86/Kconfig                            |   1 +
> >>>>>>>>  security/Kconfig                            |   1 +
> >>>>>>>>  security/Makefile                           |   2 +
> >>>>>>>>  security/safesetid/Kconfig                  |  13 +
> >>>>>>>>  security/safesetid/Makefile                 |   7 +
> >>>>>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >>>>>>>>  security/safesetid/lsm.h                    |  30 ++
> >>>>>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
> >>>>>>>>  13 files changed, 679 insertions(+)
> >>>>>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>>  create mode 100644 security/safesetid/Kconfig
> >>>>>>>>  create mode 100644 security/safesetid/Makefile
> >>>>>>>>  create mode 100644 security/safesetid/lsm.c
> >>>>>>>>  create mode 100644 security/safesetid/lsm.h
> >>>>>>>>  create mode 100644 security/safesetid/securityfs.c
> >>>>>>>>
> >>>>>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>> new file mode 100644
> >>>>>>>> index 000000000000..e7d072124424
> >>>>>>>> --- /dev/null
> >>>>>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>> @@ -0,0 +1,94 @@
> >>>>>>>> +=========
> >>>>>>>> +SafeSetID
> >>>>>>>> +=========
> >>>>>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> >>>>>>>> +UID/GID transitions from a given UID/GID to only those approved by a
> >>>>>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> >>>>>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> >>>>>>>> +allowing a user to set up user namespace UID mappings.
> >>>>>>>> +
> >>>>>>>> +
> >>>>>>>> +Background
> >>>>>>>> +==========
> >>>>>>>> +In absence of file capabilities, processes spawned on a Linux system that need
> >>>>>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
> >>>>>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
> >>>>>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> >>>>>>>> +often preferable to use Linux runtime capabilities rather than file
> >>>>>>>> +capabilities, since using file capabilities to run a program with elevated
> >>>>>>>> +privileges opens up possible security holes since any user with access to the
> >>>>>>>> +file can exec() that program to gain the elevated privileges.
> >>>>>>> Not true, see inheritable capabilities.  You also might look at ambient
> >>>>>>> capabilities.
> >>>>>> So for example with pam_cap.so you could have your N uids each be given
> >>>>>> the desired pI, and assign the corresponding fIs to the files they should
> >>>>>> be able to exec with privilege.  No other uids will run those files with
> >>>>>> privilege.  *1
> >>>>> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
> >>>>>
> >>>>>> Can you give some more details about exactly how you see SafeSetID being
> >>>>>> used?
> >>>>> Sure. The main use case for this LSM is to allow a non-root program to
> >>>>> transition to other untrusted uids without full blown CAP_SETUID
> >>>>> capabilities. The non-root program would still need CAP_SETUID to do
> >>>>> any kind of transition, but the additional restrictions imposed by
> >>>>> this LSM would mean it is a "safer" version of CAP_SETUID since the
> >>>>> non-root program cannot take advantage of CAP_SETUID to do any
> >>>>> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> >>>>> namespace). The higher level goal is to allow for uid-based sandboxing
> >>>>> of system services without having to give out CAP_SETUID all over the
> >>>>> place just so that non-root programs can drop to
> >>>>> even-further-non-privileged uids. This is especially relevant when one
> >>>>> non-root daemon on the system should be allowed to spawn other
> >>>>> processes as different uids, but it's undesirable to give the daemon a
> >>>>> basically-root-equivalent CAP_SETUID.
> >>>> I don't want to sound stupid(er than usual), but it sounds like
> >>>> you could do all this using setuid bits prudently. Based on this
> >>>> description, I don't see that anything new is needed.
> >>> There are situations where setuid bits don't get the job done, as
> >>> there are many situations where a program just wants to call setuid as
> >>> part of its execution (or fork + setuid without exec), instead of
> >>> fork/exec'ing a setuid binary.
> >> Yes, I understand that.
> >>
> >>> Take the following scenario for
> >>> example: init script (as root) spawns a network manager program as uid
> >>> 1000
> >> So far, so good.
> >>
> >>> and then the network manager spawns OpenVPN. The common mode of
> >>> operation for OpenVPN is to start running as the uid it was spawned
> >>> with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
> >>> 2000) after initialization/setup by calling setuid.
> >> OK. That's an operation that does and ought to require privilege.
> > Sure, but the idea behind this LSM is that full CAP_SETUID
> > capabilities are a lot more privilege than is necessary in this
> > scenario.
> 
> I'll start by pointing out that CAP_SETUID is about the finest grained
> capability there is. It's very precise in what it allows. I think that
> your concern is about the worst case scenario, which is setting the
> effective UID to 0, and hence gaining all privilege.
> 
> 
> >>> This is something
> >>> setuid bits wouldn't help with, without refactoring OpenVPN.
> >> You're correct.
> >>
> >>> So one
> >>> option here is to give the network manager CAP_SETUID, which will be
> >>> inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
> >>> CAP_SETUID (would probably require patching OpenVPN for the capability
> >>> dropping).
> >> Or, you put CAP_SETUID on the file capabilities for OpenVPN,
> >> which is the way the P1003.1e DRAFT specification would have
> >> you accomplish this. Unfortunately, with all the changes made
> >> to capabilities for namespaces and all I'm not 100% sure I
> >> could say exactly how to set that.
> >>
> >>> The problem here is that if the network manager itself is
> >>> untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
> >>> big security risk.
> >> Right. That's why you set the file capabilities on OpenVPN.
> > So it seems like you're suggesting that any time a program needs to
> > switch user by calling setuid,
> 
> ... in a way that requires CAP_SETUID ...
> 
> > that it should get full CAP_SETUID
> > capabilities (whether that's through setting file capabilities on the
> > binary or inheriting CAP_SETUID from a parent process or otherwise).
> 
> Yup. That's correct. With all the duties and responsibilities associated
> with the dangers of UID management. Changing UIDs shouldn't be done
> lightly and needs to be done carefully.
> 
> > But that brings us back to the basic problem this LSM is trying to
> > solve. Namely, we don't want to sprinkle unrestricted CAP_SETUID privs
> > all over the system for binaries that just want to switch to specific
> > uid[s] and don't need any of the root-equivalent privileges provided
> > by CAP_SETUID.
> 
> I would see marking a program with a list of UIDs it can run with or
> that its children can run with as a better solution. You get much
> better locality of reference that way.
> 
> >>> Even just sticking with the network manager / VPN
> >>> example, strongSwan VPN also uses the same drop-to-user-through-setuid
> >>> setup, as do other Linux applications.
> >> Same solution.
> >>
> >>> Refactoring these applications
> >>> to fork/exec setuid binaries instead of simply calling setuid is often
> >>> infeasible. So a direct call to setuid is often necessary/expected,
> >>> and setuid bits don't help here.
> >> What is it with kids these days, that they are so
> >> afraid of fixing code that needs fixing? But that's
> >> not necessary in this example.
> >>
> >>> Also, use of setuid bits precludes the use of the no_new_privs bit,
> >>> which is usually at least a nice-to-have (if not need-to-have) for
> >>> sandboxed processes on the system.
> >> But you've already said that you *want* to change the security state,
> >> "drop to a lesser-privileged uid", so you're already mucking with the
> >> sandbox. If you're going to say that changing UIDs doesn't count for
> >> sandboxing I'll point out that you brought up the notion of a
> >> lesser-privileged UID.
> > There are plenty of ways that non-root processes further restrict
> > especially vulnerable parts of their code to even lesser-privileged
> > > contexts. But it's often easier to reason about the security of such
> > applications if the no_new_privs bit is set and file capabilities are
> > avoided, so the application can have full control of which privileges
> > are given to spawned processes without having to worry about which
> > privileges are attached to which files. Granted, the no_new_privs
> > issue is less central to the LSM being proposed here compared to the
> > discussion above.
> 
> Let me suggest a change to the way your LSM works
> that would reduce my concerns. Rather than refusing to
> make a UID change that isn't on your whitelist, kill a
> process that makes a prohibited request. This mitigates
> the problem where a process doesn't check for an error
> return. Sure, your system will be harder to get running
> until your whitelist is complete, but you'll avoid a
> whole category of security bugs.

Might also consider not restricting CAP_SETUID, but instead adding a
new CAP_SETUID_RANGE capability.  That way you can be sure there will be
no regressions with any programs which run with CAP_SETUID.

Though that violates what Casey was just arguing halfway up the email.

-serge


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 18:30                 ` Serge E. Hallyn
@ 2018-11-02 19:02                   ` Casey Schaufler
  2018-11-02 19:22                     ` Serge E. Hallyn
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2018-11-02 19:02 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Micah Morton, jmorris, Kees Cook, linux-security-module

On 11/2/2018 11:30 AM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey@schaufler-ca.com):
>
>> Let me suggest a change to the way your LSM works
>> that would reduce my concerns. Rather than refusing to
>> make a UID change that isn't on your whitelist, kill a
>> process that makes a prohibited request. This mitigates
>> the problem where a process doesn't check for an error
>> return. Sure, your system will be harder to get running
>> until your whitelist is complete, but you'll avoid a
>> whole category of security bugs.
> Might also consider not restricting CAP_SETUID, but instead adding a
> new CAP_SETUID_RANGE capability.  That way you can be sure there will be
> no regressions with any programs which run with CAP_SETUID.
>
> Though that violates what Casey was just arguing halfway up the email.

I know that it's hard to believe 20 years after the fact,
but the POSIX group worked very hard to ensure that the granularity
of capabilities was correct for the security policy that the
interfaces defined in P1003.1. What would CAP_SETUID_RANGE mean?



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 18:07 ` [PATCH] " Stephen Smalley
@ 2018-11-02 19:13   ` Micah Morton
  2018-11-19 18:54   ` [PATCH] [PATCH] LSM: generalize flag passing to security_capable mortonm
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-02 19:13 UTC (permalink / raw)
  To: sds; +Cc: jmorris, serge, Kees Cook, linux-security-module

On Fri, Nov 2, 2018 at 11:04 AM Stephen Smalley <sds@tycho.nsa.gov> wrote:
>
> On 10/31/18 11:28 AM, mortonm@chromium.org wrote:
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> >
> > NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> > code that likely needs improvement before being an acceptable approach.
> > I'm specifically interested to see if there are better ideas for how
> > this could be done.
>
> If it were me, I'd modify the callers of ns_capable(..., CAP_SETUID) in
> some manner to let you distinguish rather than trying to test the
> current syscall within the capable hook.  Modify the set*id system calls
> to use a variant interface that passes flags or something; there is
> already precedent for the _noaudit case but it isn't general.  More
> generally, extending ns_capable() and friends to take a variety of
> additional inputs would be useful, e.g. to allow one to pass down the
> inode for CAP_DAC_OVERRIDE/READ_SEARCH checks so that one could
> authorize it for specific files rather than all or nothing. This is
> already partly done via capable_wrt_inode_uidgid() but the inode isn't
> propagated down to ns_capable() and thus cannot be passed down to the
> security hook currently.

Yeah good point. There are only a few spots in the kernel that call
ns_capable(..., CAP_SETUID), so it would be pretty easy to annotate
all of them with a flag, similar to what is done with the LSM_SETID_*
constants for differentiating the set*uid calls in the
security_task_fix_setuid hook:
https://elixir.bootlin.com/linux/latest/source/include/linux/security.h#L126.
If we are going to add in changes to the kernel beyond registering LSM
hooks, this is probably the way to go.
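
To make the flag idea concrete, here is a rough userspace model of a capability hook that receives a caller-supplied flag instead of inspecting the current syscall number. This is a sketch under stated assumptions, not kernel code: the DEMO_* names, the demo_restricted table and demo_capable_hook() are all hypothetical, and the return value -1 merely stands in for -EPERM. Only the numeric value of CAP_SETUID (7) is taken from the real headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option bits a caller would pass down; not kernel API. */
#define DEMO_CAP_OPT_NONE    0x0u
#define DEMO_CAP_OPT_INSETID 0x1u /* check comes from a set*uid path */

/* CAP_SETUID has the value 7 in linux/capability.h. */
#define DEMO_CAP_SETUID 7

/* Hypothetical list of UIDs that have a whitelist policy configured. */
static const unsigned int demo_restricted[] = { 1000 };

static bool demo_uid_is_restricted(unsigned int uid)
{
	size_t i;

	for (i = 0; i < sizeof(demo_restricted) / sizeof(demo_restricted[0]); i++)
		if (demo_restricted[i] == uid)
			return true;
	return false;
}

/*
 * Model of the capable hook: returns 0 to allow, -1 to deny.
 * A restricted UID may use CAP_SETUID only when the caller flags the
 * check as coming from a set*uid path. The transition itself would be
 * validated separately (in the patch, by the task_fix_setuid hook).
 */
static int demo_capable_hook(unsigned int uid, int cap, unsigned int opts)
{
	if (cap != DEMO_CAP_SETUID)
		return 0;
	if (!demo_uid_is_restricted(uid))
		return 0;
	return (opts & DEMO_CAP_OPT_INSETID) ? 0 : -1;
}
```

With this shape the LSM needs no arch-specific syscall tables; each ns_capable(..., CAP_SETUID) call site states its own context.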

>
> >
> >   Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >   Documentation/admin-guide/LSM/index.rst     |   1 +
> >   arch/Kconfig                                |   5 +
> >   arch/arm/Kconfig                            |   1 +
> >   arch/arm64/Kconfig                          |   1 +
> >   arch/x86/Kconfig                            |   1 +
> >   security/Kconfig                            |   1 +
> >   security/Makefile                           |   2 +
> >   security/safesetid/Kconfig                  |  13 +
> >   security/safesetid/Makefile                 |   7 +
> >   security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >   security/safesetid/lsm.h                    |  30 ++
> >   security/safesetid/securityfs.c             | 189 +++++++++++
> >   13 files changed, 679 insertions(+)
> >   create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >   create mode 100644 security/safesetid/Kconfig
> >   create mode 100644 security/safesetid/Makefile
> >   create mode 100644 security/safesetid/lsm.c
> >   create mode 100644 security/safesetid/lsm.h
> >   create mode 100644 security/safesetid/securityfs.c
> >
> > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > new file mode 100644
> > index 000000000000..e7d072124424
> > --- /dev/null
> > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > @@ -0,0 +1,94 @@
> > +=========
> > +SafeSetID
> > +=========
> > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > +UID/GID transitions from a given UID/GID to only those approved by a
> > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > +allowing a user to set up user namespace UID mappings.
> > +
> > +
> > +Background
> > +==========
> > +In absence of file capabilities, processes spawned on a Linux system that need
> > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > +often preferable to use Linux runtime capabilities rather than file
> > +capabilities, since using file capabilities to run a program with elevated
> > +privileges opens up possible security holes since any user with access to the
> > +file can exec() that program to gain the elevated privileges.
> > +
> > +While it is possible to implement a tree of processes by giving full
> > +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> > +tree of processes under non-root user(s) in the first place. Specifically,
> > +since CAP_SETUID allows changing to any user on the system, including the root
> > +user, it is an overpowered capability for what is needed in this scenario,
> > +especially since programs often only call setuid() to drop privileges to a
> > +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> > +generally feasible way in Linux to restrict the potential UIDs that a user can
> > +switch to through setuid() beyond allowing a switch to any user on the system.
> > +This SafeSetID LSM seeks to provide a solution for restricting setid
> > +capabilities in such a way.
> > +
> > +
> > +Other Approaches Considered
> > +===========================
> > +
> > +Solve this problem in userspace
> > +-------------------------------
> > +For candidate applications that would like to have restricted setid capabilities
> > +as implemented in this LSM, an alternative option would be to simply take away
> > +setid capabilities from the application completely and refactor the process
> > +spawning semantics in the application (e.g. by using a privileged helper program
> > +to do process spawning and UID/GID transitions). Unfortunately, there are a
> > +number of semantics around process spawning that would be affected by this, such
> > +as fork() calls where the program doesn’t immediately call exec() after the
> > +fork(), parent processes specifying custom environment variables or command line
> > +args for spawned child processes, or inheritance of file handles across a
> > +fork()/exec(). Because of this, a solution that uses a privileged helper in
> > +userspace would likely be less appealing to incorporate into existing projects
> > +that rely on certain process-spawning semantics in Linux.
> > +
> > +Use user namespaces
> > +-------------------
> > +Another possible approach would be to run a given process tree in its own user
> > +namespace and give programs in the tree setid capabilities. In this way,
> > +programs in the tree could change to any desired UID/GID in the context of their
> > +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> > +initial system user namespace, effectively preventing privilege escalation.
> > +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> > +without pairing them with other namespace types, which is not always an option.
> > +Linux checks for capabilities based off of the user namespace that “owns” some
> > +entity. For example, Linux has the notion that network namespaces are owned by
> > +the user namespace in which they were created. A consequence of this is that
> > +capability checks for access to a given network namespace are done by checking
> > +whether a task has the given capability in the context of the user namespace
> > +that owns the network namespace -- not necessarily the user namespace under
> > +which the given task runs. Therefore spawning a process in a new user namespace
> > +effectively prevents it from accessing the network namespace owned by the
> > +initial namespace. This is a deal-breaker for any application that expects to
> > +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> > +configurations. Using user namespaces in isolation causes problems regarding
> > +other system interactions, including use of pid namespaces and device creation.
> > +
> > +Use an existing LSM
> > +-------------------
> > +None of the other in-tree LSMs have the capability to gate setid transitions, or
> > +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> > +"Since setuid only affects the current process, and since the SELinux controls
> > +are not based on the Linux identity attributes, SELinux does not need to control
> > +this operation."
> > +
> > +
> > +Directions for use
> > +==================
> > +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> > +applicable restriction policy is in place. Policies are configured through
> > +securityfs by writing to the safesetid/add_whitelist_policy and
> > +safesetid/flush_whitelist_policies files at the location where securityfs is
> > +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> > +numbers, such as '123:456'. To flush the policies, any write to the file is
> > +sufficient. Again, configuring a policy for a UID will prevent that UID from
> > +obtaining auxiliary setid privileges, such as allowing a user to set up user
> > +namespace UID mappings.
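
For illustration, the '<UID>:<UID>' policy format described above could be parsed along these lines. This is a userspace sketch only; demo_parse_policy() is a hypothetical name and the patch's securityfs.c may validate input differently (for instance, whether a trailing newline is accepted is an assumption made here).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser for a policy line such as "123:456".
 * On success stores both UIDs and returns 0; returns -1 on malformed
 * input. A single trailing newline is tolerated (an assumption).
 */
static int demo_parse_policy(const char *line, uint32_t *parent,
			     uint32_t *child)
{
	unsigned long long v;
	char *end;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (line[0] < '0' || line[0] > '9')
		return -1;
	errno = 0;
	v = strtoull(line, &end, 10);
	if (errno || *end != ':' || v > UINT32_MAX)
		return -1;
	*parent = (uint32_t)v;

	line = end + 1;
	if (line[0] < '0' || line[0] > '9')
		return -1;
	errno = 0;
	v = strtoull(line, &end, 10);
	if (errno || v > UINT32_MAX)
		return -1;
	/* Only end-of-string or one final newline may follow the child. */
	if (!(*end == '\0' || (*end == '\n' && end[1] == '\0')))
		return -1;
	*child = (uint32_t)v;

	return 0;
}
```

Rejecting anything other than two plain decimal numbers keeps the accepted grammar as narrow as the documentation states.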
> > diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> > index c980dfe9abf1..a0c387649e12 100644
> > --- a/Documentation/admin-guide/LSM/index.rst
> > +++ b/Documentation/admin-guide/LSM/index.rst
> > @@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
> >      Smack
> >      tomoyo
> >      Yama
> > +   SafeSetID
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index 1aa59063f1fd..c87070807ba2 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -381,6 +381,11 @@ config ARCH_WANT_OLD_COMPAT_IPC
> >       select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
> >       bool
> >
> > +config HAVE_SAFESETID
> > +     bool
> > +     help
> > +       This option enables the SafeSetID LSM.
> > +
> >   config HAVE_ARCH_SECCOMP_FILTER
> >       bool
> >       help
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 843edfd000be..35b1a772c971 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -92,6 +92,7 @@ config ARM
> >       select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
> >       select HAVE_REGS_AND_STACK_ACCESS_API
> >       select HAVE_RSEQ
> > +     select HAVE_SAFESETID
> >       select HAVE_STACKPROTECTOR
> >       select HAVE_SYSCALL_TRACEPOINTS
> >       select HAVE_UID16
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 42c090cf0292..2c6f5ec3a55e 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -127,6 +127,7 @@ config ARM64
> >       select HAVE_PERF_USER_STACK_DUMP
> >       select HAVE_REGS_AND_STACK_ACCESS_API
> >       select HAVE_RCU_TABLE_FREE
> > +     select HAVE_SAFESETID
> >       select HAVE_STACKPROTECTOR
> >       select HAVE_SYSCALL_TRACEPOINTS
> >       select HAVE_KPROBES
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 887d3a7bb646..a6527d6c0426 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -27,6 +27,7 @@ config X86_64
> >       select ARCH_SUPPORTS_INT128
> >       select ARCH_USE_CMPXCHG_LOCKREF
> >       select HAVE_ARCH_SOFT_DIRTY
> > +     select HAVE_SAFESETID
> >       select MODULES_USE_ELF_RELA
> >       select NEED_DMA_MAP_STATE
> >       select SWIOTLB
> > diff --git a/security/Kconfig b/security/Kconfig
> > index c4302067a3ad..7d9008ad5903 100644
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -237,6 +237,7 @@ source security/tomoyo/Kconfig
> >   source security/apparmor/Kconfig
> >   source security/loadpin/Kconfig
> >   source security/yama/Kconfig
> > +source security/safesetid/Kconfig
> >
> >   source security/integrity/Kconfig
> >
> > diff --git a/security/Makefile b/security/Makefile
> > index 4d2d3782ddef..88209d827832 100644
> > --- a/security/Makefile
> > +++ b/security/Makefile
> > @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
> >   subdir-$(CONFIG_SECURITY_APPARMOR)  += apparmor
> >   subdir-$(CONFIG_SECURITY_YAMA)              += yama
> >   subdir-$(CONFIG_SECURITY_LOADPIN)   += loadpin
> > +subdir-$(CONFIG_SECURITY_SAFESETID)  += safesetid
> >
> >   # always enable default capabilities
> >   obj-y                                       += commoncap.o
> > @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)               += tomoyo/
> >   obj-$(CONFIG_SECURITY_APPARMOR)             += apparmor/
> >   obj-$(CONFIG_SECURITY_YAMA)         += yama/
> >   obj-$(CONFIG_SECURITY_LOADPIN)              += loadpin/
> > +obj-$(CONFIG_SECURITY_SAFESETID)     += safesetid/
> >   obj-$(CONFIG_CGROUP_DEVICE)         += device_cgroup.o
> >
> >   # Object integrity file lists
> > diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> > new file mode 100644
> > index 000000000000..4ff82c7ed273
> > --- /dev/null
> > +++ b/security/safesetid/Kconfig
> > @@ -0,0 +1,13 @@
> > +config SECURITY_SAFESETID
> > +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> > +        depends on HAVE_SAFESETID
> > +        default n
> > +        help
> > +          SafeSetID is an LSM module that gates the setid family of syscalls to
> > +          restrict UID/GID transitions from a given UID/GID to only those
> > +          approved by a system-wide whitelist. These restrictions also prohibit
> > +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> > +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> > +          UID mappings.
> > +
> > +          If you are unsure how to answer this question, answer N.
> > diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> > new file mode 100644
> > index 000000000000..6b0660321164
> > --- /dev/null
> > +++ b/security/safesetid/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the safesetid LSM.
> > +#
> > +
> > +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> > +safesetid-y := lsm.o securityfs.o
> > diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> > new file mode 100644
> > index 000000000000..e30ff06d8e07
> > --- /dev/null
> > +++ b/security/safesetid/lsm.c
> > @@ -0,0 +1,334 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +
> > +#define pr_fmt(fmt) "SafeSetID: " fmt
> > +
> > +#include <asm/syscall.h>
> > +#include <linux/hashtable.h>
> > +#include <linux/lsm_hooks.h>
> > +#include <linux/module.h>
> > +#include <linux/ptrace.h>
> > +#include <linux/sched/task_stack.h>
> > +#include <linux/security.h>
> > +
> > +#define NUM_BITS 8 /* 256 buckets in hash table */
> > +
> > +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> > +
> > +/*
> > + * Hash table entry to store safesetid policy signifying that 'parent' user
> > + * can setid to 'child' user.
> > + */
> > +struct entry {
> > +     struct hlist_node next;
> > +     struct hlist_node dlist; /* for deletion cleanup */
> > +     uint64_t parent_kuid;
> > +     uint64_t child_kuid;
> > +};
> > +
> > +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> > +
> > +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> > +{
> > +     struct entry *entry;
> > +
> > +     rcu_read_lock();
> > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                entry, next, __kuid_val(parent)) {
> > +             if (entry->parent_kuid == __kuid_val(parent)) {
> > +                     rcu_read_unlock();
> > +                     return true;
> > +             }
> > +     }
> > +     rcu_read_unlock();
> > +
> > +     return false;
> > +}
> > +
> > +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> > +                                                 kuid_t child)
> > +{
> > +     struct entry *entry;
> > +
> > +     rcu_read_lock();
> > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                entry, next, __kuid_val(parent)) {
> > +             if (entry->parent_kuid == __kuid_val(parent) &&
> > +                 entry->child_kuid == __kuid_val(child)) {
> > +                     rcu_read_unlock();
> > +                     return true;
> > +             }
> > +     }
> > +     rcu_read_unlock();
> > +
> > +     return false;
> > +}
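
The two lookups above answer different questions: whether a parent UID has any policy at all, and whether a specific parent/child pair is listed. A plausible way they combine is sketched below as a plain userspace model. The demo_* names and the table contents are hypothetical, a linear scan stands in for the RCU hash table, and the rule that a UID with no policy is left unrestricted is an assumption about the hook logic, which is not shown in this part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_entry {
	uint32_t parent;
	uint32_t child;
};

/* Hypothetical whitelist: UID 1000 may switch to 2000 or 2001 only. */
static const struct demo_entry demo_whitelist[] = {
	{ 1000, 2000 },
	{ 1000, 2001 },
};

#define DEMO_N (sizeof(demo_whitelist) / sizeof(demo_whitelist[0]))

/* Key-only lookup: does 'parent' have any policy configured? */
static bool demo_has_policy(uint32_t parent)
{
	size_t i;

	for (i = 0; i < DEMO_N; i++)
		if (demo_whitelist[i].parent == parent)
			return true;
	return false;
}

/* Key/value lookup: is the exact parent -> child pair listed? */
static bool demo_pair_allowed(uint32_t parent, uint32_t child)
{
	size_t i;

	for (i = 0; i < DEMO_N; i++)
		if (demo_whitelist[i].parent == parent &&
		    demo_whitelist[i].child == child)
			return true;
	return false;
}

/*
 * Assumed decision rule: UIDs without a policy are not restricted by
 * this module; UIDs with a policy may only make listed transitions.
 */
static bool demo_transition_ok(uint32_t parent, uint32_t child)
{
	if (!demo_has_policy(parent))
		return true;
	return demo_pair_allowed(parent, child);
}
```

In this model an unlisted UID still faces the ordinary capability checks; the whitelist only narrows what a listed UID can do.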
> > +
> > +/*
> > + * TODO: Figuring out whether the current syscall number (saved on the kernel
> > + * stack) is one of the set*uid syscalls is an operation that requires checking
> > + * the number against arch-specific constants as seen below. The need for this
> > + * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
> > + * implement an arch-specific function that gets called from this file and
> > + * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
> > + * selected for architectures that implement the function? Any other ideas?
> > + */
> > +static bool setuid_syscall(int num)
> > +{
> > +#ifdef CONFIG_X86_64
> > +#ifdef CONFIG_COMPAT
> > +     if (!(num == __NR_setreuid ||
> > +           num == __NR_setuid ||
> > +           num == __NR_setresuid ||
> > +           num == __NR_setfsuid ||
> > +           num == __NR_ia32_setreuid32 ||
> > +           num == __NR_ia32_setuid ||
> > +           num == __NR_ia32_setresuid ||
> > +           num == __NR_ia32_setuid32))
> > +             return false;
> > +#else
> > +     if (!(num == __NR_setreuid ||
> > +           num == __NR_setuid ||
> > +           num == __NR_setresuid ||
> > +           num == __NR_setfsuid))
> > +             return false;
> > +#endif /* CONFIG_COMPAT */
> > +#elif defined CONFIG_ARM64
> > +#ifdef CONFIG_COMPAT
> > +     if (!(num == __NR_setuid ||
> > +           num == __NR_setreuid ||
> > +           num == __NR_setfsuid ||
> > +           num == __NR_setresuid ||
> > +           num == __NR_setreuid32 ||
> > +           num == __NR_setresuid32 ||
> > +           num == __NR_setuid32 ||
> > +           num == __NR_setfsuid32 ||
> > +           num == __NR_compat_setuid ||
> > +           num == __NR_compat_setreuid ||
> > +           num == __NR_compat_setfsuid ||
> > +           num == __NR_compat_setresuid ||
> > +           num == __NR_compat_setreuid32 ||
> > +           num == __NR_compat_setresuid32 ||
> > +           num == __NR_compat_setuid32 ||
> > +           num == __NR_compat_setfsuid32))
> > +             return false;
> > +#else
> > +     if (!(num == __NR_setuid ||
> > +           num == __NR_setreuid ||
> > +           num == __NR_setfsuid ||
> > +           num == __NR_setresuid))
> > +             return false;
> > +#endif /* CONFIG_COMPAT */
> > +#elif defined CONFIG_ARM
> > +     if (!(num == __NR_setreuid32 ||
> > +           num == __NR_setuid32 ||
> > +           num == __NR_setresuid32 ||
> > +           num == __NR_setfsuid32))
> > +             return false;
> > +#else
> > +     BUILD_BUG();
> > +#endif
> > +     return true;
> > +}
> > +
> > +static int safesetid_security_capable(const struct cred *cred,
> > +                                   struct user_namespace *ns,
> > +                                   int cap,
> > +                                   int audit)
> > +{
> > +     /* The current->mm check will fail if this is a kernel thread. */
> > +     if (cap == CAP_SETUID &&
> > +         current->mm &&
> > +         check_setuid_policy_hashtable_key(cred->uid)) {
> > +             /*
> > +              * syscall_get_nr can theoretically return 0 or -1, but that
> > +              * would signify that the syscall is being aborted due to a
> > +              * signal, so we don't need to check for this case here.
> > +              */
> > +             if (!(setuid_syscall(syscall_get_nr(current,
> > +                                                 current_pt_regs()))))
> > +                     /*
> > +                      * Deny if we're not in a set*uid() syscall to avoid
> > +                      * giving powers gated by CAP_SETUID that are related
> > +                      * to functionality other than calling set*uid() (e.g.
> > +                      * allowing user to set up userns uid mappings).
> > +                      */
> > +                     return -1;
> > +     }
> > +     return 0;
> > +}
> > +
> > +static void setuid_policy_warning(kuid_t parent, kuid_t child)
> > +{
> > +     pr_warn("UID transition (%u -> %u) blocked\n",
> > +             __kuid_val(parent),
> > +             __kuid_val(child));
> > +}
> > +
> > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > +{
> > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > +             return 0;
> > +     setuid_policy_warning(parent, child);
> > +     return -1;
> > +}
> > +
> > +/*
> > + * Check whether there is either an exception for user under old cred struct to
> > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > + * Linux set*uid rules) even without CAP_SETUID.
> > + */
> > +static int safesetid_task_fix_setuid(struct cred *new,
> > +                                  const struct cred *old,
> > +                                  int flags)
> > +{
> > +
> > +     /* Do nothing if there are no setuid restrictions for this UID. */
> > +     if (!check_setuid_policy_hashtable_key(old->uid))
> > +             return 0;
> > +
> > +     switch (flags) {
> > +     case LSM_SETID_RE:
> > +             /*
> > +              * Users for which setuid restrictions exist can only set the
> > +              * real UID to the real UID or the effective UID, unless an
> > +              * explicit whitelist policy allows the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->uid) &&
> > +                     !uid_eq(old->euid, new->uid)) {
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             }
> > +             /*
> > +              * Users for which setuid restrictions exist can only set the
> > +              * effective UID to the real UID, the effective UID, or the
> > +              * saved set-UID, unless an explicit whitelist policy allows
> > +              * the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->euid) &&
> > +                     !uid_eq(old->euid, new->euid) &&
> > +                     !uid_eq(old->suid, new->euid)) {
> > +                     return check_uid_transition(old->euid, new->euid);
> > +             }
> > +             break;
> > +     case LSM_SETID_ID:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * real UID or saved set-UID unless an explicit whitelist
> > +              * policy allows the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->uid))
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             if (!uid_eq(old->suid, new->suid))
> > +                     return check_uid_transition(old->suid, new->suid);
> > +             break;
> > +     case LSM_SETID_RES:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * real UID, effective UID, or saved set-UID to anything but
> > +              * one of: the current real UID, the current effective UID or
> > +              * the current saved set-user-ID unless an explicit whitelist
> > +              * policy allows the transition.
> > +              */
> > +             if (!uid_eq(new->uid, old->uid) &&
> > +                     !uid_eq(new->uid, old->euid) &&
> > +                     !uid_eq(new->uid, old->suid)) {
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             }
> > +             if (!uid_eq(new->euid, old->uid) &&
> > +                     !uid_eq(new->euid, old->euid) &&
> > +                     !uid_eq(new->euid, old->suid)) {
> > +                     return check_uid_transition(old->euid, new->euid);
> > +             }
> > +             if (!uid_eq(new->suid, old->uid) &&
> > +                     !uid_eq(new->suid, old->euid) &&
> > +                     !uid_eq(new->suid, old->suid)) {
> > +                     return check_uid_transition(old->suid, new->suid);
> > +             }
> > +             break;
> > +     case LSM_SETID_FS:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * filesystem UID to anything but one of: the current real UID,
> > +              * the current effective UID or the current saved set-UID
> > +              * unless an explicit whitelist policy allows the transition.
> > +              */
> > +             if (!uid_eq(new->fsuid, old->uid)  &&
> > +                     !uid_eq(new->fsuid, old->euid)  &&
> > +                     !uid_eq(new->fsuid, old->suid) &&
> > +                     !uid_eq(new->fsuid, old->fsuid)) {
> > +                     return check_uid_transition(old->fsuid, new->fsuid);
> > +             }
> > +             break;
> > +     }
> > +     return 0;
> > +}
> > +
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > +{
> > +     struct entry *new;
> > +
> > +     /* Return if entry already exists */
> > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > +             return 0;
> > +
> > +     new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > +     if (!new)
> > +             return -ENOMEM;
> > +     new->parent_kuid = __kuid_val(parent);
> > +     new->child_kuid = __kuid_val(child);
> > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +     hash_add_rcu(safesetid_whitelist_hashtable,
> > +                  &new->next,
> > +                  __kuid_val(parent));
> > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +     return 0;
> > +}
> > +
> > +void flush_safesetid_whitelist_entries(void)
> > +{
> > +     struct entry *entry;
> > +     struct hlist_node *hlist_node;
> > +     unsigned int bkt_loop_cursor;
> > +     HLIST_HEAD(free_list);
> > +
> > +     /*
> > +      * Could probably use hash_for_each_rcu here instead, but this should
> > +      * be fine as well.
> > +      */
> > +     hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > +                        hlist_node, entry, next) {
> > +             spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +             hash_del_rcu(&entry->next);
> > +             spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +             hlist_add_head(&entry->dlist, &free_list);
> > +     }
> > +     synchronize_rcu();
> > +     hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist)
> > +             kfree(entry);
> > +}
> > +
> > +static struct security_hook_list safesetid_security_hooks[] = {
> > +     LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > +     LSM_HOOK_INIT(capable, safesetid_security_capable)
> > +};
> > +
> > +static int __init safesetid_security_init(void)
> > +{
> > +     security_add_hooks(safesetid_security_hooks,
> > +                        ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > +
> > +     return 0;
> > +}
> > +security_initcall(safesetid_security_init);
> > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > new file mode 100644
> > index 000000000000..bf78af9bf314
> > --- /dev/null
> > +++ b/security/safesetid/lsm.h
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#ifndef _SAFESETID_H
> > +#define _SAFESETID_H
> > +
> > +#include <linux/types.h>
> > +
> > +/* Function type. */
> > +enum safesetid_whitelist_file_write_type {
> > +     SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > +     SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > +};
> > +
> > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > +
> > +void flush_safesetid_whitelist_entries(void);
> > +
> > +#endif /* _SAFESETID_H */
> > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > new file mode 100644
> > index 000000000000..ff5fcf2c1b37
> > --- /dev/null
> > +++ b/security/safesetid/securityfs.c
> > @@ -0,0 +1,189 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#include <linux/security.h>
> > +#include <linux/cred.h>
> > +
> > +#include "lsm.h"
> > +
> > +static struct dentry *safesetid_policy_dir;
> > +
> > +struct safesetid_file_entry {
> > +     const char *name;
> > +     enum safesetid_whitelist_file_write_type type;
> > +     struct dentry *dentry;
> > +};
> > +
> > +static struct safesetid_file_entry safesetid_files[] = {
> > +     {.name = "add_whitelist_policy",
> > +      .type = SAFESETID_WHITELIST_ADD},
> > +     {.name = "flush_whitelist_policies",
> > +      .type = SAFESETID_WHITELIST_FLUSH},
> > +};
> > +
> > +/*
> > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > + * variables pointed to by 'parent' and 'child' will get updated but this
> > + * function will return an error.
> > + */
> > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > +                                         size_t len,
> > +                                         kuid_t *parent,
> > +                                         kuid_t *child)
> > +{
> > +     char *kern_buf;
> > +     char *parent_buf;
> > +     char *child_buf;
> > +     const char separator[] = ":";
> > +     int ret;
> > +     size_t first_substring_length;
> > +     long parsed_parent;
> > +     long parsed_child;
> > +
> > +     /* Duplicate string from user memory and NULL-terminate */
> > +     kern_buf = memdup_user_nul(buf, len);
> > +     if (IS_ERR(kern_buf))
> > +             return PTR_ERR(kern_buf);
> > +
> > +     /*
> > +      * Format of |buf| string should be <UID>:<UID>.
> > +      * Find location of ":" in kern_buf (copied from |buf|).
> > +      */
> > +     first_substring_length = strcspn(kern_buf, separator);
> > +     if (first_substring_length == 0 || first_substring_length == len) {
> > +             ret = -EINVAL;
> > +             goto free_kern;
> > +     }
> > +
> > +     parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > +     if (!parent_buf) {
> > +             ret = -ENOMEM;
> > +             goto free_kern;
> > +     }
> > +
> > +     ret = kstrtol(parent_buf, 0, &parsed_parent);
> > +     if (ret)
> > +             goto free_both;
> > +
> > +     child_buf = kern_buf + first_substring_length + 1;
> > +     ret = kstrtol(child_buf, 0, &parsed_child);
> > +     if (ret)
> > +             goto free_both;
> > +
> > +     *parent = make_kuid(current_user_ns(), parsed_parent);
> > +     if (!uid_valid(*parent)) {
> > +             ret = -EINVAL;
> > +             goto free_both;
> > +     }
> > +
> > +     *child = make_kuid(current_user_ns(), parsed_child);
> > +     if (!uid_valid(*child)) {
> > +             ret = -EINVAL;
> > +             goto free_both;
> > +     }
> > +
> > +free_both:
> > +     kfree(parent_buf);
> > +free_kern:
> > +     kfree(kern_buf);
> > +     return ret;
> > +}
> > +
> > +static ssize_t safesetid_file_write(struct file *file,
> > +                                 const char __user *buf,
> > +                                 size_t len,
> > +                                 loff_t *ppos)
> > +{
> > +     struct safesetid_file_entry *file_entry =
> > +             file->f_inode->i_private;
> > +     kuid_t parent;
> > +     kuid_t child;
> > +     int ret;
> > +
> > +     if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> > +             return -EPERM;
> > +
> > +     if (*ppos != 0)
> > +             return -EINVAL;
> > +
> > +     if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
> > +             flush_safesetid_whitelist_entries();
> > +             return len;
> > +     }
> > +
> > +     /*
> > +      * If we get to here, must be the case that file_entry->type equals
> > +      * SAFESETID_WHITELIST_ADD
> > +      */
> > +     ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > +                                                      &child);
> > +     if (ret)
> > +             return ret;
> > +
> > +     ret = add_safesetid_whitelist_entry(parent, child);
> > +     if (ret)
> > +             return ret;
> > +
> > +     /* Return len on success so caller won't keep trying to write */
> > +     return len;
> > +}
> > +
> > +static const struct file_operations safesetid_file_fops = {
> > +     .write = safesetid_file_write,
> > +};
> > +
> > +static void safesetid_shutdown_securityfs(void)
> > +{
> > +     int i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +             struct safesetid_file_entry *entry =
> > +                     &safesetid_files[i];
> > +             securityfs_remove(entry->dentry);
> > +             entry->dentry = NULL;
> > +     }
> > +
> > +     securityfs_remove(safesetid_policy_dir);
> > +     safesetid_policy_dir = NULL;
> > +}
> > +
> > +static int __init safesetid_init_securityfs(void)
> > +{
> > +     int i;
> > +     int ret;
> > +
> > +     safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > +     if (IS_ERR(safesetid_policy_dir)) {
> > +             ret = PTR_ERR(safesetid_policy_dir);
> > +             goto error;
> > +     }
> > +
> > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +             struct safesetid_file_entry *entry =
> > +                     &safesetid_files[i];
> > +             entry->dentry = securityfs_create_file(
> > +                     entry->name, 0200, safesetid_policy_dir,
> > +                     entry, &safesetid_file_fops);
> > +             if (IS_ERR(entry->dentry)) {
> > +                     ret = PTR_ERR(entry->dentry);
> > +                     goto error;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +
> > +error:
> > +     safesetid_shutdown_securityfs();
> > +     return ret;
> > +}
> > +fs_initcall(safesetid_init_securityfs);
> >
>
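
For readers following the quoted securityfs.c: the add_whitelist_policy
file takes a single "<UID>:<UID>" string. Below is a minimal userspace
sketch of that parsing step, not the patch's code. The names
parse_uid_field() and parse_policy_line() are hypothetical, and the sketch
is stricter than the kernel version: it accepts decimal only (the patch
passes base 0 to kstrtol) and rejects a trailing newline. As in the patch,
'parent' may already have been written when an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse one decimal UID that must be followed by 'terminator'.
 * On success, *rest points at the terminator character.
 */
static int parse_uid_field(const char *s, char terminator,
			   uint32_t *out, const char **rest)
{
	char *end;
	unsigned long long v;

	/* Reject empty fields, signs, and leading whitespace. */
	if (*s < '0' || *s > '9')
		return -EINVAL;

	errno = 0;
	v = strtoull(s, &end, 10);
	/* (uid_t)-1 is reserved as the invalid UID, so exclude it too. */
	if (errno || *end != terminator || v >= UINT32_MAX)
		return -EINVAL;

	*out = (uint32_t)v;
	*rest = end;
	return 0;
}

/* Parse "<parent UID>:<child UID>"; returns 0 or -EINVAL. */
int parse_policy_line(const char *line, uint32_t *parent, uint32_t *child)
{
	const char *rest;
	int ret;

	assert(line && parent && child);

	ret = parse_uid_field(line, ':', parent, &rest);
	if (ret)
		return ret;
	/* Skip the ':' separator and parse the child UID up to the NUL. */
	return parse_uid_field(rest + 1, '\0', child, &rest);
}
```

With this sketch, "1000:2000" parses to parent 1000 and child 2000, while
"1000", ":2000" and "1000:" are all rejected with -EINVAL.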

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 19:02                   ` Casey Schaufler
@ 2018-11-02 19:22                     ` Serge E. Hallyn
  2018-11-08 20:53                       ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Serge E. Hallyn @ 2018-11-02 19:22 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Serge E. Hallyn, Micah Morton, jmorris, Kees Cook, linux-security-module

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 11/2/2018 11:30 AM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> >
> >> Let me suggest a change to the way your LSM works
> >> that would reduce my concerns. Rather than refusing to
> >> make a UID change that isn't on your whitelist, kill a
> >> process that makes a prohibited request. This mitigates
> >> the problem where a process doesn't check for an error
> >> return. Sure, your system will be harder to get running
> >> until your whitelist is complete, but you'll avoid a
> >> whole category of security bugs.
> > Might also consider not restricting CAP_SETUID, but instead adding a
> > new CAP_SETUID_RANGE capability.  That way you can be sure there will be
> > no regressions with any programs which run with CAP_SETUID.
> >
> > Though that violates what Casey was just arguing halfway up the email.
> 
> I know that it's hard to believe 20 years after the fact,
> but the POSIX group worked very hard to ensure that the granularity
> of capabilities was correct for the security policy that the
> interfaces defined in P1003.1. What would CAP_SETUID_RANGE mean?

CAP_SETUID would mean you can switch to any uid.

CAP_SETUID_RANGE would mean you could make the transitions which have
been defined through <handwave> some mechanism.  Be it prctl, or some
new security.uidrange xattr, or the mechanism Micah proposed, except
it only applies for CAP_SETUID_RANGE not CAP_SETUID.
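
To make the distinction concrete, here is a rough userspace sketch of the
decision being described. Everything in it is an assumption for
illustration: CAP_SETUID_RANGE does not exist in the kernel, and the enum,
struct, and may_setuid() names are made up. It also ignores the
real/effective/saved UID rules and only treats "no change" as always
allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical privilege levels; only PRIV_SETUID maps to a real capability. */
enum setuid_priv {
	PRIV_NONE,		/* no setuid-related capability */
	PRIV_SETUID_RANGE,	/* proposed: only listed transitions */
	PRIV_SETUID,		/* CAP_SETUID: any transition */
};

/* One allowed transition, however it ends up being configured. */
struct uid_transition {
	uint32_t parent;
	uint32_t child;
};

bool may_setuid(enum setuid_priv priv,
		const struct uid_transition *table, size_t n,
		uint32_t from, uint32_t to)
{
	size_t i;

	assert(table || n == 0);

	/* Staying at the same UID never needs privilege. */
	if (from == to)
		return true;

	switch (priv) {
	case PRIV_SETUID:
		return true;
	case PRIV_SETUID_RANGE:
		/* Allowed only if the exact (from, to) pair is listed. */
		for (i = 0; i < n; i++)
			if (table[i].parent == from && table[i].child == to)
				return true;
		return false;
	default:
		return false;
	}
}
```

The point of the split is that existing CAP_SETUID users keep the first
branch unchanged, so they cannot regress.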

-serge

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 18:19               ` Casey Schaufler
  2018-11-02 18:30                 ` Serge E. Hallyn
@ 2018-11-02 19:28                 ` Micah Morton
  2018-11-06 19:09                 ` [PATCH v2] " mortonm
  2 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-02 19:28 UTC (permalink / raw)
  To: casey; +Cc: serge, jmorris, Kees Cook, linux-security-module

On Fri, Nov 2, 2018 at 11:19 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 11/2/2018 10:12 AM, Micah Morton wrote:
> > On Fri, Nov 2, 2018 at 9:05 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 11/1/2018 12:52 PM, Micah Morton wrote:
> >>> On Thu, Nov 1, 2018 at 10:08 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >>>> On 11/1/2018 9:11 AM, Micah Morton wrote:
> >>>>> On Wed, Oct 31, 2018 at 11:07 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> >>>>>> On Wed, Oct 31, 2018 at 09:02:45PM +0000, Serge E. Hallyn wrote:
> >>>>>>> Quoting mortonm@chromium.org (mortonm@chromium.org):
> >>>>>>>> From: Micah Morton <mortonm@chromium.org>
> >>>>>>>>
> >>>>>>>> SafeSetID gates the setid family of syscalls to restrict UID/GID
> >>>>>>>> transitions from a given UID/GID to only those approved by a
> >>>>>>>> system-wide whitelist. These restrictions also prohibit the given
> >>>>>>>> UIDs/GIDs from obtaining auxiliary privileges associated with
> >>>>>>>> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> >>>>>>>> mappings. For now, only gating the set*uid family of syscalls is
> >>>>>>>> supported, with support for set*gid coming in a future patch set.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>> NOTE: See the TODO above setuid_syscall() in lsm.c for an aspect of this
> >>>>>>>> code that likely needs improvement before being an acceptable approach.
> >>>>>>>> I'm specifically interested to see if there are better ideas for how
> >>>>>>>> this could be done.
> >>>>>>>>
> >>>>>>>>  Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
> >>>>>>>>  Documentation/admin-guide/LSM/index.rst     |   1 +
> >>>>>>>>  arch/Kconfig                                |   5 +
> >>>>>>>>  arch/arm/Kconfig                            |   1 +
> >>>>>>>>  arch/arm64/Kconfig                          |   1 +
> >>>>>>>>  arch/x86/Kconfig                            |   1 +
> >>>>>>>>  security/Kconfig                            |   1 +
> >>>>>>>>  security/Makefile                           |   2 +
> >>>>>>>>  security/safesetid/Kconfig                  |  13 +
> >>>>>>>>  security/safesetid/Makefile                 |   7 +
> >>>>>>>>  security/safesetid/lsm.c                    | 334 ++++++++++++++++++++
> >>>>>>>>  security/safesetid/lsm.h                    |  30 ++
> >>>>>>>>  security/safesetid/securityfs.c             | 189 +++++++++++
> >>>>>>>>  13 files changed, 679 insertions(+)
> >>>>>>>>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>>  create mode 100644 security/safesetid/Kconfig
> >>>>>>>>  create mode 100644 security/safesetid/Makefile
> >>>>>>>>  create mode 100644 security/safesetid/lsm.c
> >>>>>>>>  create mode 100644 security/safesetid/lsm.h
> >>>>>>>>  create mode 100644 security/safesetid/securityfs.c
> >>>>>>>>
> >>>>>>>> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>> new file mode 100644
> >>>>>>>> index 000000000000..e7d072124424
> >>>>>>>> --- /dev/null
> >>>>>>>> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> >>>>>>>> @@ -0,0 +1,94 @@
> >>>>>>>> +=========
> >>>>>>>> +SafeSetID
> >>>>>>>> +=========
> >>>>>>>> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> >>>>>>>> +UID/GID transitions from a given UID/GID to only those approved by a
> >>>>>>>> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> >>>>>>>> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> >>>>>>>> +allowing a user to set up user namespace UID mappings.
> >>>>>>>> +
> >>>>>>>> +
> >>>>>>>> +Background
> >>>>>>>> +==========
> >>>>>>>> +In absence of file capabilities, processes spawned on a Linux system that need
> >>>>>>>> +to switch to a different user must be spawned with CAP_SETUID privileges.
> >>>>>>>> +CAP_SETUID is granted to programs running as root or those running as a non-root
> >>>>>>>> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> >>>>>>>> +often preferable to use Linux runtime capabilities rather than file
> >>>>>>>> +capabilities, since using file capabilities to run a program with elevated
> >>>>>>>> +privileges opens up possible security holes since any user with access to the
> >>>>>>>> +file can exec() that program to gain the elevated privileges.
> >>>>>>> Not true, see inheritable capabilities.  You also might look at ambient
> >>>>>>> capabilities.
> >>>>>> So for example with pam_cap.so you could have your N uids each be given
> >>>>>> the desired pI, and assign the corresponding fIs to the files they should
> >>>>>> be able to exec with privilege.  No other uids will run those files with
> >>>>>> privilege.  *1
> >>>>> Sorry, what are "pl" and "fls" here? "Privilege level" and "files"?
> >>>>>
> >>>>>> Can you give some more details about exactly how you see SafeSetID being
> >>>>>> used?
> >>>>> Sure. The main use case for this LSM is to allow a non-root program to
> >>>>> transition to other untrusted uids without full blown CAP_SETUID
> >>>>> capabilities. The non-root program would still need CAP_SETUID to do
> >>>>> any kind of transition, but the additional restrictions imposed by
> >>>>> this LSM would mean it is a "safer" version of CAP_SETUID since the
> >>>>> non-root program cannot take advantage of CAP_SETUID to do any
> >>>>> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> >>>>> namespace). The higher level goal is to allow for uid-based sandboxing
> >>>>> of system services without having to give out CAP_SETUID all over the
> >>>>> place just so that non-root programs can drop to
> >>>>> even-further-non-privileged uids. This is especially relevant when one
> >>>>> non-root daemon on the system should be allowed to spawn other
> >>>>> processes as different uids, but it's undesirable to give the daemon a
> >>>>> basically-root-equivalent CAP_SETUID.
> >>>> I don't want to sound stupid(er than usual), but it sounds like
> >>>> you could do all this using setuid bits prudently. Based on this
> >>>> description, I don't see that anything new is needed.
> >>> There are situations where setuid bits don't get the job done, as
> >>> there are many situations where a program just wants to call setuid as
> >>> part of its execution (or fork + setuid without exec), instead of
> >>> fork/exec'ing a setuid binary.
> >> Yes, I understand that.
> >>
> >>> Take the following scenario for
> >>> example: init script (as root) spawns a network manager program as uid
> >>> 1000
> >> So far, so good.
> >>
> >>> and then the network manager spawns OpenVPN. The common mode of
> >>> operation for OpenVPN is to start running as the uid it was spawned
> >>> with (1000) at startup, but then drop to a lesser-privileged uid (e.g.
> >>> 2000) after initialization/setup by calling setuid.
> >> OK. That's an operation that does and ought to require privilege.
> > Sure, but the idea behind this LSM is that full CAP_SETUID
> > capabilities are a lot more privilege than is necessary in this
> > scenario.
>
> I'll start by pointing out that CAP_SETUID is about the finest grained
> capability there is. It's very precise in what it allows. I think that
> your concern is about the worst case scenario, which is setting the
> effective UID to 0, and hence gaining all privilege.

Yes, that plus the other powers granted by CAP_SETUID apart from
calling one of the set*uid functions:
https://elixir.bootlin.com/linux/latest/ident/CAP_SETUID (e.g. setting
up user ns mappings).

>
>
> >>> This is something
> >>> setuid bits wouldn't help with, without refactoring OpenVPN.
> >> You're correct.
> >>
> >>> So one
> >>> option here is to give the network manager CAP_SETUID, which will be
> >>> inherited by OpenVPN, and then OpenVPN drops to uid 2000 and drops
> >>> CAP_SETUID (would probably require patching OpenVPN for the capability
> >>> dropping).
> >> Or, you put CAP_SETUID on the file capabilities for OpenVPN,
> >> which is the way the P1003.1e DRAFT specification would have
> >> you accomplish this. Unfortunately, with all the changes made
> >> to capabilities for namespaces and all I'm not 100% sure I
> >> could say exactly how to set that.
> >>
> >>> The problem here is that if the network manager itself is
> >>> untrusted and exploitable, then giving it unrestricted CAP_SETUID is a
> >>> big security risk.
> >> Right. That's why you set the file capabilities on OpenVPN.
> > So it seems like you're suggesting that any time a program needs to
> > switch user by calling setuid,
>
> ... in a way that requires CAP_SETUID ...
>
> > that it should get full CAP_SETUID
> > capabilities (whether that's through setting file capabilities on the
> > binary or inheriting CAP_SETUID from a parent process or otherwise).
>
> Yup. That's correct. With all the duties and responsibilities associated
> with the dangers of UID management. Changing UIDs shouldn't be done
> lightly and needs to be done carefully.
>
> > But that brings us back to the basic problem this LSM is trying to
> > solve. Namely, we don't want to sprinkle unrestricted CAP_SETUID privs
> > all over the system for binaries that just want to switch to specific
> > uid[s] and don't need any of the root-equivalent privileges provided
> > by CAP_SETUID.
>
> I would see marking a program with a list of UIDs it can run with or
> that its children can run with as a better solution. You get much
> better locality of reference that way.

AFAICT in this scenario an exploited program could still be tricked
(e.g. command injection) into doing the unapproved actions.

>
> >>> Even just sticking with the network manager / VPN
> >>> example, strongSwan VPN also uses the same drop-to-user-through-setuid
> >>> setup, as do other Linux applications.
> >> Same solution.
> >>
> >>> Refactoring these applications
> >>> to fork/exec setuid binaries instead of simply calling setuid is often
> >>> infeasible. So a direct call to setuid is often necessary/expected,
> >>> and setuid bits don't help here.
> >> What is it with kids these days, that they are so
> >> afraid of fixing code that needs fixing? But that's
> >> not necessary in this example.
> >>
> >>> Also, use of setuid bits precludes the use of the no_new_privs bit,
> >>> which is usually at least a nice-to-have (if not need-to-have) for
> >>> sandboxed processes on the system.
> >> But you've already said that you *want* to change the security state,
> >> "drop to a lesser-privileged uid", so you're already mucking with the
> >> sandbox. If you're going to say that changing UIDs doesn't count for
> >> sandboxing I'll point out that you brought up the notion of a
> >> lesser-privileged UID.
> > There are plenty of ways that non-root processes further restrict
> > especially vulnerable parts of their code to even lesser-privileged
> > contexts. But it's often easier to reason about the security of such
> > applications if the no_new_privs bit is set and file capabilities are
> > avoided, so the application can have full control of which privileges
> > are given to spawned processes without having to worry about which
> > privileges are attached to which files. Granted, the no_new_privs
> > issue is less central to the LSM being proposed here compared to the
> > discussion above.
>
> Let me suggest a change to the way your LSM works
> that would reduce my concerns. Rather than refusing to
> make a UID change that isn't on your whitelist, kill a
> process that makes a prohibited request. This mitigates
> the problem where a process doesn't check for an error
> return. Sure, your system will be harder to get running
> until your whitelist is complete, but you'll avoid a
> whole category of security bugs.

That's a valid point. ATM I can't think of any reason I'd be opposed to that.
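
The class of bug discussed above, a caller that ignores a failed set*uid()
call and keeps running with the privileges it meant to shed, can also be
avoided in userspace regardless of how the LSM reports a denial. A minimal
sketch, assuming hypothetical helpers named drop_to_uid() and
drop_to_uid_or_die() that are not part of this patch:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): switch to target_uid and
 * verify that the real and effective UIDs actually changed.
 * Returns 0 on success, -1 on failure with errno set.
 */
int drop_to_uid(uid_t target_uid)
{
	if (setuid(target_uid) != 0)
		return -1;

	/* Re-read the credentials rather than trusting the call alone. */
	if (getuid() != target_uid || geteuid() != target_uid) {
		errno = EPERM;
		return -1;
	}
	return 0;
}

/* Abort instead of continuing with privileges the caller meant to shed. */
void drop_to_uid_or_die(uid_t target_uid)
{
	if (drop_to_uid(target_uid) != 0) {
		perror("drop_to_uid");
		abort();
	}
}
```

A caller that always goes through drop_to_uid_or_die() never reaches its
less-trusted code path with the old credentials still in place.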

>

* [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2018-11-02 18:19               ` Casey Schaufler
  2018-11-02 18:30                 ` Serge E. Hallyn
  2018-11-02 19:28                 ` [PATCH] " Micah Morton
@ 2018-11-06 19:09                 ` mortonm
  2 siblings, 0 replies; 88+ messages in thread
From: mortonm @ 2018-11-06 19:09 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Added a line in setuid_policy_warning to kill processes which violate
setid whitelist policies. This prevents potential security
vulnerabilities that could arise from a missing whitelist entry
preventing a privileged process from dropping to a lesser-privileged
one.

 Documentation/admin-guide/LSM/SafeSetID.rst |  94 ++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 arch/Kconfig                                |   5 +
 arch/arm/Kconfig                            |   1 +
 arch/arm64/Kconfig                          |   1 +
 arch/x86/Kconfig                            |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  13 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 342 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 ++
 security/safesetid/securityfs.c             | 189 +++++++++++
 13 files changed, 687 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..e7d072124424
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,94 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, because using file capabilities to run a program with elevated
+privileges opens up possible security holes: any user with access to the file
+can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
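
As an illustration of the "Directions for use" text above, a policy could be
installed from a privileged userspace program roughly as follows. This is a
sketch under assumptions: the helper name safesetid_write_policy() is
hypothetical, and the /sys/kernel/security prefix is only the conventional
securityfs mount point, since the documentation just says "the location where
securityfs is mounted".

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed path: adjust if securityfs is mounted somewhere else. */
#define SAFESETID_ADD_POLICY \
	"/sys/kernel/security/safesetid/add_whitelist_policy"

/*
 * Hypothetical helper: write one '<parent>:<child>' policy to path using a
 * single write() call, since the handler parses each write() as one
 * complete policy. Returns 0 on success, -1 on error.
 */
int safesetid_write_policy(const char *path, unsigned int parent,
			   unsigned int child)
{
	char buf[32];
	int len, fd;
	ssize_t written;

	len = snprintf(buf, sizeof(buf), "%u:%u", parent, child);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, buf, (size_t)len);
	close(fd);
	return written == len ? 0 : -1;
}
```

Calling safesetid_write_policy(SAFESETID_ADD_POLICY, 123, 456) would then
request the '123:456' entry used as the example in the documentation; the
write handler in this patch requires CAP_SYS_ADMIN.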
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index c980dfe9abf1..a0c387649e12 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/arch/Kconfig b/arch/Kconfig
index 1aa59063f1fd..c87070807ba2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -381,6 +381,11 @@ config ARCH_WANT_OLD_COMPAT_IPC
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	bool
 
+config HAVE_SAFESETID
+	bool
+	help
+	  An architecture should select this if the SafeSetID LSM supports it.
+
 config HAVE_ARCH_SECCOMP_FILTER
 	bool
 	help
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 843edfd000be..35b1a772c971 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -92,6 +92,7 @@ config ARM
 	select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UID16
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..2c6f5ec3a55e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -127,6 +127,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 887d3a7bb646..a6527d6c0426 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86_64
 	select ARCH_SUPPORTS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
+	select HAVE_SAFESETID
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select SWIOTLB
diff --git a/security/Kconfig b/security/Kconfig
index c4302067a3ad..7d9008ad5903 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -237,6 +237,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/safesetid/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..88209d827832 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)	+= safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)	+= safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..4ff82c7ed273
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,13 @@
+config SECURITY_SAFESETID
+	bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+	depends on HAVE_SAFESETID
+	default n
+	help
+	  SafeSetID is an LSM module that gates the setid family of syscalls to
+	  restrict UID/GID transitions from a given UID/GID to only those
+	  approved by a system-wide whitelist. These restrictions also prohibit
+	  the given UIDs/GIDs from obtaining auxiliary privileges associated
+	  with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+	  UID mappings.
+
+	  If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..32040f8db7ce
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+/*
+ * TODO: Figuring out whether the current syscall number (saved on the kernel
+ * stack) is one of the set*uid syscalls is an operation that requires checking
+ * the number against arch-specific constants as seen below. The need for this
+ * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
+ * implement an arch-specific function that gets called from this file and
+ * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
+ * selected for architectures that implement the function? Any other ideas?
+ */
+static bool setuid_syscall(int num)
+{
+#ifdef CONFIG_X86_64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_ia32_setreuid32 ||
+	      num == __NR_ia32_setuid ||
+	      num == __NR_ia32_setresuid ||
+	      num == __NR_ia32_setresuid32 ||
+	      num == __NR_ia32_setuid32))
+		return false;
+#else
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setreuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setfsuid32 ||
+	      num == __NR_compat_setuid ||
+	      num == __NR_compat_setreuid ||
+	      num == __NR_compat_setfsuid ||
+	      num == __NR_compat_setresuid ||
+	      num == __NR_compat_setreuid32 ||
+	      num == __NR_compat_setresuid32 ||
+	      num == __NR_compat_setuid32 ||
+	      num == __NR_compat_setfsuid32))
+		return false;
+#else
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM
+	if (!(num == __NR_setreuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setfsuid32))
+		return false;
+#else
+	BUILD_BUG();
+#endif
+	return true;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      int audit)
+{
+	/* The current->mm check will fail if this is a kernel thread. */
+	if (cap == CAP_SETUID &&
+	    current->mm &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		/*
+		 * syscall_get_nr can theoretically return 0 or -1, but that
+		 * would signify that the syscall is being aborted due to a
+		 * signal, so we don't need to check for this case here.
+		 */
+		if (!(setuid_syscall(syscall_get_nr(current,
+						    current_pt_regs()))))
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			return -EPERM;
+	}
+	return 0;
+}
+
+static void setuid_policy_warning(kuid_t parent, kuid_t child)
+{
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	do_exit(SIGKILL);
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	setuid_policy_warning(parent, child);
+	return -EPERM;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+security_initcall(safesetid_security_init);
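
The two lookups in lsm.c above (is there any policy for this UID, and is this
exact parent/child pair whitelisted) can be modeled in userspace to make the
decision logic easier to follow. The sketch below is a simplification and not
the kernel code: it uses a plain array instead of the RCU hash table, all
names are hypothetical, and it covers only the real-UID check of the
LSM_SETID_ID case.

```c
#include <stdbool.h>
#include <stddef.h>

/* One whitelist entry: 'parent' may change its UID to 'child'. */
struct policy_entry {
	unsigned int parent;
	unsigned int child;
};

/* True if at least one policy names uid as a parent (uid is restricted). */
bool uid_is_restricted(const struct policy_entry *tbl, size_t n,
		       unsigned int uid)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].parent == uid)
			return true;
	return false;
}

/* True if the exact parent -> child pair is on the whitelist. */
bool transition_is_whitelisted(const struct policy_entry *tbl, size_t n,
			       unsigned int parent, unsigned int child)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].parent == parent && tbl[i].child == child)
			return true;
	return false;
}

/*
 * Model of the real-UID part of the LSM_SETID_ID case: unrestricted UIDs
 * are left alone, an unchanged UID is fine, and anything else needs a
 * whitelist entry. "Allowed" only means this LSM would not object; the
 * normal kernel permission checks still apply.
 */
bool real_uid_change_allowed(const struct policy_entry *tbl, size_t n,
			     unsigned int old_uid, unsigned int new_uid)
{
	if (!uid_is_restricted(tbl, n, old_uid))
		return true;
	if (old_uid == new_uid)
		return true;
	return transition_is_whitelisted(tbl, n, old_uid, new_uid);
}
```

With the entries {123, 456} and {123, 789}, UID 123 could move to 456 or 789
but not to 0, while a UID with no entries at all is not restricted by the
module.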
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..ff5fcf2c1b37
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NULL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
+		flush_safesetid_whitelist_entries();
+		return len;
+	}
+
+	/*
+	 * If we get to here, must be the case that file_entry->type equals
+	 * SAFESETID_WHITELIST_ADD
+	 */
+	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+							 &child);
+	if (ret)
+		return ret;
+
+	ret = add_safesetid_whitelist_entry(parent, child);
+	if (ret)
+		return ret;
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
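
For testing policy strings before they are written to securityfs, the
'<UID>:<UID>' parsing done by parse_safesetid_whitelist_policy() above can be
approximated in userspace. This is a sketch, not a copy of the kernel logic:
parse_uid() and parse_policy() are hypothetical names, strtoul() stands in for
kstrtol(), and signed input is rejected outright, which is stricter than the
patch.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one UID with base auto-detection (decimal, 0x hex, 0 octal).
 * A single trailing newline is accepted, as kstrtol() does.
 */
static int parse_uid(const char *s, unsigned int *out)
{
	char *end;
	unsigned long val;

	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno != 0 || end == s)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	/* (uid_t)-1 is the invalid UID, so reject it and anything larger. */
	if (val >= UINT_MAX)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}

/*
 * Split "<parent>:<child>" at the first ':' and parse both halves.
 * As in the kernel version, *parent may already be updated when the
 * function returns an error for the child.
 */
int parse_policy(const char *line, unsigned int *parent, unsigned int *child)
{
	char parent_buf[32];
	size_t sep = strcspn(line, ":");
	int ret;

	if (sep == 0 || line[sep] == '\0' || sep >= sizeof(parent_buf))
		return -EINVAL;

	memcpy(parent_buf, line, sep);
	parent_buf[sep] = '\0';

	ret = parse_uid(parent_buf, parent);
	if (ret)
		return ret;
	return parse_uid(line + sep + 1, child);
}
```
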
-- 
2.19.1.930.g4563a0d9d0-goog


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-01 16:11     ` Micah Morton
                         ` (2 preceding siblings ...)
  2018-11-01 17:08       ` Casey Schaufler
@ 2018-11-06 20:59       ` James Morris
  2018-11-06 21:21         ` [PATCH v3] " mortonm
  3 siblings, 1 reply; 88+ messages in thread
From: James Morris @ 2018-11-06 20:59 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, Kees Cook, linux-security-module

On Thu, 1 Nov 2018, Micah Morton wrote:

> > Can you give some more details about exactly how you see SafeSetID being
> > used?
> 
> Sure. The main use case for this LSM is to allow a non-root program to
> transition to other untrusted uids without full blown CAP_SETUID
> capabilities. The non-root program would still need CAP_SETUID to do
> any kind of transition, but the additional restrictions imposed by
> this LSM would mean it is a "safer" version of CAP_SETUID since the
> non-root program cannot take advantage of CAP_SETUID to do any
> unapproved actions (i.e. setuid to uid 0 or create/enter new user
> namespace). The higher level goal is to allow for uid-based sandboxing
> of system services without having to give out CAP_SETUID all over the
> place just so that non-root programs can drop to
> even-further-non-privileged uids. This is especially relevant when one
> non-root daemon on the system should be allowed to spawn other
> processes as different uids, but it's undesirable to give the daemon a
> basically-root-equivalent CAP_SETUID.

Please include this use-case in the kernel documentation.



- James
-- 
James Morris
<jmorris@namei.org>


* [PATCH v3] LSM: add SafeSetID module that gates setid calls
  2018-11-06 20:59       ` [PATCH] " James Morris
@ 2018-11-06 21:21         ` mortonm
  0 siblings, 0 replies; 88+ messages in thread
From: mortonm @ 2018-11-06 21:21 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---

Added use-case description in the kernel documentation.

 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 arch/Kconfig                                |   5 +
 arch/arm/Kconfig                            |   1 +
 arch/arm64/Kconfig                          |   1 +
 arch/x86/Kconfig                            |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  13 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 342 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 ++
 security/safesetid/securityfs.c             | 189 +++++++++++
 13 files changed, 700 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, because using file capabilities to run a program with elevated
+privileges opens up possible security holes: any user with access to the
+file can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index c980dfe9abf1..a0c387649e12 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/arch/Kconfig b/arch/Kconfig
index 1aa59063f1fd..c87070807ba2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -381,6 +381,11 @@ config ARCH_WANT_OLD_COMPAT_IPC
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	bool
 
+config HAVE_SAFESETID
+	bool
+	help
+	  This option enables the SafeSetID LSM.
+
 config HAVE_ARCH_SECCOMP_FILTER
 	bool
 	help
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 843edfd000be..35b1a772c971 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -92,6 +92,7 @@ config ARM
 	select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UID16
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..2c6f5ec3a55e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -127,6 +127,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 887d3a7bb646..a6527d6c0426 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86_64
 	select ARCH_SUPPORTS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
+	select HAVE_SAFESETID
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select SWIOTLB
diff --git a/security/Kconfig b/security/Kconfig
index c4302067a3ad..7d9008ad5903 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -237,6 +237,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/safesetid/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..88209d827832 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)	+= safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)	+= safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..4ff82c7ed273
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,13 @@
+config SECURITY_SAFESETID
+	bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+	depends on HAVE_SAFESETID
+	default n
+	help
+	  SafeSetID is an LSM module that gates the setid family of syscalls to
+	  restrict UID/GID transitions from a given UID/GID to only those
+	  approved by a system-wide whitelist. These restrictions also prohibit
+	  the given UIDs/GIDs from obtaining auxiliary privileges associated
+	  with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+	  UID mappings.
+
+	  If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..32040f8db7ce
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+/*
+ * TODO: Figuring out whether the current syscall number (saved on the kernel
+ * stack) is one of the set*uid syscalls is an operation that requires checking
+ * the number against arch-specific constants as seen below. The need for this
+ * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
+ * implement an arch-specific function that gets called from this file and
+ * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
+ * selected for architectures that implement the function? Any other ideas?
+ */
+static bool setuid_syscall(int num)
+{
+#ifdef CONFIG_X86_64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_ia32_setreuid32 ||
+	      num == __NR_ia32_setuid ||
+	      num == __NR_ia32_setresuid ||
+	      num == __NR_ia32_setresuid32 ||
+	      num == __NR_ia32_setuid32))
+		return false;
+#else
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setreuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setfsuid32 ||
+	      num == __NR_compat_setuid ||
+	      num == __NR_compat_setreuid ||
+	      num == __NR_compat_setfsuid ||
+	      num == __NR_compat_setresuid ||
+	      num == __NR_compat_setreuid32 ||
+	      num == __NR_compat_setresuid32 ||
+	      num == __NR_compat_setuid32 ||
+	      num == __NR_compat_setfsuid32))
+		return false;
+#else
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM
+	if (!(num == __NR_setreuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setfsuid32))
+		return false;
+#else
+	BUILD_BUG();
+#endif
+	return true;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      int audit)
+{
+	/* The current->mm check will fail if this is a kernel thread. */
+	if (cap == CAP_SETUID &&
+	    current->mm &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		/*
+		 * syscall_get_nr can theoretically return 0 or -1, but that
+		 * would signify that the syscall is being aborted due to a
+		 * signal, so we don't need to check for this case here.
+		 */
+		if (!(setuid_syscall(syscall_get_nr(current,
+						    current_pt_regs()))))
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			return -1;
+	}
+	return 0;
+}
+
+static void setuid_policy_warning(kuid_t parent, kuid_t child)
+{
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	do_exit(SIGKILL);
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	setuid_policy_warning(parent, child);
+	return -1;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+security_initcall(safesetid_security_init);
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..ff5fcf2c1b37
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NUL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
+		flush_safesetid_whitelist_entries();
+		return len;
+	}
+
+	/*
+	 * If we get to here, must be the case that file_entry->type equals
+	 * SAFESETID_WHITELIST_ADD
+	 */
+	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+							 &child);
+	if (ret)
+		return ret;
+
+	ret = add_safesetid_whitelist_entry(parent, child);
+	if (ret)
+		return ret;
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-02 19:22                     ` Serge E. Hallyn
@ 2018-11-08 20:53                       ` Micah Morton
  2018-11-08 21:34                         ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2018-11-08 20:53 UTC (permalink / raw)
  To: serge; +Cc: casey, jmorris, Kees Cook, linux-security-module

It seems like the CAP_SETUID_RANGE idea proposed by Serge is mainly a
way to avoid silently breaking programs that run with CAP_SETUID,
which could cause security vulnerabilities. Serge, does Casey's
suggestion (killing processes that try to perform unapproved
transitions) make sense to you as an alternate way to safeguard
against this? Sure there could be regressions, but they would only
happen to users for which whitelist policies had been configured, and
killing processes should be an effective way of identifying any
missing whitelist policies on the system for some restricted user. One
less attractive thing about adding a CAP_SETUID_RANGE capability would
be that more of the common kernel code would have to be modified to
add a new capability, whereas currently the LSM just uses the LSM
hooks.

The other unresolved aspect here (discussed by Stephen above) is how
to "know" that ns_capable(..., CAP_SETUID) was called by sys_set*uid
and not some other kernel code path. The reason we need to know this
is to be able to distinguish id transitions from other privileged
actions (e.g. create/modify/enter user namespace). Certain transitions
should be allowed for whitelisted users, but the other privileged
actions should be denied (or else the security hardening provided by
this LSM is significantly weakened). Do people think the current
reliance on comparing the return value of syscall_get_nr() to
arch-specific syscall constants (e.g. __NR_setuid) is a deal-breaker
and we should find an arch-independent solution such as the one
proposed by Stephen? Or is checking against arch-specific constants
not a big deal and the code can stay as is?

On Fri, Nov 2, 2018 at 12:22 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>
> Quoting Casey Schaufler (casey@schaufler-ca.com):
> > On 11/2/2018 11:30 AM, Serge E. Hallyn wrote:
> > > Quoting Casey Schaufler (casey@schaufler-ca.com):
> > >
> > >> Let me suggest a change to the way your LSM works
> > >> that would reduce my concerns. Rather than refusing to
> > >> make a UID change that isn't on your whitelist, kill a
> > >> process that makes a prohibited request. This mitigates
> > >> the problem where a process doesn't check for an error
> > >> return. Sure, your system will be harder to get running
> > >> until your whitelist is complete, but you'll avoid a
> > >> whole category of security bugs.
> > > Might also consider not restricting CAP_SETUID, but instead adding a
> > > new CAP_SETUID_RANGE capability.  That way you can be sure there will be
> > > no regressions with any programs which run with CAP_SETUID.
> > >
> > > Though that violates what Casey was just arguing halfway up the email.
> >
> > I know that it's hard to believe 20 years after the fact,
> > but the POSIX group worked very hard to ensure that the granularity
> > of capabilities was correct for the security policy that the
> > interfaces defined in P1003.1. What would CAP_SETUID_RANGE mean?
>
> CAP_SETUID would mean you can switch to any uid.
>
> CAP_SETUID_RANGE would mean you could make the transitions which have
> been defined through <handwave> some mechanism.  Be it prctl, or some
> new security.uidrange xattr, or the mechanism Micah proposed, except
> it only applies for CAP_SETUID_RANGE not CAP_SETUID.
>
> -serge

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-08 20:53                       ` Micah Morton
@ 2018-11-08 21:34                         ` Casey Schaufler
  2018-11-09  0:30                           ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2018-11-08 21:34 UTC (permalink / raw)
  To: Micah Morton, serge; +Cc: jmorris, Kees Cook, linux-security-module

On 11/8/2018 12:53 PM, Micah Morton wrote:
> It seems like the CAP_SETUID_RANGE idea proposed by Serge is mainly a
> way to avoid silently breaking programs that run with CAP_SETUID,
> which could cause security vulnerabilities.

It wouldn't be "silent". You'd have to fix anything that needs
the new capability instead of the old.

> Serge, does Casey's
> suggestion (killing processes that try to perform unapproved
> transitions) make sense to you as an alternate way to safeguard
> against this? Sure there could be regressions, but they would only
> happen to users for which whitelist policies had been configured, and
> killing processes should be an effective way of identifying any
> missing whitelist policies on the system for some restricted user.

I'll point out that seccomp uses the "kill what you deny" approach.
It would be one thing if it was reasonable to assume that programs
are checking their error returns, but we know they're not.
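
A minimal userspace sketch of the failure mode in question, assuming
a program that switches uid with setuid(2). The helper names
switch_uid_checked() and switch_uid_or_die() are hypothetical and do
not come from the patch. A caller that ignores the return value keeps
running with its old credentials when the kernel refuses the change;
failing closed avoids that:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch: switch to @target and
 * verify the result. Returns 0 on success, -1 with errno set on failure.
 */
int switch_uid_checked(uid_t target)
{
	/* (uid_t)-1 means "leave unchanged" for setreuid()/setresuid(). */
	assert(target != (uid_t)-1);

	if (setuid(target) != 0)
		return -1;	/* errno was set by setuid() */

	/* Confirm both the real and effective uid are what was requested. */
	if (getuid() != target || geteuid() != target) {
		errno = EPERM;
		return -1;
	}
	return 0;
}

/* Fail closed: terminate rather than keep running with the old uid. */
void switch_uid_or_die(uid_t target)
{
	if (switch_uid_checked(target) != 0) {
		perror("setuid");
		abort();
	}
}
```

Killing the offending process in the kernel gives the same fail-closed
behaviour to programs that were never written this way.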

> One
> less attractive thing about adding a CAP_SETUID_RANGE capability would
> be that more of the common kernel code would have to be modified to
> add a new capability, whereas currently the LSM just uses the LSM
> hooks.

That's got to be a secondary consideration. If CAP_SETUID_RANGE
were the right solution the additional work would be worth doing.

> The other unresolved aspect here (discussed by Stephen above) is how
> to "know" that ns_capable(..., CAP_SETUID) was called by sys_set*uid
> and not some other kernel code path.

It sounds like you want a new LSM hook (security_uid_change?) to
put in that/those code path/paths. That would be a lot cleaner than
trying to infer the syscall.

> The reason we need to know this
> is to be able to distinguish id transitions from other privileged
> actions (e.g. create/modify/enter user namespace). Certain transitions
> should be allowed for whitelisted users, but the other privileged
> actions should be denied (or else the security hardening provided by
> this LSM is significantly weakened). Do people think the current
> reliance on comparing the return value of syscall_get_nr() to
> arch-specific syscall constants (e.g. __NR_setuid) is a deal-breaker
> and we should find an arch-independent solution such as the one
> proposed by Stephen?

The problem with inferring the syscall is that the mechanism for
doing so is subject to change and architectural magic.

> Or is checking against arch-specific constants
> not a big deal and the code can stay as is?

It's bad enough when we have to make LSM code check things
like what filesystem an inode relates to. I would say that
you really want a different approach.


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-08 21:34                         ` Casey Schaufler
@ 2018-11-09  0:30                           ` Micah Morton
  2018-11-09 23:21                             ` [PATCH] LSM: generalize flag passing to security_capable mortonm
  2018-11-21 16:54                             ` [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
  0 siblings, 2 replies; 88+ messages in thread
From: Micah Morton @ 2018-11-09  0:30 UTC (permalink / raw)
  To: casey; +Cc: serge, jmorris, Kees Cook, linux-security-module

On Thu, Nov 8, 2018 at 1:34 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 11/8/2018 12:53 PM, Micah Morton wrote:
> > It seems like the CAP_SETUID_RANGE idea proposed by Serge is mainly a
> > way to avoid silently breaking programs that run with CAP_SETUID,
> > which could cause security vulnerabilities.
>
> It wouldn't be "silent". You'd have to fix anything that needs
> the new capability instead of the old.
>
> > Serge, does Casey's
> > suggestion (killing processes that try to perform unapproved
> > transitions) make sense to you as an alternate way to safeguard
> > against this? Sure there could be regressions, but they would only
> > happen to users for which whitelist policies had been configured, and
> > killing processes should be an effective way of identifying any
> > missing whitelist policies on the system for some restricted user.
>
> I'll point out that seccomp uses the "kill what you deny" approach.
> It would be one thing if it was reasonable to assume that programs
> are checking their error returns, but we know they're not.

Totally agree

>
> > One
> > less attractive thing about adding a CAP_SETUID_RANGE capability would
> > be that more of the common kernel code would have to be modified to
> > add a new capability, whereas currently the LSM just uses the LSM
> > hooks.
>
> That's got to be a secondary consideration. If CAP_SETUID_RANGE
> were the right solution the additional work would be worth doing.
>
> > The other unresolved aspect here (discussed by Stephen above) is how
> > to "know" that ns_capable(..., CAP_SETUID) was called by sys_set*uid
> > and not some other kernel code path.
>
> It sounds like you want a new LSM hook (security_uid_change?) to
> put in that/those code path/paths. That would be a lot cleaner than
> trying to infer the syscall.

Providing a new LSM hook would be tricky, because there are places
(besides sys_set*id) that call ns_capable(..., CAP_SETUID) which we
would want to fail if there were whitelist restrictions for the
current uid: https://elixir.bootlin.com/linux/latest/ident/CAP_SETUID.
Even if we changed all non-sys_set*id code paths to call some hook
that will deny them if the current uid is restricted, I'm not sure
there would be any way to ensure that code paths added in the future
would call the hook to restrict themselves in the event that the user
is restricted. I guess our security_capable hook could always deny the
CAP_SETUID check for restricted users but then change the lines in
kernel/sys.c (e.g.
https://elixir.bootlin.com/linux/latest/source/kernel/sys.c#L584) to
do something like 'if (ns_capable(old->user_ns, CAP_SETUID) ||
security_uid_change(old->uid, new->uid))'. But that would be weird
since security_uid_change would have to redo all the checks that
ns_capable did (to basically say _yes_ if the current user has
CAP_SETUID _and_ the transition is allowed in the whitelist).
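
To make the redundancy concrete, here is a small userspace model of
that arrangement. Everything in it is an assumption for illustration:
the whitelist contents, struct task_model, and the model_* functions
are made up, and the model ignores the kernel rule that lets a task
without CAP_SETUID switch to its own real or saved uid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One permitted transition. The table below is made-up example policy. */
struct uid_transition {
	unsigned int from;
	unsigned int to;
};

static const struct uid_transition whitelist[] = {
	{ 1000, 1001 },
	{ 1000, 1002 },
};

#define WHITELIST_LEN (sizeof(whitelist) / sizeof(whitelist[0]))

/* Simplified task: just a uid and whether it holds CAP_SETUID. */
struct task_model {
	unsigned int uid;
	bool has_cap_setuid;
};

/* A uid is restricted if any whitelist entry names it as a source. */
bool uid_is_restricted(unsigned int uid)
{
	for (size_t i = 0; i < WHITELIST_LEN; i++)
		if (whitelist[i].from == uid)
			return true;
	return false;
}

/* True if the exact (from, to) pair appears in the whitelist. */
bool transition_allowed(unsigned int from, unsigned int to)
{
	for (size_t i = 0; i < WHITELIST_LEN; i++)
		if (whitelist[i].from == from && whitelist[i].to == to)
			return true;
	return false;
}

/* Stand-in for ns_capable(..., CAP_SETUID) when the capable hook
 * always denies CAP_SETUID to restricted uids. */
bool model_ns_capable_setuid(const struct task_model *t)
{
	assert(t != NULL);
	return t->has_cap_setuid && !uid_is_restricted(t->uid);
}

/* Stand-in for the hypothetical security_uid_change(). It has to
 * repeat the capability test that the check above just failed. */
bool model_security_uid_change(const struct task_model *t,
			       unsigned int new_uid)
{
	assert(t != NULL);
	return t->has_cap_setuid && transition_allowed(t->uid, new_uid);
}

/* The combined condition sketched for kernel/sys.c. */
bool model_setuid_permitted(const struct task_model *t,
			    unsigned int new_uid)
{
	return model_ns_capable_setuid(t) ||
	       model_security_uid_change(t, new_uid);
}
```

The has_cap_setuid test shows up on both sides of the ||, which is the
duplication that makes this approach awkward.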

I might be missing something, but Stephen's idea seems to make more
sense to me. We already pass the
SECURITY_CAP_AUDIT/SECURITY_CAP_NOAUDIT flags to the security_capable
hook to let it know whether to write an audit message for the check.
Seems like it would be easy to extend this to define more bits (e.g.
SECURITY_SETID_TRANSITION) in that flag or have a more general
mechanism where we pass some kind of struct to security_capable that
contains these flags and possibly other args like Stephen was
mentioning. I'll share a patch for that approach, unless I find some
reason it wouldn't be a good idea.
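
A rough userspace sketch of the "more bits in the flag" variant,
under stated assumptions: SECURITY_CAP_NOAUDIT/SECURITY_CAP_AUDIT keep
their existing values 0 and 1, the value 2 for
SECURITY_SETID_TRANSITION is invented here, and MODEL_CAP_SETUID,
model_uid_is_restricted() and model_safesetid_capable() are
hypothetical stand-ins rather than kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* CAP_SETUID is 7 in include/uapi/linux/capability.h. */
#define MODEL_CAP_SETUID		7

/* Existing values from include/linux/security.h. */
#define SECURITY_CAP_NOAUDIT		0
#define SECURITY_CAP_AUDIT		1
/* Hypothetical new bit; the name is from the mail, the value is assumed. */
#define SECURITY_SETID_TRANSITION	2

/* Made-up policy: uid 1000 is the only restricted user. */
bool model_uid_is_restricted(unsigned int uid)
{
	return uid == 1000;
}

/*
 * Model of a capable hook that receives the flags word.
 * Returns 0 to allow, -EPERM to deny.
 */
int model_safesetid_capable(unsigned int uid, int cap, unsigned int flags)
{
	/* Reject bits this model does not know about. */
	assert((flags & ~(SECURITY_CAP_AUDIT | SECURITY_SETID_TRANSITION)) == 0);

	/* Not our business: other capabilities and unrestricted users. */
	if (cap != MODEL_CAP_SETUID || !model_uid_is_restricted(uid))
		return 0;

	/*
	 * Called from a set*uid syscall: let the capability check pass.
	 * The target uid would still have to be vetted elsewhere.
	 */
	if (flags & SECURITY_SETID_TRANSITION)
		return 0;

	/* Any other use of CAP_SETUID, e.g. writing a uid_map. */
	return -EPERM;
}
```

One consequence of treating the argument as a bit mask: any hook that
compares the value against SECURITY_CAP_AUDIT for equality would need
to test the bit instead.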

>
> > The reason we need to know this
> > is to be able to distinguish id transitions from other privileged
> > actions (e.g. create/modify/enter user namespace). Certain transitions
> > should be allowed for whitelisted users, but the other privileged
> > actions should be denied (or else the security hardening provided by
> > this LSM is significantly weakened). Do people think the current
> > reliance on comparing the return value of syscall_get_nr() to
> > arch-specific syscall constants (e.g. __NR_setuid) is a deal-breaker
> > and we should find an arch-independent solution such as the one
> > proposed by Stephen?
>
> The problem with inferring the syscall is that the mechanism for
> doing so is subject to change and architectural magic.
>
> > Or is checking against arch-specific constants
> > not a big deal and the code can stay as is?
>
> It's bad enough when we have to make LSM code check things
> like what filesystem an inode relates to. I would say that
> you really want a different approach.

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH] LSM: generalize flag passing to security_capable
  2018-11-09  0:30                           ` Micah Morton
@ 2018-11-09 23:21                             ` mortonm
  2018-11-21 16:54                             ` [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
  1 sibling, 0 replies; 88+ messages in thread
From: mortonm @ 2018-11-09 23:21 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM). This generalization could also support
passing down the inode for CAP_DAC_OVERRIDE/CAP_DAC_READ_SEARCH checks so
that authorization could happen on a per-file basis rather than all or
nothing.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---

NOTE: Unlike the other patches in this thread, I haven't tested this one
yet beyond making sure it compiles. Sending it out as an RFC and a
follow-up to the earlier discussion.
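
For illustration, a userspace model of how a setid caller might use
the options struct. The struct and init helper mirror the diff below;
model_security_capable(), the two wrappers, MODEL_CAP_SETUID, the
restricted uid 1000 and the audit counter are assumptions made up for
this sketch. The patch itself does not yet set in_setid anywhere.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CAP_SETUID	7	/* value of CAP_SETUID */
#define MODEL_RESTRICTED_UID	1000	/* made-up restricted user */

struct security_capable_opts {
	/* If capable should audit the security request */
	bool log_audit_message;
	/* If capable is being called from a setid syscall */
	bool in_setid;
};

/* init a security_capable_opts struct with default values */
static inline void init_security_capable_opts(struct security_capable_opts *opts)
{
	opts->log_audit_message = true;
	opts->in_setid = false;
}

/* Counts denials that would have produced an audit record. */
int model_audit_records;

/* Model hook: returns 0 to allow, -EPERM to deny. */
int model_security_capable(unsigned int uid, int cap,
			   const struct security_capable_opts *opts)
{
	int ret = 0;

	assert(opts != NULL);

	/* A restricted uid may use CAP_SETUID only from a setid syscall. */
	if (cap == MODEL_CAP_SETUID && uid == MODEL_RESTRICTED_UID &&
	    !opts->in_setid)
		ret = -EPERM;

	if (ret != 0 && opts->log_audit_message)
		model_audit_records++;

	return ret;
}

/* Ordinary callers keep the defaults. */
bool model_ns_capable(unsigned int uid, int cap)
{
	struct security_capable_opts opts;

	init_security_capable_opts(&opts);
	return model_security_capable(uid, cap, &opts) == 0;
}

/* Hypothetical wrapper for the set*uid syscalls. */
bool model_ns_capable_setid(unsigned int uid, int cap)
{
	struct security_capable_opts opts;

	init_security_capable_opts(&opts);
	opts.in_setid = true;
	return model_security_capable(uid, cap, &opts) == 0;
}
```

Because every caller starts from init_security_capable_opts(), adding
a field later would not require touching callers that are happy with
the default.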

 include/linux/lsm_hooks.h     |  8 ++++---
 include/linux/security.h      | 35 ++++++++++++++++++++----------
 kernel/capability.c           | 41 +++++++++++++++++++++++++++--------
 kernel/seccomp.c              |  7 ++++--
 security/apparmor/lsm.c       |  4 ++--
 security/commoncap.c          | 27 ++++++++++++++++-------
 security/safesetid/lsm.c      |  2 +-
 security/security.c           | 14 +++++-------
 security/selinux/hooks.c      | 13 ++++++-----
 security/smack/smack_access.c |  5 ++++-
 10 files changed, 104 insertions(+), 52 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 8f1131c8dd54..69632ca425fa 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1266,7 +1266,7 @@
  *	@cred contains the credentials to use.
  *	@ns contains the user namespace we want the capability in
  *	@cap contains the capability <include/linux/capability.h>.
- *	@audit contains whether to write an audit message or not
+ *	@opts contains options for the capable check <include/linux/security.h>
  *	Return 0 if the capability is granted for @tsk.
  * @syslog:
  *	Check permission before accessing the kernel message ring or changing
@@ -1442,8 +1442,10 @@ union security_list_options {
 			const kernel_cap_t *effective,
 			const kernel_cap_t *inheritable,
 			const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns,
-			int cap, int audit);
+	int (*capable)(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			struct security_capable_opts *opts);
 	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
 	int (*quota_on)(struct dentry *dentry);
 	int (*syslog)(int type);
diff --git a/include/linux/security.h b/include/linux/security.h
index 63030c85ee19..8db2aeb21c91 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -58,6 +58,13 @@ struct mm_struct;
 #define SECURITY_CAP_NOAUDIT 0
 #define SECURITY_CAP_AUDIT 1
 
+struct security_capable_opts {
+	/* If capable should audit the security request */
+	bool log_audit_message;
+	/* If capable is being called from a setid syscall */
+	bool in_setid;
+};
+
 /* LSM Agnostic defines for sb_set_mnt_opts */
 #define SECURITY_LSM_NATIVE_LABELS	1
 
@@ -72,7 +79,7 @@ enum lsm_event {
 
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
-		       int cap, int audit);
+		       int cap, struct security_capable_opts *opts);
 extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
 extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
@@ -159,6 +166,13 @@ extern int mmap_min_addr_handler(struct ctl_table *table, int write,
 typedef int (*initxattrs) (struct inode *inode,
 			   const struct xattr *xattr_array, void *fs_data);
 
+/* init a security_capable_opts struct with default values */
+static inline void init_security_capable_opts(struct security_capable_opts *opts)
+{
+	opts->log_audit_message = true;
+	opts->in_setid = false;
+}
+
 #ifdef CONFIG_SECURITY
 
 struct security_mnt_opts {
@@ -212,10 +226,10 @@ int security_capset(struct cred *new, const struct cred *old,
 		    const kernel_cap_t *effective,
 		    const kernel_cap_t *inheritable,
 		    const kernel_cap_t *permitted);
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-			int cap);
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap);
+int security_capable(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			struct security_capable_opts *opts);
 int security_quotactl(int cmds, int type, int id, struct super_block *sb);
 int security_quota_on(struct dentry *dentry);
 int security_syslog(int type);
@@ -470,14 +484,11 @@ static inline int security_capset(struct cred *new,
 }
 
 static inline int security_capable(const struct cred *cred,
-				   struct user_namespace *ns, int cap)
+				   struct user_namespace *ns,
+				   int cap,
+				   struct security_capable_opts *opts)
 {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-static inline int security_capable_noaudit(const struct cred *cred,
-					   struct user_namespace *ns, int cap) {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return cap_capable(cred, ns, cap, opts);
 }
 
 static inline int security_quotactl(int cmds, int type, int id,
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..d8ff27e6e7c4 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -297,9 +297,12 @@ bool has_ns_capability(struct task_struct *t,
 		       struct user_namespace *ns, int cap)
 {
 	int ret;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
 
 	rcu_read_lock();
-	ret = security_capable(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, &opts);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -338,9 +341,13 @@ bool has_ns_capability_noaudit(struct task_struct *t,
 			       struct user_namespace *ns, int cap)
 {
 	int ret;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
 
 	rcu_read_lock();
-	ret = security_capable_noaudit(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, &opts);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -363,7 +370,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
 	return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
-static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+static bool ns_capable_common(struct user_namespace *ns,
+			      int cap,
+			      struct security_capable_opts *opts)
 {
 	int capable;
 
@@ -372,8 +381,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 		BUG();
 	}
 
-	capable = audit ? security_capable(current_cred(), ns, cap) :
-			  security_capable_noaudit(current_cred(), ns, cap);
+	capable = security_capable(current_cred(), ns, cap, opts);
 	if (capable == 0) {
 		current->flags |= PF_SUPERPRIV;
 		return true;
@@ -394,7 +402,10 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, true);
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	return ns_capable_common(ns, cap, &opts);
 }
 EXPORT_SYMBOL(ns_capable);
 
@@ -412,7 +423,11 @@ EXPORT_SYMBOL(ns_capable);
  */
 bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, false);
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+	return ns_capable_common(ns, cap, &opts);
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
@@ -448,10 +463,13 @@ EXPORT_SYMBOL(capable);
 bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 		     int cap)
 {
+	struct security_capable_opts opts;
+
 	if (WARN_ON_ONCE(!cap_valid(cap)))
 		return false;
 
-	if (security_capable(file->f_cred, ns, cap) == 0)
+	init_security_capable_opts(&opts);
+	if (security_capable(file->f_cred, ns, cap, &opts) == 0)
 		return true;
 
 	return false;
@@ -500,10 +518,15 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 {
 	int ret = 0;  /* An absent tracer adds no restrictions */
 	const struct cred *cred;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+
 	rcu_read_lock();
 	cred = rcu_dereference(tsk->ptracer_cred);
 	if (cred)
-		ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
+		ret = security_capable(cred, ns, CAP_SYS_PTRACE, &opts);
 	rcu_read_unlock();
 	return (ret == 0);
 }
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index fd023ac24e10..4b3c543a3cc1 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -370,12 +370,15 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	struct seccomp_filter *sfilter;
 	int ret;
 	const bool save_orig = IS_ENABLED(CONFIG_CHECKPOINT_RESTORE);
+	struct security_capable_opts opts;
 
 	if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
 		return ERR_PTR(-EINVAL);
 
 	BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
 
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
 	/*
 	 * Installing a seccomp filter requires that the task has
 	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
@@ -383,8 +386,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	 * behavior of privileged children.
 	 */
 	if (!task_no_new_privs(current) &&
-	    security_capable_noaudit(current_cred(), current_user_ns(),
-				     CAP_SYS_ADMIN) != 0)
+	    security_capable(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN, &opts) != 0)
 		return ERR_PTR(-EACCES);
 
 	/* Allocate a new seccomp_filter */
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 74f17376202b..040dcdcce9a7 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -174,14 +174,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
 }
 
 static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
-			    int cap, int audit)
+			    int cap, struct security_capable_opts *opts)
 {
 	struct aa_label *label;
 	int error = 0;
 
 	label = aa_get_newest_cred_label(cred);
 	if (!unconfined(label))
-		error = aa_capable(label, cap, audit);
+		error = aa_capable(label, cap, opts->log_audit_message);
 	aa_put_label(label);
 
 	return error;
diff --git a/security/commoncap.c b/security/commoncap.c
index f4c33abd9959..a66ad899b0d7 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -69,7 +69,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
  * kernel's capable() and has_capability() returns 1 for this case.
  */
 int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
-		int cap, int audit)
+		int cap, struct security_capable_opts *opts)
 {
 	struct user_namespace *ns = targ_ns;
 
@@ -223,12 +223,14 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
  */
 static inline int cap_inh_is_capped(void)
 {
+	struct security_capable_opts opts;
 
+	init_security_capable_opts(&opts);
 	/* they are so limited unless the current task has the CAP_SETPCAP
 	 * capability
 	 */
 	if (cap_capable(current_cred(), current_cred()->user_ns,
-			CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
+			CAP_SETPCAP, &opts) == 0)
 		return 0;
 	return 1;
 }
@@ -1177,6 +1179,7 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 {
 	const struct cred *old = current_cred();
 	struct cred *new;
+	struct security_capable_opts opts;
 
 	switch (option) {
 	case PR_CAPBSET_READ:
@@ -1207,13 +1210,15 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 	 * capability-based-privilege environment.
 	 */
 	case PR_SET_SECUREBITS:
+		init_security_capable_opts(&opts);
 		if ((((old->securebits & SECURE_ALL_LOCKS) >> 1)
 		     & (old->securebits ^ arg2))			/*[1]*/
 		    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))	/*[2]*/
 		    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))	/*[3]*/
 		    || (cap_capable(current_cred(),
-				    current_cred()->user_ns, CAP_SETPCAP,
-				    SECURITY_CAP_AUDIT) != 0)		/*[4]*/
+				    current_cred()->user_ns,
+				    CAP_SETPCAP,
+				    &opts) != 0)			/*[4]*/
 			/*
 			 * [1] no changing of bits that are locked
 			 * [2] no unlocking of locks
@@ -1307,10 +1312,14 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 {
 	int cap_sys_admin = 0;
+	struct security_capable_opts opts;
 
-	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
-			SECURITY_CAP_NOAUDIT) == 0)
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+
+	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN, &opts) == 0)
 		cap_sys_admin = 1;
+
 	return cap_sys_admin;
 }
 
@@ -1326,10 +1335,12 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 int cap_mmap_addr(unsigned long addr)
 {
 	int ret = 0;
+	struct security_capable_opts opts;
 
+	init_security_capable_opts(&opts);
 	if (addr < dac_mmap_min_addr) {
-		ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
-				  SECURITY_CAP_AUDIT);
+		ret = cap_capable(current_cred(), &init_user_ns,
+						CAP_SYS_RAWIO, &opts);
 		/* set PF_SUPERPRIV if it turns out we allow the low mmap */
 		if (ret == 0)
 			current->flags |= PF_SUPERPRIV;
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
index 32040f8db7ce..6ad109658697 100644
--- a/security/safesetid/lsm.c
+++ b/security/safesetid/lsm.c
@@ -146,7 +146,7 @@ static bool setuid_syscall(int num)
 static int safesetid_security_capable(const struct cred *cred,
 				      struct user_namespace *ns,
 				      int cap,
-				      int audit)
+				      struct security_capable_opts *opts)
 {
 	/* The current->mm check will fail if this is a kernel thread. */
 	if (cap == CAP_SETUID &&
diff --git a/security/security.c b/security/security.c
index 68f46d849abe..526dae29fd37 100644
--- a/security/security.c
+++ b/security/security.c
@@ -278,16 +278,12 @@ int security_capset(struct cred *new, const struct cred *old,
 				effective, inheritable, permitted);
 }
 
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-		     int cap)
+int security_capable(const struct cred *cred,
+		     struct user_namespace *ns,
+		     int cap,
+		     struct security_capable_opts *opts)
 {
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap)
-{
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return call_int_hook(capable, 0, cred, ns, cap, opts);
 }
 
 int security_quotactl(int cmds, int type, int id, struct super_block *sb)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 2b5ee5fbd652..ed409cd70822 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2313,9 +2313,10 @@ static int selinux_capset(struct cred *new, const struct cred *old,
  */
 
 static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
-			   int cap, int audit)
+			   int cap, struct security_capable_opts *opts)
 {
-	return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
+	return cred_has_capability(cred, cap, opts->log_audit_message,
+							ns == &init_user_ns);
 }
 
 static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
@@ -3242,11 +3243,13 @@ static int selinux_inode_getattr(const struct path *path)
 static bool has_cap_mac_admin(bool audit)
 {
 	const struct cred *cred = current_cred();
-	int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
+	struct security_capable_opts opts;
 
-	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = audit;
+	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, &opts))
 		return false;
-	if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
+	if (cred_has_capability(cred, CAP_MAC_ADMIN, opts.log_audit_message, true))
 		return false;
 	return true;
 }
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9a4c0ad46518..eca364b697d7 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -639,8 +639,11 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
 	struct smack_known *skp = tsp->smk_task;
 	struct smack_known_list_elem *sklep;
 	int rc;
+	struct security_capable_opts opts;
 
-	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
+	init_security_capable_opts(&opts);
+
+	rc = cap_capable(cred, &init_user_ns, cap, &opts);
 	if (rc)
 		return false;
 
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH] [PATCH] LSM: generalize flag passing to security_capable
  2018-11-02 18:07 ` [PATCH] " Stephen Smalley
  2018-11-02 19:13   ` Micah Morton
@ 2018-11-19 18:54   ` mortonm
  2018-12-13 22:29     ` Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: mortonm @ 2018-11-19 18:54 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM). This generalization could also support
passing down the inode for CAP_DAC_OVERRIDE/CAP_DAC_READ_SEARCH checks so
that authorization could happen on a per-file basis rather than all or
nothing.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---

Developed against the 'next-general' branch.

@Stephen: is this the approach you had in mind for modifying the
callers of ns_capable?

 include/linux/lsm_hooks.h     |  8 ++++---
 include/linux/security.h      | 35 ++++++++++++++++++++----------
 kernel/capability.c           | 41 +++++++++++++++++++++++++++--------
 kernel/seccomp.c              |  7 ++++--
 security/apparmor/lsm.c       |  4 ++--
 security/commoncap.c          | 27 ++++++++++++++++-------
 security/security.c           | 14 +++++-------
 security/selinux/hooks.c      | 13 ++++++-----
 security/smack/smack_access.c |  5 ++++-
 9 files changed, 103 insertions(+), 51 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index aaeb7fa24dc4..02422592cc83 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1270,7 +1270,7 @@
  *	@cred contains the credentials to use.
  *	@ns contains the user namespace we want the capability in
  *	@cap contains the capability <include/linux/capability.h>.
- *	@audit contains whether to write an audit message or not
+ *	@opts contains options for the capable check <include/linux/security.h>
  *	Return 0 if the capability is granted for @tsk.
  * @syslog:
  *	Check permission before accessing the kernel message ring or changing
@@ -1446,8 +1446,10 @@ union security_list_options {
 			const kernel_cap_t *effective,
 			const kernel_cap_t *inheritable,
 			const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns,
-			int cap, int audit);
+	int (*capable)(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			struct security_capable_opts *opts);
 	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
 	int (*quota_on)(struct dentry *dentry);
 	int (*syslog)(int type);
diff --git a/include/linux/security.h b/include/linux/security.h
index d170a5b031f3..b60621e93faf 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -58,6 +58,13 @@ struct mm_struct;
 #define SECURITY_CAP_NOAUDIT 0
 #define SECURITY_CAP_AUDIT 1
 
+struct security_capable_opts {
+	/* If capable should audit the security request */
+	bool log_audit_message;
+	/* If capable is being called from a setid syscall */
+	bool in_setid;
+};
+
 /* LSM Agnostic defines for sb_set_mnt_opts */
 #define SECURITY_LSM_NATIVE_LABELS	1
 
@@ -72,7 +79,7 @@ enum lsm_event {
 
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
-		       int cap, int audit);
+		       int cap, struct security_capable_opts *opts);
 extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
 extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
@@ -180,6 +187,13 @@ static inline const char *kernel_load_data_id_str(enum kernel_load_data_id id)
 	return kernel_load_data_str[id];
 }
 
+/* init a security_capable_opts struct with default values */
+static inline void init_security_capable_opts(struct security_capable_opts *opts)
+{
+	opts->log_audit_message = true;
+	opts->in_setid = false;
+}
+
 #ifdef CONFIG_SECURITY
 
 struct security_mnt_opts {
@@ -233,10 +247,10 @@ int security_capset(struct cred *new, const struct cred *old,
 		    const kernel_cap_t *effective,
 		    const kernel_cap_t *inheritable,
 		    const kernel_cap_t *permitted);
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-			int cap);
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap);
+int security_capable(const struct cred *cred,
+		       struct user_namespace *ns,
+		       int cap,
+		       struct security_capable_opts *opts);
 int security_quotactl(int cmds, int type, int id, struct super_block *sb);
 int security_quota_on(struct dentry *dentry);
 int security_syslog(int type);
@@ -492,14 +506,11 @@ static inline int security_capset(struct cred *new,
 }
 
 static inline int security_capable(const struct cred *cred,
-				   struct user_namespace *ns, int cap)
+				   struct user_namespace *ns,
+				   int cap,
+				   struct security_capable_opts *opts)
 {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-static inline int security_capable_noaudit(const struct cred *cred,
-					   struct user_namespace *ns, int cap) {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return cap_capable(cred, ns, cap, opts);
 }
 
 static inline int security_quotactl(int cmds, int type, int id,
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..d8ff27e6e7c4 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -297,9 +297,12 @@ bool has_ns_capability(struct task_struct *t,
 		       struct user_namespace *ns, int cap)
 {
 	int ret;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
 
 	rcu_read_lock();
-	ret = security_capable(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, &opts);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -338,9 +341,13 @@ bool has_ns_capability_noaudit(struct task_struct *t,
 			       struct user_namespace *ns, int cap)
 {
 	int ret;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
 
 	rcu_read_lock();
-	ret = security_capable_noaudit(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, &opts);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -363,7 +370,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
 	return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
-static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+static bool ns_capable_common(struct user_namespace *ns,
+			      int cap,
+			      struct security_capable_opts *opts)
 {
 	int capable;
 
@@ -372,8 +381,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 		BUG();
 	}
 
-	capable = audit ? security_capable(current_cred(), ns, cap) :
-			  security_capable_noaudit(current_cred(), ns, cap);
+	capable = security_capable(current_cred(), ns, cap, opts);
 	if (capable == 0) {
 		current->flags |= PF_SUPERPRIV;
 		return true;
@@ -394,7 +402,10 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, true);
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	return ns_capable_common(ns, cap, &opts);
 }
 EXPORT_SYMBOL(ns_capable);
 
@@ -412,7 +423,11 @@ EXPORT_SYMBOL(ns_capable);
  */
 bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, false);
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+	return ns_capable_common(ns, cap, &opts);
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
@@ -448,10 +463,13 @@ EXPORT_SYMBOL(capable);
 bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 		     int cap)
 {
+	struct security_capable_opts opts;
+
 	if (WARN_ON_ONCE(!cap_valid(cap)))
 		return false;
 
-	if (security_capable(file->f_cred, ns, cap) == 0)
+	init_security_capable_opts(&opts);
+	if (security_capable(file->f_cred, ns, cap, &opts) == 0)
 		return true;
 
 	return false;
@@ -500,10 +518,15 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 {
 	int ret = 0;  /* An absent tracer adds no restrictions */
 	const struct cred *cred;
+	struct security_capable_opts opts;
+
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+
 	rcu_read_lock();
 	cred = rcu_dereference(tsk->ptracer_cred);
 	if (cred)
-		ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
+		ret = security_capable(cred, ns, CAP_SYS_PTRACE, &opts);
 	rcu_read_unlock();
 	return (ret == 0);
 }
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f2ae2324c232..eed0e34c1bc2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -370,12 +370,15 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	struct seccomp_filter *sfilter;
 	int ret;
 	const bool save_orig = IS_ENABLED(CONFIG_CHECKPOINT_RESTORE);
+	struct security_capable_opts opts;
 
 	if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
 		return ERR_PTR(-EINVAL);
 
 	BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
 
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
 	/*
 	 * Installing a seccomp filter requires that the task has
 	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
@@ -383,8 +386,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	 * behavior of privileged children.
 	 */
 	if (!task_no_new_privs(current) &&
-	    security_capable_noaudit(current_cred(), current_user_ns(),
-				     CAP_SYS_ADMIN) != 0)
+	    security_capable(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN, &opts) != 0)
 		return ERR_PTR(-EACCES);
 
 	/* Allocate a new seccomp_filter */
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 42446a216f3b..3be87dfd5e57 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
 }
 
 static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
-			    int cap, int audit)
+			    int cap, struct security_capable_opts *opts)
 {
 	struct aa_label *label;
 	int error = 0;
 
 	label = aa_get_newest_cred_label(cred);
 	if (!unconfined(label))
-		error = aa_capable(label, cap, audit);
+		error = aa_capable(label, cap, opts->log_audit_message);
 	aa_put_label(label);
 
 	return error;
diff --git a/security/commoncap.c b/security/commoncap.c
index 18a4fdf6f6eb..93fbb0dd70d6 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -69,7 +69,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
  * kernel's capable() and has_capability() returns 1 for this case.
  */
 int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
-		int cap, int audit)
+		int cap, struct security_capable_opts *opts)
 {
 	struct user_namespace *ns = targ_ns;
 
@@ -223,12 +223,14 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
  */
 static inline int cap_inh_is_capped(void)
 {
+	struct security_capable_opts opts;
 
+	init_security_capable_opts(&opts);
 	/* they are so limited unless the current task has the CAP_SETPCAP
 	 * capability
 	 */
 	if (cap_capable(current_cred(), current_cred()->user_ns,
-			CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
+			CAP_SETPCAP, &opts) == 0)
 		return 0;
 	return 1;
 }
@@ -1174,6 +1176,7 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 {
 	const struct cred *old = current_cred();
 	struct cred *new;
+	struct security_capable_opts opts;
 
 	switch (option) {
 	case PR_CAPBSET_READ:
@@ -1204,13 +1207,15 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 	 * capability-based-privilege environment.
 	 */
 	case PR_SET_SECUREBITS:
+		init_security_capable_opts(&opts);
 		if ((((old->securebits & SECURE_ALL_LOCKS) >> 1)
 		     & (old->securebits ^ arg2))			/*[1]*/
 		    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))	/*[2]*/
 		    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))	/*[3]*/
 		    || (cap_capable(current_cred(),
-				    current_cred()->user_ns, CAP_SETPCAP,
-				    SECURITY_CAP_AUDIT) != 0)		/*[4]*/
+				    current_cred()->user_ns,
+				    CAP_SETPCAP,
+				    &opts) != 0)			/*[4]*/
 			/*
 			 * [1] no changing of bits that are locked
 			 * [2] no unlocking of locks
@@ -1304,10 +1309,14 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 {
 	int cap_sys_admin = 0;
+	struct security_capable_opts opts;
 
-	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
-			SECURITY_CAP_NOAUDIT) == 0)
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = false;
+
+	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN, &opts) == 0)
 		cap_sys_admin = 1;
+
 	return cap_sys_admin;
 }
 
@@ -1323,10 +1332,12 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 int cap_mmap_addr(unsigned long addr)
 {
 	int ret = 0;
+	struct security_capable_opts opts;
 
+	init_security_capable_opts(&opts);
 	if (addr < dac_mmap_min_addr) {
-		ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
-				  SECURITY_CAP_AUDIT);
+		ret = cap_capable(current_cred(), &init_user_ns,
+						CAP_SYS_RAWIO, &opts);
 		/* set PF_SUPERPRIV if it turns out we allow the low mmap */
 		if (ret == 0)
 			current->flags |= PF_SUPERPRIV;
diff --git a/security/security.c b/security/security.c
index 04d173eb93f6..bbc400a90c34 100644
--- a/security/security.c
+++ b/security/security.c
@@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
 				effective, inheritable, permitted);
 }
 
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-		     int cap)
+int security_capable(const struct cred *cred,
+		     struct user_namespace *ns,
+		     int cap,
+		     struct security_capable_opts *opts)
 {
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap)
-{
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return call_int_hook(capable, 0, cred, ns, cap, opts);
 }
 
 int security_quotactl(int cmds, int type, int id, struct super_block *sb)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 7ce683259357..ebd36adc8856 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2316,9 +2316,10 @@ static int selinux_capset(struct cred *new, const struct cred *old,
  */
 
 static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
-			   int cap, int audit)
+			   int cap, struct security_capable_opts *opts)
 {
-	return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
+	return cred_has_capability(cred, cap, opts->log_audit_message,
+							ns == &init_user_ns);
 }
 
 static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
@@ -3245,11 +3246,13 @@ static int selinux_inode_getattr(const struct path *path)
 static bool has_cap_mac_admin(bool audit)
 {
 	const struct cred *cred = current_cred();
-	int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
+	struct security_capable_opts opts;
 
-	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
+	init_security_capable_opts(&opts);
+	opts.log_audit_message = audit;
+	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, &opts))
 		return false;
-	if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
+	if (cred_has_capability(cred, CAP_MAC_ADMIN, opts.log_audit_message, true))
 		return false;
 	return true;
 }
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9a4c0ad46518..eca364b697d7 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -639,8 +639,11 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
 	struct smack_known *skp = tsp->smk_task;
 	struct smack_known_list_elem *sklep;
 	int rc;
+	struct security_capable_opts opts;
 
-	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
+	init_security_capable_opts(&opts);
+
+	rc = cap_capable(cred, &init_user_ns, cap, &opts);
 	if (rc)
 		return false;
 
-- 
2.19.1.1215.g8438c0b245-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-09  0:30                           ` Micah Morton
  2018-11-09 23:21                             ` [PATCH] LSM: generalize flag passing to security_capable mortonm
@ 2018-11-21 16:54                             ` mortonm
  2018-12-06  0:08                               ` Kees Cook
  1 sibling, 1 reply; 88+ messages in thread
From: mortonm @ 2018-11-21 16:54 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="y", Size: 29774 bytes --]

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---

Sending a patch developed against the 'next-general' branch of the
linux-security tree, since the previous patch versions wouldn't apply
cleanly to 'next-general'.
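
For anyone wanting to try this out, a minimal userspace sketch of adding a policy entry through securityfs. The '<UID>:<UID>' format and the safesetid/add_whitelist_policy file name come from the documentation in this patch; the helper names format_policy() and add_whitelist_policy() are hypothetical, and the securityfs mount point is passed in by the caller (commonly /sys/kernel/security, but that is an assumption about the system).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Format a policy entry as "<parent UID>:<child UID>", e.g. "123:456".
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_policy(char *buf, size_t len,
			 unsigned int parent, unsigned int child)
{
	int n = snprintf(buf, len, "%u:%u", parent, child);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Write one policy entry to <securityfs_root>/safesetid/add_whitelist_policy.
 * Returns 0 on success, -1 on any failure (bad path, open or write error).
 */
static int add_whitelist_policy(const char *securityfs_root,
				unsigned int parent, unsigned int child)
{
	char path[256];
	char policy[32];
	ssize_t written;
	int len, n, fd;

	len = format_policy(policy, sizeof(policy), parent, child);
	if (len < 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/safesetid/add_whitelist_policy",
		     securityfs_root);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, policy, (size_t)len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}
```

A caller would invoke something like add_whitelist_policy("/sys/kernel/security", 123, 456) to allow UID 123 to transition to UID 456.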

 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 arch/Kconfig                                |   5 +
 arch/arm/Kconfig                            |   1 +
 arch/arm64/Kconfig                          |   1 +
 arch/x86/Kconfig                            |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  13 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 342 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 ++
 security/safesetid/securityfs.c             | 189 +++++++++++
 13 files changed, 700 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, because using file capabilities to run a program with elevated
+privileges opens up possible security holes: any user with access to the file
+can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation also causes problems with
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index c980dfe9abf1..a0c387649e12 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -39,3 +39,4 @@ the one "major" module (e.g. SELinux) if there is one configured.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/arch/Kconfig b/arch/Kconfig
index e1e540ffa979..510575004ecb 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -398,6 +398,11 @@ config ARCH_WEAK_RELEASE_ACQUIRE
 config ARCH_WANT_IPC_PARSE_VERSION
 	bool
 
+config HAVE_SAFESETID
+	bool
+	help
+	  This option enables the SafeSetID LSM.
+
 config ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	bool
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 91be74d8df65..0f843fa68980 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -92,6 +92,7 @@ config ARM
 	select HAVE_RCU_TABLE_FREE if (SMP && ARM_LPAE)
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RSEQ
+	select HAVE_SAFESETID
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UID16
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 787d7850e064..3d69949da5ae 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -147,6 +147,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_SAFESETID
 	select HAVE_RCU_TABLE_INVALIDATE
 	select HAVE_RSEQ
 	select HAVE_STACKPROTECTOR
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9d734f3c8234..978ab49e0e1c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86_64
 	select ARCH_SUPPORTS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
+	select HAVE_SAFESETID
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE
 	select SWIOTLB
diff --git a/security/Kconfig b/security/Kconfig
index d9aa521b5206..d80a663d2753 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -236,6 +236,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/safesetid/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..c598b904938f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..4ff82c7ed273
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,13 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        depends on HAVE_SAFESETID
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..6ad109658697
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,342 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+/*
+ * TODO: Figuring out whether the current syscall number (saved on the kernel
+ * stack) is one of the set*uid syscalls is an operation that requires checking
+ * the number against arch-specific constants as seen below. The need for this
+ * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
+ * implement an arch-specific function that gets called from this file and
+ * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
+ * selected for architectures that implement the function? Any other ideas?
+ */
+static bool setuid_syscall(int num)
+{
+#ifdef CONFIG_X86_64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_ia32_setreuid32 ||
+	      num == __NR_ia32_setuid ||
+	      num == __NR_ia32_setresuid ||
+	      num == __NR_ia32_setresuid32 ||
+	      num == __NR_ia32_setuid32))
+		return false;
+#else
+	if (!(num == __NR_setreuid ||
+	      num == __NR_setuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setfsuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM64
+#ifdef CONFIG_COMPAT
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid ||
+	      num == __NR_setreuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setfsuid32 ||
+	      num == __NR_compat_setuid ||
+	      num == __NR_compat_setreuid ||
+	      num == __NR_compat_setfsuid ||
+	      num == __NR_compat_setresuid ||
+	      num == __NR_compat_setreuid32 ||
+	      num == __NR_compat_setresuid32 ||
+	      num == __NR_compat_setuid32 ||
+	      num == __NR_compat_setfsuid32))
+		return false;
+#else
+	if (!(num == __NR_setuid ||
+	      num == __NR_setreuid ||
+	      num == __NR_setfsuid ||
+	      num == __NR_setresuid))
+		return false;
+#endif /* CONFIG_COMPAT */
+#elif defined CONFIG_ARM
+	if (!(num == __NR_setreuid32 ||
+	      num == __NR_setuid32 ||
+	      num == __NR_setresuid32 ||
+	      num == __NR_setfsuid32))
+		return false;
+#else
+	BUILD_BUG();
+#endif
+	return true;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      struct security_capable_opts *opts)
+{
+	/* The current->mm check will fail if this is a kernel thread. */
+	if (cap == CAP_SETUID &&
+	    current->mm &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		/*
+		 * syscall_get_nr can theoretically return 0 or -1, but that
+		 * would signify that the syscall is being aborted due to a
+		 * signal, so we don't need to check for this case here.
+		 */
+		if (!(setuid_syscall(syscall_get_nr(current,
+						    current_pt_regs()))))
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			return -1;
+	}
+	return 0;
+}
+
+static void setuid_policy_warning(kuid_t parent, kuid_t child)
+{
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	do_exit(SIGKILL);
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	setuid_policy_warning(parent, child);
+	return -1;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+security_initcall(safesetid_security_init);
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..ff5fcf2c1b37
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NUL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
+		flush_safesetid_whitelist_entries();
+		return len;
+	}
+
+	/*
+	 * If we get to here, must be the case that file_entry->type equals
+	 * SAFESETID_WHITELIST_ADD
+	 */
+	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+							 &child);
+	if (ret)
+		return ret;
+
+	ret = add_safesetid_whitelist_entry(parent, child);
+	if (ret)
+		return ret;
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.19.1.1215.g8438c0b245-goog
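
For illustration, a minimal userspace sketch of how a policy entry in the
"<UID>:<UID>" format parsed above might be written to the new securityfs
file. The mount point /sys/kernel/security is an assumption (the patch only
fixes the "safesetid" directory and "add_whitelist_policy" file names), and
format_policy() and add_whitelist_policy() are hypothetical helper names,
not part of the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed path: securityfs is conventionally mounted at /sys/kernel/security. */
#define SAFESETID_ADD_PATH \
	"/sys/kernel/security/safesetid/add_whitelist_policy"

/* Format one "<parent UID>:<child UID>" entry; returns its length or -1. */
static int format_policy(char *buf, size_t size, unsigned int parent,
			 unsigned int child)
{
	int n = snprintf(buf, size, "%u:%u", parent, child);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write a single entry to 'path'; returns 0 on success, -errno on failure. */
static int add_whitelist_policy(const char *path, unsigned int parent,
				unsigned int child)
{
	char buf[32];
	int len = format_policy(buf, sizeof(buf), parent, child);
	ssize_t written;
	int err = 0;
	int fd;

	if (len < 0)
		return -EINVAL;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	/* One write() per entry: the handler rejects a non-zero file offset. */
	written = write(fd, buf, (size_t)len);
	if (written < 0)
		err = -errno;
	else if (written != len)
		err = -EIO;
	close(fd);
	return err;
}
```

A caller holding CAP_SYS_ADMIN could then run
add_whitelist_policy(SAFESETID_ADD_PATH, 1000, 2000) to allow UID 1000 to
switch to UID 2000; each entry needs its own open/write/close cycle.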



* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-11-21 16:54                             ` [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
@ 2018-12-06  0:08                               ` Kees Cook
  2018-12-06 17:51                                 ` Micah Morton
  2019-01-11 17:13                                 ` [PATCH v2] " mortonm
  0 siblings, 2 replies; 88+ messages in thread
From: Kees Cook @ 2018-12-06  0:08 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Wed, Nov 21, 2018 at 8:54 AM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
>
> Sending a patch developed against the 'next-general' branch of the
> linux-security tree, since the previous patch versions wouldn't apply
> cleanly to 'next-general'.

I'm finally getting back around to this. Sorry for the delay!

A few general process notes:
- Please "version" your patches in the Subject (e.g. "[PATCH v3] LSM:
add SafeSetID ..."). This helps track discussion.
- Please include a "changes since last version" below the first "---"
line, to summarize what has changed. This makes review faster for
people that have read a specific version but need to catch up (like
me) :)

> +/*
> + * TODO: Figuring out whether the current syscall number (saved on the kernel
> + * stack) is one of the set*uid syscalls is an operation that requires checking
> + * the number against arch-specific constants as seen below. The need for this
> + * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
> + * implement an arch-specific function that gets called from this file and
> + * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
> + * selected for architectures that implement the function? Any other ideas?
> + */

What would Stephen's solution for this problem end up looking like? I
think avoiding the arch-specific-ness would be quite valuable.

I think adding a capability for this isn't the way to go (there is a
very painful history on adding capabilities). This feels much more
like a good mapping to an LSM (it's narrowing a privilege) with a very
specific policy.

-- 
Kees Cook


* Re: [PATCH] LSM: add SafeSetID module that gates setid calls
  2018-12-06  0:08                               ` Kees Cook
@ 2018-12-06 17:51                                 ` Micah Morton
  2019-01-11 17:13                                 ` [PATCH v2] " mortonm
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-12-06 17:51 UTC (permalink / raw)
  To: Kees Cook; +Cc: jmorris, serge, casey, sds, linux-security-module

On Wed, Dec 5, 2018 at 4:08 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Wed, Nov 21, 2018 at 8:54 AM <mortonm@chromium.org> wrote:
> >
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> >
> > Sending a patch developed against the 'next-general' branch of the
> > linux-security tree, since the previous patch versions wouldn't apply
> > cleanly to 'next-general'.
>
> I'm finally getting back around to this. Sorry for the delay!
>
> A few general process notes:
> - Please "version" your patches in the Subject (e.g. "[PATCH v3] LSM:
> add SafeSetID ..."). This helps track discussion.
> - Please include a "changes since last version" below the first "---"
> line, to summarize what has changed. This makes review faster for
> people that have read a specific version but need to catch up (like
> me) :)

Ok thanks, will do in the future. The only code change since the
initial upload was to add a do_exit(SIGKILL) line to
setuid_policy_warning() in lsm.c, which will kill any process that
violates the whitelist policy. This way, there can never be a case
where a privileged program fails to drop privilege because of our
whitelist and continues running in an accidentally over-privileged
context.
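
For illustration, the failure mode described above is the userspace pattern
where a privilege drop fails and the program keeps running anyway. A minimal
sketch of the defensive counterpart, assuming Linux's setresuid()/getresuid();
drop_to_uid() and drop_to_uid_or_die() are hypothetical helper names:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Drop the real, effective and saved UIDs to 'target' and confirm that the
 * change took effect. Returns 0 on success, -1 otherwise.
 */
static int drop_to_uid(uid_t target)
{
	uid_t ruid, euid, suid;

	/* (uid_t)-1 means "leave unchanged" to setresuid(), so reject it. */
	if (target == (uid_t)-1)
		return -1;
	if (setresuid(target, target, target) != 0)
		return -1;
	if (getresuid(&ruid, &euid, &suid) != 0)
		return -1;
	return (ruid == target && euid == target && suid == target) ? 0 : -1;
}

/* Abort rather than continue with more privilege than intended. */
static void drop_to_uid_or_die(uid_t target)
{
	if (drop_to_uid(target) != 0) {
		perror("drop_to_uid");
		abort();
	}
}
```

A program written this way never continues after a failed drop. The
do_exit(SIGKILL) in the LSM enforces the same outcome for programs that do
not check the return value.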

>
> > +/*
> > + * TODO: Figuring out whether the current syscall number (saved on the kernel
> > + * stack) is one of the set*uid syscalls is an operation that requires checking
> > + * the number against arch-specific constants as seen below. The need for this
> > + * LSM to know about arch-specific syscall stuff is not ideal. Is it better to
> > + * implement an arch-specific function that gets called from this file and
> > + * update arch/Kconfig to mention that the HAVE_SAFESETID symbol should only be
> > + * selected for architectures that implement the function? Any other ideas?
> > + */
>
> What would Stephen's solution for this problem end up looking like? I
> think avoiding the arch-specific-ness would be quite valuable.

I sent a patch here that is an example of how it could be done:
https://www.spinics.net/lists/linux-security-module/msg24504.html.
AFAICT this is what Stephen had in mind.

>
> I think adding a capability for this isn't the way to go (there is a
> very painful history on adding capabilities). This feels much more
> like a good mapping to an LSM (it's narrowing a privilege) with a very
> specific policy.
>
> --
> Kees Cook


* Re: [PATCH] [PATCH] LSM: generalize flag passing to security_capable
  2018-11-19 18:54   ` [PATCH] [PATCH] LSM: generalize flag passing to security_capable mortonm
@ 2018-12-13 22:29     ` Micah Morton
  2018-12-13 23:09       ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2018-12-13 22:29 UTC (permalink / raw)
  To: jmorris, serge, Kees Cook, casey, sds, linux-security-module

Any comments on this patch? If not, could it get merged at some point?

I booted a Linux kernel with the changes compiled in and verified with
print statements that the code works properly AFAICT.
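
For illustration, the options-struct pattern from the quoted patch can be
modeled in userspace as follows. This is a sketch only: the struct and its
init helper mirror the patch, while model_capable(), MODEL_CAP_SETUID and
audit_count are hypothetical stand-ins for an LSM hook that consumes the
in_setid flag.

```c
#include <stdbool.h>

/* Userspace stand-in for the struct added to include/linux/security.h. */
struct security_capable_opts {
	bool log_audit_message;	/* audit this capability check? */
	bool in_setid;		/* called from a set*id syscall? */
};

/* Same defaults as the patch: audit on, not inside a setid call. */
static void init_security_capable_opts(struct security_capable_opts *opts)
{
	opts->log_audit_message = true;
	opts->in_setid = false;
}

#define MODEL_CAP_SETUID 7	/* value of CAP_SETUID in linux/capability.h */

static int audit_count;		/* counts checks that asked for auditing */

/* Hypothetical hook: honour CAP_SETUID only from within a set*uid() call. */
static int model_capable(int cap, const struct security_capable_opts *opts)
{
	if (opts->log_audit_message)
		audit_count++;
	if (cap == MODEL_CAP_SETUID && !opts->in_setid)
		return -1;
	return 0;
}
```

Callers take the defaults from init_security_capable_opts() and override
only the fields they care about, so adding a new field later does not
require touching every call site's arguments.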

On Mon, Nov 19, 2018 at 10:54 AM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This patch provides a general mechanism for passing flags to the
> security_capable LSM hook. It replaces the specific 'audit' flag that is
> used to tell security_capable whether it should log an audit message for
> the given capability check. The reason for generalizing this flag
> passing is so we can add an additional flag that signifies whether
> security_capable is being called by a setid syscall (which is needed by
> the proposed SafeSetID LSM). This generalization could also support
> passing down the inode for CAP_DAC_OVERRIDE/READ_SEARCH checks so that
> authorization could happen on a per-file basis for specific files rather
> than all or nothing.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
>
> Developed against the 'next-general' branch.
>
> @Stephen: is this the approach you had in mind for modifying the
> callers of ns_capable?
>
>  include/linux/lsm_hooks.h     |  8 ++++---
>  include/linux/security.h      | 35 ++++++++++++++++++++----------
>  kernel/capability.c           | 41 +++++++++++++++++++++++++++--------
>  kernel/seccomp.c              |  7 ++++--
>  security/apparmor/lsm.c       |  4 ++--
>  security/commoncap.c          | 27 ++++++++++++++++-------
>  security/security.c           | 14 +++++-------
>  security/selinux/hooks.c      | 13 ++++++-----
>  security/smack/smack_access.c |  5 ++++-
>  9 files changed, 103 insertions(+), 51 deletions(-)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index aaeb7fa24dc4..02422592cc83 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1270,7 +1270,7 @@
>   *     @cred contains the credentials to use.
>   *     @ns contains the user namespace we want the capability in
>   *     @cap contains the capability <include/linux/capability.h>.
> - *     @audit contains whether to write an audit message or not
> + *     @opts contains options for the capable check <include/linux/security.h>
>   *     Return 0 if the capability is granted for @tsk.
>   * @syslog:
>   *     Check permission before accessing the kernel message ring or changing
> @@ -1446,8 +1446,10 @@ union security_list_options {
>                         const kernel_cap_t *effective,
>                         const kernel_cap_t *inheritable,
>                         const kernel_cap_t *permitted);
> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> -                       int cap, int audit);
> +       int (*capable)(const struct cred *cred,
> +                       struct user_namespace *ns,
> +                       int cap,
> +                       struct security_capable_opts *opts);
>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>         int (*quota_on)(struct dentry *dentry);
>         int (*syslog)(int type);
> diff --git a/include/linux/security.h b/include/linux/security.h
> index d170a5b031f3..b60621e93faf 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -58,6 +58,13 @@ struct mm_struct;
>  #define SECURITY_CAP_NOAUDIT 0
>  #define SECURITY_CAP_AUDIT 1
>
> +struct security_capable_opts {
> +       /* If capable should audit the security request */
> +       bool log_audit_message;
> +       /* If capable is being called from a setid syscall */
> +       bool in_setid;
> +};
> +
>  /* LSM Agnostic defines for sb_set_mnt_opts */
>  #define SECURITY_LSM_NATIVE_LABELS     1
>
> @@ -72,7 +79,7 @@ enum lsm_event {
>
>  /* These functions are in security/commoncap.c */
>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> -                      int cap, int audit);
> +                      int cap, struct security_capable_opts *opts);
>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  extern int cap_ptrace_traceme(struct task_struct *parent);
> @@ -180,6 +187,13 @@ static inline const char *kernel_load_data_id_str(enum kernel_load_data_id id)
>         return kernel_load_data_str[id];
>  }
>
> +/* init a security_capable_opts struct with default values */
> +static inline void init_security_capable_opts(struct security_capable_opts *opts)
> +{
> +       opts->log_audit_message = true;
> +       opts->in_setid = false;
> +}
> +
>  #ifdef CONFIG_SECURITY
>
>  struct security_mnt_opts {
> @@ -233,10 +247,10 @@ int security_capset(struct cred *new, const struct cred *old,
>                     const kernel_cap_t *effective,
>                     const kernel_cap_t *inheritable,
>                     const kernel_cap_t *permitted);
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                       int cap);
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap);
> +int security_capable(const struct cred *cred,
> +                      struct user_namespace *ns,
> +                      int cap,
> +                      struct security_capable_opts *opts);
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>  int security_quota_on(struct dentry *dentry);
>  int security_syslog(int type);
> @@ -492,14 +506,11 @@ static inline int security_capset(struct cred *new,
>  }
>
>  static inline int security_capable(const struct cred *cred,
> -                                  struct user_namespace *ns, int cap)
> +                                  struct user_namespace *ns,
> +                                  int cap,
> +                                  struct security_capable_opts *opts)
>  {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -static inline int security_capable_noaudit(const struct cred *cred,
> -                                          struct user_namespace *ns, int cap) {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return cap_capable(cred, ns, cap, opts);
>  }
>
>  static inline int security_quotactl(int cmds, int type, int id,
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 1e1c0236f55b..d8ff27e6e7c4 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -297,9 +297,12 @@ bool has_ns_capability(struct task_struct *t,
>                        struct user_namespace *ns, int cap)
>  {
>         int ret;
> +       struct security_capable_opts opts;
> +
> +       init_security_capable_opts(&opts);
>
>         rcu_read_lock();
> -       ret = security_capable(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -338,9 +341,13 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>                                struct user_namespace *ns, int cap)
>  {
>         int ret;
> +       struct security_capable_opts opts;
> +
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = false;
>
>         rcu_read_lock();
> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -363,7 +370,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>  }
>
> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> +static bool ns_capable_common(struct user_namespace *ns,
> +                             int cap,
> +                             struct security_capable_opts *opts)
>  {
>         int capable;
>
> @@ -372,8 +381,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>                 BUG();
>         }
>
> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> -                         security_capable_noaudit(current_cred(), ns, cap);
> +       capable = security_capable(current_cred(), ns, cap, opts);
>         if (capable == 0) {
>                 current->flags |= PF_SUPERPRIV;
>                 return true;
> @@ -394,7 +402,10 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>   */
>  bool ns_capable(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, true);
> +       struct security_capable_opts opts;
> +
> +       init_security_capable_opts(&opts);
> +       return ns_capable_common(ns, cap, &opts);
>  }
>  EXPORT_SYMBOL(ns_capable);
>
> @@ -412,7 +423,11 @@ EXPORT_SYMBOL(ns_capable);
>   */
>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, false);
> +       struct security_capable_opts opts;
> +
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = false;
> +       return ns_capable_common(ns, cap, &opts);
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> @@ -448,10 +463,13 @@ EXPORT_SYMBOL(capable);
>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>                      int cap)
>  {
> +       struct security_capable_opts opts;
> +
>         if (WARN_ON_ONCE(!cap_valid(cap)))
>                 return false;
>
> -       if (security_capable(file->f_cred, ns, cap) == 0)
> +       init_security_capable_opts(&opts);
> +       if (security_capable(file->f_cred, ns, cap, &opts) == 0)
>                 return true;
>
>         return false;
> @@ -500,10 +518,15 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>  {
>         int ret = 0;  /* An absent tracer adds no restrictions */
>         const struct cred *cred;
> +       struct security_capable_opts opts;
> +
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = false;
> +
>         rcu_read_lock();
>         cred = rcu_dereference(tsk->ptracer_cred);
>         if (cred)
> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE, &opts);
>         rcu_read_unlock();
>         return (ret == 0);
>  }
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f2ae2324c232..eed0e34c1bc2 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -370,12 +370,15 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>         struct seccomp_filter *sfilter;
>         int ret;
>         const bool save_orig = IS_ENABLED(CONFIG_CHECKPOINT_RESTORE);
> +       struct security_capable_opts opts;
>
>         if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
>                 return ERR_PTR(-EINVAL);
>
>         BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
>
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = false;
>         /*
>          * Installing a seccomp filter requires that the task has
>          * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
> @@ -383,8 +386,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>          * behavior of privileged children.
>          */
>         if (!task_no_new_privs(current) &&
> -           security_capable_noaudit(current_cred(), current_user_ns(),
> -                                    CAP_SYS_ADMIN) != 0)
> +           security_capable(current_cred(), current_user_ns(),
> +                                    CAP_SYS_ADMIN, &opts) != 0)
>                 return ERR_PTR(-EACCES);
>
>         /* Allocate a new seccomp_filter */
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 42446a216f3b..3be87dfd5e57 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>  }
>
>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> -                           int cap, int audit)
> +                           int cap, struct security_capable_opts *opts)
>  {
>         struct aa_label *label;
>         int error = 0;
>
>         label = aa_get_newest_cred_label(cred);
>         if (!unconfined(label))
> -               error = aa_capable(label, cap, audit);
> +               error = aa_capable(label, cap, opts->log_audit_message);
>         aa_put_label(label);
>
>         return error;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 18a4fdf6f6eb..93fbb0dd70d6 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -69,7 +69,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>   * kernel's capable() and has_capability() returns 1 for this case.
>   */
>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> -               int cap, int audit)
> +               int cap, struct security_capable_opts *opts)
>  {
>         struct user_namespace *ns = targ_ns;
>
> @@ -223,12 +223,14 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>   */
>  static inline int cap_inh_is_capped(void)
>  {
> +       struct security_capable_opts opts;
>
> +       init_security_capable_opts(&opts);
>         /* they are so limited unless the current task has the CAP_SETPCAP
>          * capability
>          */
>         if (cap_capable(current_cred(), current_cred()->user_ns,
> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> +                       CAP_SETPCAP, &opts) == 0)
>                 return 0;
>         return 1;
>  }
> @@ -1174,6 +1176,7 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  {
>         const struct cred *old = current_cred();
>         struct cred *new;
> +       struct security_capable_opts opts;
>
>         switch (option) {
>         case PR_CAPBSET_READ:
> @@ -1204,13 +1207,15 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>          * capability-based-privilege environment.
>          */
>         case PR_SET_SECUREBITS:
> +               init_security_capable_opts(&opts);
>                 if ((((old->securebits & SECURE_ALL_LOCKS) >> 1)
>                      & (old->securebits ^ arg2))                        /*[1]*/
>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>                     || (cap_capable(current_cred(),
> -                                   current_cred()->user_ns, CAP_SETPCAP,
> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> +                                   current_cred()->user_ns,
> +                                   CAP_SETPCAP,
> +                                   &opts) != 0)                        /*[4]*/
>                         /*
>                          * [1] no changing of bits that are locked
>                          * [2] no unlocking of locks
> @@ -1304,10 +1309,14 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>  {
>         int cap_sys_admin = 0;
> +       struct security_capable_opts opts;
>
> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> -                       SECURITY_CAP_NOAUDIT) == 0)
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = false;
> +
> +       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN, &opts) == 0)
>                 cap_sys_admin = 1;
> +
>         return cap_sys_admin;
>  }
>
> @@ -1323,10 +1332,12 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>  int cap_mmap_addr(unsigned long addr)
>  {
>         int ret = 0;
> +       struct security_capable_opts opts;
>
> +        init_security_capable_opts(&opts);
>         if (addr < dac_mmap_min_addr) {
> -               ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> -                                 SECURITY_CAP_AUDIT);
> +               ret = cap_capable(current_cred(), &init_user_ns,
> +                                               CAP_SYS_RAWIO, &opts);
>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>                 if (ret == 0)
>                         current->flags |= PF_SUPERPRIV;
> diff --git a/security/security.c b/security/security.c
> index 04d173eb93f6..bbc400a90c34 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>                                 effective, inheritable, permitted);
>  }
>
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                    int cap)
> +int security_capable(const struct cred *cred,
> +                    struct user_namespace *ns,
> +                    int cap,
> +                    struct security_capable_opts *opts)
>  {
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap)
> -{
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>  }
>
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 7ce683259357..ebd36adc8856 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2316,9 +2316,10 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>   */
>
>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> -                          int cap, int audit)
> +                          int cap, struct security_capable_opts *opts)
>  {
> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> +       return cred_has_capability(cred, cap, opts->log_audit_message,
> +                                                       ns == &init_user_ns);
>  }
>
>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> @@ -3245,11 +3246,13 @@ static int selinux_inode_getattr(const struct path *path)
>  static bool has_cap_mac_admin(bool audit)
>  {
>         const struct cred *cred = current_cred();
> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> +       struct security_capable_opts opts;
>
> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> +       init_security_capable_opts(&opts);
> +       opts.log_audit_message = audit ? true : false;
> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, &opts))
>                 return false;
> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts.log_audit_message, true))
>                 return false;
>         return true;
>  }
> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index 9a4c0ad46518..eca364b697d7 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -639,8 +639,11 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>         struct smack_known *skp = tsp->smk_task;
>         struct smack_known_list_elem *sklep;
>         int rc;
> +       struct security_capable_opts opts;
>
> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> +       init_security_capable_opts(&opts);
> +
> +       rc = cap_capable(cred, &init_user_ns, cap, &opts);
>         if (rc)
>                 return false;
>
> --
> 2.19.1.1215.g8438c0b245-goog
>

* Re: [PATCH] [PATCH] LSM: generalize flag passing to security_capable
  2018-12-13 22:29     ` Micah Morton
@ 2018-12-13 23:09       ` Casey Schaufler
  2018-12-14  0:05         ` Micah Morton
  2018-12-18 22:37         ` [PATCH v2] " mortonm
  0 siblings, 2 replies; 88+ messages in thread
From: Casey Schaufler @ 2018-12-13 23:09 UTC (permalink / raw)
  To: Micah Morton, jmorris, serge, Kees Cook, sds, linux-security-module

On 12/13/2018 2:29 PM, Micah Morton wrote:
> Any comments on this patch? If not, could it get merged at some point?

Sorry, up to my ears in dropbears.

> I booted a Linux kernel with the changes compiled in and verified with
> print statements that the code works properly AFAICT.
> On Mon, Nov 19, 2018 at 10:54 AM <mortonm@chromium.org> wrote:
>> From: Micah Morton <mortonm@chromium.org>
>>
>> This patch provides a general mechanism for passing flags to the
>> security_capable LSM hook. It replaces the specific 'audit' flag that is
>> used to tell security_capable whether it should log an audit message for
>> the given capability check. The reason for generalizing this flag
>> passing is so we can add an additional flag that signifies whether
>> security_capable is being called by a setid syscall (which is needed by
>> the proposed SafeSetID LSM). This generalization could also support
>> passing down the inode for CAP_DAC_OVERRIDE/READ_SEARCH checks so that
>> authorization could happen on a per-file basis for specific files rather
>> than all or nothing.
>>
>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>> ---
>>
>> Developed against the 'next-general' branch.
>>
>> @Stephen: is this the approach you had in mind for modifying the
>> callers of ns_capable?
>>
>>  include/linux/lsm_hooks.h     |  8 ++++---
>>  include/linux/security.h      | 35 ++++++++++++++++++++----------
>>  kernel/capability.c           | 41 +++++++++++++++++++++++++++--------
>>  kernel/seccomp.c              |  7 ++++--
>>  security/apparmor/lsm.c       |  4 ++--
>>  security/commoncap.c          | 27 ++++++++++++++++-------
>>  security/security.c           | 14 +++++-------
>>  security/selinux/hooks.c      | 13 ++++++-----
>>  security/smack/smack_access.c |  5 ++++-
>>  9 files changed, 103 insertions(+), 51 deletions(-)
>>
>> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
>> index aaeb7fa24dc4..02422592cc83 100644
>> --- a/include/linux/lsm_hooks.h
>> +++ b/include/linux/lsm_hooks.h
>> @@ -1270,7 +1270,7 @@
>>   *     @cred contains the credentials to use.
>>   *     @ns contains the user namespace we want the capability in
>>   *     @cap contains the capability <include/linux/capability.h>.
>> - *     @audit contains whether to write an audit message or not
>> + *     @opts contains options for the capable check <include/linux/security.h>
>>   *     Return 0 if the capability is granted for @tsk.
>>   * @syslog:
>>   *     Check permission before accessing the kernel message ring or changing
>> @@ -1446,8 +1446,10 @@ union security_list_options {
>>                         const kernel_cap_t *effective,
>>                         const kernel_cap_t *inheritable,
>>                         const kernel_cap_t *permitted);
>> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
>> -                       int cap, int audit);
>> +       int (*capable)(const struct cred *cred,
>> +                       struct user_namespace *ns,
>> +                       int cap,
>> +                       struct security_capable_opts *opts);

If you used the existing "audit" argument as a bitmask you wouldn't
have to change the interface or most of the callers.

>>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>>         int (*quota_on)(struct dentry *dentry);
>>         int (*syslog)(int type);
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index d170a5b031f3..b60621e93faf 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -58,6 +58,13 @@ struct mm_struct;
>>  #define SECURITY_CAP_NOAUDIT 0
>>  #define SECURITY_CAP_AUDIT 1
>>
>> +struct security_capable_opts {
>> +       /* If capable should audit the security request */
>> +       bool log_audit_message;
>> +       /* If capable is being called from a setid syscall */
>> +       bool in_setid;
>> +};
>> +

Why not
	#define SECURITY_CAP_AUDIT 0x01
	#define SECURITY_CAP_INSETID 0x02
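
The bitmask suggestion above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not kernel code: the two flag values are the ones proposed in this message, while demo_capable(), DEMO_CAP_SETUID, DEMO_CAP_OPT_MASK and the "deny outside a setid path" policy are hypothetical names and behaviour invented for the example.

```c
#include <assert.h>

/* Flag values as proposed in the message; NOAUDIT stays 0 as in the patch context. */
#define SECURITY_CAP_NOAUDIT 0x00u
#define SECURITY_CAP_AUDIT   0x01u
#define SECURITY_CAP_INSETID 0x02u

/* Hypothetical helpers for this sketch only. */
#define DEMO_CAP_OPT_MASK (SECURITY_CAP_AUDIT | SECURITY_CAP_INSETID)
#define DEMO_CAP_SETUID   7	/* illustrative capability number */

/*
 * Stand-in for a capable hook that takes a bitmask instead of a struct.
 * Returns 0 when the capability is granted and -1 when it is denied.
 * audit_count stands in for "an audit record was written".
 */
static int demo_capable(int cap, unsigned int opts, int *audit_count)
{
	/* Reject flag bits this sketch does not know about. */
	assert((opts & ~DEMO_CAP_OPT_MASK) == 0);

	if (opts & SECURITY_CAP_AUDIT)
		(*audit_count)++;

	/* Invented policy: the setuid capability is only honoured on a setid path. */
	if (cap == DEMO_CAP_SETUID && !(opts & SECURITY_CAP_INSETID))
		return -1;

	return 0;
}
```

With this shape, a caller that wants the default behaviour passes SECURITY_CAP_AUDIT directly, and a setid path passes SECURITY_CAP_AUDIT | SECURITY_CAP_INSETID, so no per-caller struct initialisation is needed.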

>>  /* LSM Agnostic defines for sb_set_mnt_opts */
>>  #define SECURITY_LSM_NATIVE_LABELS     1
>>
>> @@ -72,7 +79,7 @@ enum lsm_event {
>>
>>  /* These functions are in security/commoncap.c */
>>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
>> -                      int cap, int audit);
>> +                      int cap, struct security_capable_opts *opts);

Unnecessary if you use a bitmask.

>>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>>  extern int cap_ptrace_traceme(struct task_struct *parent);
>> @@ -180,6 +187,13 @@ static inline const char *kernel_load_data_id_str(enum kernel_load_data_id id)
>>         return kernel_load_data_str[id];
>>  }
>>
>> +/* init a security_capable_opts struct with default values */
>> +static inline void init_security_capable_opts(struct security_capable_opts* opts)
>> +{
>> +       opts->log_audit_message = true;
>> +       opts->in_setid = false;
>> +}
>> +

Also unnecessary.

>>  #ifdef CONFIG_SECURITY
>>
>>  struct security_mnt_opts {
>> @@ -233,10 +247,10 @@ int security_capset(struct cred *new, const struct cred *old,
>>                     const kernel_cap_t *effective,
>>                     const kernel_cap_t *inheritable,
>>                     const kernel_cap_t *permitted);
>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>> -                       int cap);
>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>> -                            int cap);
>> +int security_capable(const struct cred *cred,
>> +                      struct user_namespace *ns,
>> +                      int cap,
>> +                      struct security_capable_opts *opts);

Bitmask.

>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>>  int security_quota_on(struct dentry *dentry);
>>  int security_syslog(int type);
>> @@ -492,14 +506,11 @@ static inline int security_capset(struct cred *new,
>>  }
>>
>>  static inline int security_capable(const struct cred *cred,
>> -                                  struct user_namespace *ns, int cap)
>> +                                  struct user_namespace *ns,
>> +                                  int cap,
>> +                                  struct security_capable_opts *opts)
>>  {
>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
>> -}
>> -
>> -static inline int security_capable_noaudit(const struct cred *cred,
>> -                                          struct user_namespace *ns, int cap) {
>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
>> +       return cap_capable(cred, ns, cap, opts);
>>  }
>>
>>  static inline int security_quotactl(int cmds, int type, int id,
>> diff --git a/kernel/capability.c b/kernel/capability.c
>> index 1e1c0236f55b..d8ff27e6e7c4 100644
>> --- a/kernel/capability.c
>> +++ b/kernel/capability.c
>> @@ -297,9 +297,12 @@ bool has_ns_capability(struct task_struct *t,
>>                        struct user_namespace *ns, int cap)
>>  {
>>         int ret;
>> +       struct security_capable_opts opts;
>> +
>> +       init_security_capable_opts(&opts);
>>
>>         rcu_read_lock();
>> -       ret = security_capable(__task_cred(t), ns, cap);
>> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
>>         rcu_read_unlock();
>>
>>         return (ret == 0);
>> @@ -338,9 +341,13 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>>                                struct user_namespace *ns, int cap)
>>  {
>>         int ret;
>> +       struct security_capable_opts opts;
>> +
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = false;

This is why I would prefer a bitmask. Too much work
for the desired result.

>>
>>         rcu_read_lock();
>> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
>> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
>>         rcu_read_unlock();
>>
>>         return (ret == 0);
>> @@ -363,7 +370,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>>  }
>>
>> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>> +static bool ns_capable_common(struct user_namespace *ns,
>> +                             int cap,
>> +                             struct security_capable_opts *opts)
>>  {
>>         int capable;
>>
>> @@ -372,8 +381,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>                 BUG();
>>         }
>>
>> -       capable = audit ? security_capable(current_cred(), ns, cap) :
>> -                         security_capable_noaudit(current_cred(), ns, cap);
>> +       capable = security_capable(current_cred(), ns, cap, opts);
>>         if (capable == 0) {
>>                 current->flags |= PF_SUPERPRIV;
>>                 return true;
>> @@ -394,7 +402,10 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>   */
>>  bool ns_capable(struct user_namespace *ns, int cap)
>>  {
>> -       return ns_capable_common(ns, cap, true);
>> +       struct security_capable_opts opts;
>> +
>> +       init_security_capable_opts(&opts);
>> +       return ns_capable_common(ns, cap, &opts);
>>  }
>>  EXPORT_SYMBOL(ns_capable);
>>
>> @@ -412,7 +423,11 @@ EXPORT_SYMBOL(ns_capable);
>>   */
>>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>>  {
>> -       return ns_capable_common(ns, cap, false);
>> +       struct security_capable_opts opts;
>> +
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = false;
>> +       return ns_capable_common(ns, cap, &opts);
>>  }
>>  EXPORT_SYMBOL(ns_capable_noaudit);
>>
>> @@ -448,10 +463,13 @@ EXPORT_SYMBOL(capable);
>>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>>                      int cap)
>>  {
>> +       struct security_capable_opts opts;
>> +
>>         if (WARN_ON_ONCE(!cap_valid(cap)))
>>                 return false;
>>
>> -       if (security_capable(file->f_cred, ns, cap) == 0)
>> +       init_security_capable_opts(&opts);
>> +       if (security_capable(file->f_cred, ns, cap, &opts) == 0)
>>                 return true;
>>
>>         return false;
>> @@ -500,10 +518,15 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>>  {
>>         int ret = 0;  /* An absent tracer adds no restrictions */
>>         const struct cred *cred;
>> +       struct security_capable_opts opts;
>> +
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = false;
>> +
>>         rcu_read_lock();
>>         cred = rcu_dereference(tsk->ptracer_cred);
>>         if (cred)
>> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
>> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE, &opts);
>>         rcu_read_unlock();
>>         return (ret == 0);
>>  }
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index f2ae2324c232..eed0e34c1bc2 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -370,12 +370,15 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>         struct seccomp_filter *sfilter;
>>         int ret;
>>         const bool save_orig = IS_ENABLED(CONFIG_CHECKPOINT_RESTORE);
>> +       struct security_capable_opts opts;
>>
>>         if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
>>                 return ERR_PTR(-EINVAL);
>>
>>         BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
>>
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = false;
>>         /*
>>          * Installing a seccomp filter requires that the task has
>>          * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
>> @@ -383,8 +386,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>          * behavior of privileged children.
>>          */
>>         if (!task_no_new_privs(current) &&
>> -           security_capable_noaudit(current_cred(), current_user_ns(),
>> -                                    CAP_SYS_ADMIN) != 0)
>> +           security_capable(current_cred(), current_user_ns(),
>> +                                    CAP_SYS_ADMIN, &opts) != 0)
>>                 return ERR_PTR(-EACCES);
>>
>>         /* Allocate a new seccomp_filter */
>> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
>> index 42446a216f3b..3be87dfd5e57 100644
>> --- a/security/apparmor/lsm.c
>> +++ b/security/apparmor/lsm.c
>> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>>  }
>>
>>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
>> -                           int cap, int audit)
>> +                           int cap, struct security_capable_opts *opts)
>>  {
>>         struct aa_label *label;
>>         int error = 0;
>>
>>         label = aa_get_newest_cred_label(cred);
>>         if (!unconfined(label))
>> -               error = aa_capable(label, cap, audit);
>> +               error = aa_capable(label, cap, opts->log_audit_message);
>>         aa_put_label(label);
>>
>>         return error;
>> diff --git a/security/commoncap.c b/security/commoncap.c
>> index 18a4fdf6f6eb..93fbb0dd70d6 100644
>> --- a/security/commoncap.c
>> +++ b/security/commoncap.c
>> @@ -69,7 +69,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>>   * kernel's capable() and has_capability() returns 1 for this case.
>>   */
>>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
>> -               int cap, int audit)
>> +               int cap, struct security_capable_opts *opts)
>>  {
>>         struct user_namespace *ns = targ_ns;
>>
>> @@ -223,12 +223,14 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>>   */
>>  static inline int cap_inh_is_capped(void)
>>  {
>> +       struct security_capable_opts opts;
>>
>> +       init_security_capable_opts(&opts);
>>         /* they are so limited unless the current task has the CAP_SETPCAP
>>          * capability
>>          */
>>         if (cap_capable(current_cred(), current_cred()->user_ns,
>> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
>> +                       CAP_SETPCAP, &opts) == 0)
>>                 return 0;
>>         return 1;
>>  }
>> @@ -1174,6 +1176,7 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>  {
>>         const struct cred *old = current_cred();
>>         struct cred *new;
>> +       struct security_capable_opts opts;
>>
>>         switch (option) {
>>         case PR_CAPBSET_READ:
>> @@ -1204,13 +1207,15 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>          * capability-based-privilege environment.
>>          */
>>         case PR_SET_SECUREBITS:
>> +               init_security_capable_opts(&opts);
>>                 if ((((old->securebits & SECURE_ALL_LOCKS) >> 1)
>>                      & (old->securebits ^ arg2))                        /*[1]*/
>>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>>                     || (cap_capable(current_cred(),
>> -                                   current_cred()->user_ns, CAP_SETPCAP,
>> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
>> +                                   current_cred()->user_ns,
>> +                                   CAP_SETPCAP,
>> +                                   &opts) != 0)                        /*[4]*/
>>                         /*
>>                          * [1] no changing of bits that are locked
>>                          * [2] no unlocking of locks
>> @@ -1304,10 +1309,14 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>  int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>>  {
>>         int cap_sys_admin = 0;
>> +       struct security_capable_opts opts;
>>
>> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
>> -                       SECURITY_CAP_NOAUDIT) == 0)
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = false;
>> +
>> +       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN, &opts) == 0)
>>                 cap_sys_admin = 1;
>> +
>>         return cap_sys_admin;
>>  }
>>
>> @@ -1323,10 +1332,12 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>>  int cap_mmap_addr(unsigned long addr)
>>  {
>>         int ret = 0;
>> +       struct security_capable_opts opts;
>>
>> +        init_security_capable_opts(&opts);
>>         if (addr < dac_mmap_min_addr) {
>> -               ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
>> -                                 SECURITY_CAP_AUDIT);
>> +               ret = cap_capable(current_cred(), &init_user_ns,
>> +                                               CAP_SYS_RAWIO, &opts);
>>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>>                 if (ret == 0)
>>                         current->flags |= PF_SUPERPRIV;
>> diff --git a/security/security.c b/security/security.c
>> index 04d173eb93f6..bbc400a90c34 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>>                                 effective, inheritable, permitted);
>>  }
>>
>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>> -                    int cap)
>> +int security_capable(const struct cred *cred,
>> +                    struct user_namespace *ns,
>> +                    int cap,
>> +                    struct security_capable_opts *opts)
>>  {
>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
>> -}
>> -
>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>> -                            int cap)
>> -{
>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
>> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>>  }
>>
>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
>> index 7ce683259357..ebd36adc8856 100644
>> --- a/security/selinux/hooks.c
>> +++ b/security/selinux/hooks.c
>> @@ -2316,9 +2316,10 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>>   */
>>
>>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
>> -                          int cap, int audit)
>> +                          int cap, struct security_capable_opts *opts)
>>  {
>> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
>> +       return cred_has_capability(cred, cap, opts->log_audit_message,
>> +                                                       ns == &init_user_ns);
>>  }
>>
>>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
>> @@ -3245,11 +3246,13 @@ static int selinux_inode_getattr(const struct path *path)
>>  static bool has_cap_mac_admin(bool audit)
>>  {
>>         const struct cred *cred = current_cred();
>> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
>> +       struct security_capable_opts opts;
>>
>> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
>> +       init_security_capable_opts(&opts);
>> +       opts.log_audit_message = audit ? true : false;
>> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, &opts))
>>                 return false;
>> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
>> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts.log_audit_message, true))
>>                 return false;
>>         return true;
>>  }
>> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
>> index 9a4c0ad46518..eca364b697d7 100644
>> --- a/security/smack/smack_access.c
>> +++ b/security/smack/smack_access.c
>> @@ -639,8 +639,11 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>>         struct smack_known *skp = tsp->smk_task;
>>         struct smack_known_list_elem *sklep;
>>         int rc;
>> +       struct security_capable_opts opts;
>>
>> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
>> +       init_security_capable_opts(&opts);
>> +
>> +       rc = cap_capable(cred, &init_user_ns, cap, &opts);
>>         if (rc)
>>                 return false;
>>
>> --
>> 2.19.1.1215.g8438c0b245-goog
>>


* Re: [PATCH] [PATCH] LSM: generalize flag passing to security_capable
  2018-12-13 23:09       ` Casey Schaufler
@ 2018-12-14  0:05         ` Micah Morton
  2018-12-18 22:37         ` [PATCH v2] " mortonm
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2018-12-14  0:05 UTC (permalink / raw)
  To: casey; +Cc: jmorris, serge, Kees Cook, sds, linux-security-module

I agree the bitmask is simpler for what we have so far. The only reason
I used a struct was that Stephen mentioned possibly using it to pass
inodes down to the capable() check on certain code paths. Maybe we
should give him a chance to respond and say whether a struct would be
useful for that. Otherwise I'll send the bitmask version of this patch
on Monday.
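
For comparison, the case that a bitmask cannot cover, passing an object such as an inode down to the check, might look like the user-space sketch below. Everything here is an assumption made for illustration: struct demo_inode, struct demo_capable_opts, demo_default_opts() and demo_file_capable() are hypothetical names, and the per-file policy is invented; only the log_audit_message and in_setid fields mirror the struct in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel inode; only the number matters here. */
struct demo_inode {
	unsigned long ino;
};

/* Options struct extended with an optional object pointer. */
struct demo_capable_opts {
	bool log_audit_message;
	bool in_setid;
	const struct demo_inode *inode;	/* NULL when no file is involved */
};

/* Defaults matching the patch's initialiser: audit on, not in setid. */
static struct demo_capable_opts demo_default_opts(void)
{
	struct demo_capable_opts opts = {
		.log_audit_message = true,
		.in_setid = false,
		.inode = NULL,
	};

	return opts;
}

/*
 * Invented per-file policy: grant (0) only when the check carries an inode
 * whose number matches allowed_ino; deny (-1) otherwise.
 */
static int demo_file_capable(const struct demo_capable_opts *opts,
			     unsigned long allowed_ino)
{
	assert(opts != NULL);

	if (opts->inode && opts->inode->ino == allowed_ino)
		return 0;

	return -1;
}
```

The trade-off is the one discussed in this thread: the struct can grow to carry pointers, at the cost of every caller having to initialise it, whereas a bitmask keeps the existing int argument but can only carry boolean flags.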
On Thu, Dec 13, 2018 at 3:09 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 12/13/2018 2:29 PM, Micah Morton wrote:
> > Any comments on this patch? If not, could it get merged at some point?
>
> Sorry, up to my ears in dropbears.
>
> > I booted a Linux kernel with the changes compiled in and verified with
> > print statements that the code works properly AFAICT.
> > On Mon, Nov 19, 2018 at 10:54 AM <mortonm@chromium.org> wrote:
> >> From: Micah Morton <mortonm@chromium.org>
> >>
> >> This patch provides a general mechanism for passing flags to the
> >> security_capable LSM hook. It replaces the specific 'audit' flag that is
> >> used to tell security_capable whether it should log an audit message for
> >> the given capability check. The reason for generalizing this flag
> >> passing is so we can add an additional flag that signifies whether
> >> security_capable is being called by a setid syscall (which is needed by
> >> the proposed SafeSetID LSM). This generalization could also support
> >> passing down the inode for CAP_DAC_OVERRIDE/READ_SEARCH checks so that
> >> authorization could happen on a per-file basis for specific files rather
> >> than all or nothing.
> >>
> >> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >> ---
> >>
> >> Developed against the 'next-general' branch.
> >>
> >> @Stephen: is this the approach you had in mind for modifying the
> >> callers of ns_capable?
> >>
> >>  include/linux/lsm_hooks.h     |  8 ++++---
> >>  include/linux/security.h      | 35 ++++++++++++++++++++----------
> >>  kernel/capability.c           | 41 +++++++++++++++++++++++++++--------
> >>  kernel/seccomp.c              |  7 ++++--
> >>  security/apparmor/lsm.c       |  4 ++--
> >>  security/commoncap.c          | 27 ++++++++++++++++-------
> >>  security/security.c           | 14 +++++-------
> >>  security/selinux/hooks.c      | 13 ++++++-----
> >>  security/smack/smack_access.c |  5 ++++-
> >>  9 files changed, 103 insertions(+), 51 deletions(-)
> >>
> >> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> >> index aaeb7fa24dc4..02422592cc83 100644
> >> --- a/include/linux/lsm_hooks.h
> >> +++ b/include/linux/lsm_hooks.h
> >> @@ -1270,7 +1270,7 @@
> >>   *     @cred contains the credentials to use.
> >>   *     @ns contains the user namespace we want the capability in
> >>   *     @cap contains the capability <include/linux/capability.h>.
> >> - *     @audit contains whether to write an audit message or not
> >> + *     @opts contains options for the capable check <include/linux/security.h>
> >>   *     Return 0 if the capability is granted for @tsk.
> >>   * @syslog:
> >>   *     Check permission before accessing the kernel message ring or changing
> >> @@ -1446,8 +1446,10 @@ union security_list_options {
> >>                         const kernel_cap_t *effective,
> >>                         const kernel_cap_t *inheritable,
> >>                         const kernel_cap_t *permitted);
> >> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> >> -                       int cap, int audit);
> >> +       int (*capable)(const struct cred *cred,
> >> +                       struct user_namespace *ns,
> >> +                       int cap,
> >> +                       struct security_capable_opts *opts);
>
> If you used the existing "audit" argument as a bitmask you wouldn't
> have to change the interface or most of the callers.
>
> >>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
> >>         int (*quota_on)(struct dentry *dentry);
> >>         int (*syslog)(int type);
> >> diff --git a/include/linux/security.h b/include/linux/security.h
> >> index d170a5b031f3..b60621e93faf 100644
> >> --- a/include/linux/security.h
> >> +++ b/include/linux/security.h
> >> @@ -58,6 +58,13 @@ struct mm_struct;
> >>  #define SECURITY_CAP_NOAUDIT 0
> >>  #define SECURITY_CAP_AUDIT 1
> >>
> >> +struct security_capable_opts {
> >> +       /* If capable should audit the security request */
> >> +       bool log_audit_message;
> >> +       /* If capable is being called from a setid syscall */
> >> +       bool in_setid;
> >> +};
> >> +
>
> Why not
>         #define SECURITY_CAP_AUDIT 0x01
>         #define SECURITY_CAP_INSETID 0x02
>
> >>  /* LSM Agnostic defines for sb_set_mnt_opts */
> >>  #define SECURITY_LSM_NATIVE_LABELS     1
> >>
> >> @@ -72,7 +79,7 @@ enum lsm_event {
> >>
> >>  /* These functions are in security/commoncap.c */
> >>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                      int cap, int audit);
> >> +                      int cap, struct security_capable_opts *opts);
>
> Unnecessary if you use a bitmask.
>
> >>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
> >>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
> >>  extern int cap_ptrace_traceme(struct task_struct *parent);
> >> @@ -180,6 +187,13 @@ static inline const char *kernel_load_data_id_str(enum kernel_load_data_id id)
> >>         return kernel_load_data_str[id];
> >>  }
> >>
> >> +/* init a security_capable_opts struct with default values */
> >> +static inline void init_security_capable_opts(struct security_capable_opts* opts)
> >> +{
> >> +       opts->log_audit_message = true;
> >> +       opts->in_setid = false;
> >> +}
> >> +
>
> Also unnecessary.
>
> >>  #ifdef CONFIG_SECURITY
> >>
> >>  struct security_mnt_opts {
> >> @@ -233,10 +247,10 @@ int security_capset(struct cred *new, const struct cred *old,
> >>                     const kernel_cap_t *effective,
> >>                     const kernel_cap_t *inheritable,
> >>                     const kernel_cap_t *permitted);
> >> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                       int cap);
> >> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >> -                            int cap);
> >> +int security_capable(const struct cred *cred,
> >> +                      struct user_namespace *ns,
> >> +                      int cap,
> >> +                      struct security_capable_opts *opts);
>
> Bitmask.
>
> >>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
> >>  int security_quota_on(struct dentry *dentry);
> >>  int security_syslog(int type);
> >> @@ -492,14 +506,11 @@ static inline int security_capset(struct cred *new,
> >>  }
> >>
> >>  static inline int security_capable(const struct cred *cred,
> >> -                                  struct user_namespace *ns, int cap)
> >> +                                  struct user_namespace *ns,
> >> +                                  int cap,
> >> +                                  struct security_capable_opts *opts)
> >>  {
> >> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> >> -}
> >> -
> >> -static inline int security_capable_noaudit(const struct cred *cred,
> >> -                                          struct user_namespace *ns, int cap) {
> >> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >> +       return cap_capable(cred, ns, cap, opts);
> >>  }
> >>
> >>  static inline int security_quotactl(int cmds, int type, int id,
> >> diff --git a/kernel/capability.c b/kernel/capability.c
> >> index 1e1c0236f55b..d8ff27e6e7c4 100644
> >> --- a/kernel/capability.c
> >> +++ b/kernel/capability.c
> >> @@ -297,9 +297,12 @@ bool has_ns_capability(struct task_struct *t,
> >>                        struct user_namespace *ns, int cap)
> >>  {
> >>         int ret;
> >> +       struct security_capable_opts opts;
> >> +
> >> +       init_security_capable_opts(&opts);
> >>
> >>         rcu_read_lock();
> >> -       ret = security_capable(__task_cred(t), ns, cap);
> >> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
> >>         rcu_read_unlock();
> >>
> >>         return (ret == 0);
> >> @@ -338,9 +341,13 @@ bool has_ns_capability_noaudit(struct task_struct *t,
> >>                                struct user_namespace *ns, int cap)
> >>  {
> >>         int ret;
> >> +       struct security_capable_opts opts;
> >> +
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = false;
>
> This is why I would prefer a bitmask. Too much work
> for the desired result.
>
> >>
> >>         rcu_read_lock();
> >> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> >> +       ret = security_capable(__task_cred(t), ns, cap, &opts);
> >>         rcu_read_unlock();
> >>
> >>         return (ret == 0);
> >> @@ -363,7 +370,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
> >>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
> >>  }
> >>
> >> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >> +static bool ns_capable_common(struct user_namespace *ns,
> >> +                             int cap,
> >> +                             struct security_capable_opts *opts)
> >>  {
> >>         int capable;
> >>
> >> @@ -372,8 +381,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>                 BUG();
> >>         }
> >>
> >> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> >> -                         security_capable_noaudit(current_cred(), ns, cap);
> >> +       capable = security_capable(current_cred(), ns, cap, opts);
> >>         if (capable == 0) {
> >>                 current->flags |= PF_SUPERPRIV;
> >>                 return true;
> >> @@ -394,7 +402,10 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>   */
> >>  bool ns_capable(struct user_namespace *ns, int cap)
> >>  {
> >> -       return ns_capable_common(ns, cap, true);
> >> +       struct security_capable_opts opts;
> >> +
> >> +       init_security_capable_opts(&opts);
> >> +       return ns_capable_common(ns, cap, &opts);
> >>  }
> >>  EXPORT_SYMBOL(ns_capable);
> >>
> >> @@ -412,7 +423,11 @@ EXPORT_SYMBOL(ns_capable);
> >>   */
> >>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> >>  {
> >> -       return ns_capable_common(ns, cap, false);
> >> +       struct security_capable_opts opts;
> >> +
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = false;
> >> +       return ns_capable_common(ns, cap, &opts);
> >>  }
> >>  EXPORT_SYMBOL(ns_capable_noaudit);
> >>
> >> @@ -448,10 +463,13 @@ EXPORT_SYMBOL(capable);
> >>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> >>                      int cap)
> >>  {
> >> +       struct security_capable_opts opts;
> >> +
> >>         if (WARN_ON_ONCE(!cap_valid(cap)))
> >>                 return false;
> >>
> >> -       if (security_capable(file->f_cred, ns, cap) == 0)
> >> +       init_security_capable_opts(&opts);
> >> +       if (security_capable(file->f_cred, ns, cap, &opts) == 0)
> >>                 return true;
> >>
> >>         return false;
> >> @@ -500,10 +518,15 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
> >>  {
> >>         int ret = 0;  /* An absent tracer adds no restrictions */
> >>         const struct cred *cred;
> >> +       struct security_capable_opts opts;
> >> +
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = false;
> >> +
> >>         rcu_read_lock();
> >>         cred = rcu_dereference(tsk->ptracer_cred);
> >>         if (cred)
> >> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> >> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE, &opts);
> >>         rcu_read_unlock();
> >>         return (ret == 0);
> >>  }
> >> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >> index f2ae2324c232..eed0e34c1bc2 100644
> >> --- a/kernel/seccomp.c
> >> +++ b/kernel/seccomp.c
> >> @@ -370,12 +370,15 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >>         struct seccomp_filter *sfilter;
> >>         int ret;
> >>         const bool save_orig = IS_ENABLED(CONFIG_CHECKPOINT_RESTORE);
> >> +       struct security_capable_opts opts;
> >>
> >>         if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
> >>                 return ERR_PTR(-EINVAL);
> >>
> >>         BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
> >>
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = false;
> >>         /*
> >>          * Installing a seccomp filter requires that the task has
> >>          * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
> >> @@ -383,8 +386,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >>          * behavior of privileged children.
> >>          */
> >>         if (!task_no_new_privs(current) &&
> >> -           security_capable_noaudit(current_cred(), current_user_ns(),
> >> -                                    CAP_SYS_ADMIN) != 0)
> >> +           security_capable(current_cred(), current_user_ns(),
> >> +                                    CAP_SYS_ADMIN, &opts) != 0)
> >>                 return ERR_PTR(-EACCES);
> >>
> >>         /* Allocate a new seccomp_filter */
> >> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> >> index 42446a216f3b..3be87dfd5e57 100644
> >> --- a/security/apparmor/lsm.c
> >> +++ b/security/apparmor/lsm.c
> >> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
> >>  }
> >>
> >>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                           int cap, int audit)
> >> +                           int cap, struct security_capable_opts *opts)
> >>  {
> >>         struct aa_label *label;
> >>         int error = 0;
> >>
> >>         label = aa_get_newest_cred_label(cred);
> >>         if (!unconfined(label))
> >> -               error = aa_capable(label, cap, audit);
> >> +               error = aa_capable(label, cap, opts->log_audit_message);
> >>         aa_put_label(label);
> >>
> >>         return error;
> >> diff --git a/security/commoncap.c b/security/commoncap.c
> >> index 18a4fdf6f6eb..93fbb0dd70d6 100644
> >> --- a/security/commoncap.c
> >> +++ b/security/commoncap.c
> >> @@ -69,7 +69,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
> >>   * kernel's capable() and has_capability() returns 1 for this case.
> >>   */
> >>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> >> -               int cap, int audit)
> >> +               int cap, struct security_capable_opts *opts)
> >>  {
> >>         struct user_namespace *ns = targ_ns;
> >>
> >> @@ -223,12 +223,14 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
> >>   */
> >>  static inline int cap_inh_is_capped(void)
> >>  {
> >> +       struct security_capable_opts opts;
> >>
> >> +       init_security_capable_opts(&opts);
> >>         /* they are so limited unless the current task has the CAP_SETPCAP
> >>          * capability
> >>          */
> >>         if (cap_capable(current_cred(), current_cred()->user_ns,
> >> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> >> +                       CAP_SETPCAP, &opts) == 0)
> >>                 return 0;
> >>         return 1;
> >>  }
> >> @@ -1174,6 +1176,7 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >>  {
> >>         const struct cred *old = current_cred();
> >>         struct cred *new;
> >> +       struct security_capable_opts opts;
> >>
> >>         switch (option) {
> >>         case PR_CAPBSET_READ:
> >> @@ -1204,13 +1207,15 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >>          * capability-based-privilege environment.
> >>          */
> >>         case PR_SET_SECUREBITS:
> >> +               init_security_capable_opts(&opts);
> >>                 if ((((old->securebits & SECURE_ALL_LOCKS) >> 1)
> >>                      & (old->securebits ^ arg2))                        /*[1]*/
> >>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
> >>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
> >>                     || (cap_capable(current_cred(),
> >> -                                   current_cred()->user_ns, CAP_SETPCAP,
> >> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> >> +                                   current_cred()->user_ns,
> >> +                                   CAP_SETPCAP,
> >> +                                   &opts) != 0)                        /*[4]*/
> >>                         /*
> >>                          * [1] no changing of bits that are locked
> >>                          * [2] no unlocking of locks
> >> @@ -1304,10 +1309,14 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >>  int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >>  {
> >>         int cap_sys_admin = 0;
> >> +       struct security_capable_opts opts;
> >>
> >> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> >> -                       SECURITY_CAP_NOAUDIT) == 0)
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = false;
> >> +
> >> +       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN, &opts) == 0)
> >>                 cap_sys_admin = 1;
> >> +
> >>         return cap_sys_admin;
> >>  }
> >>
> >> @@ -1323,10 +1332,12 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >>  int cap_mmap_addr(unsigned long addr)
> >>  {
> >>         int ret = 0;
> >> +       struct security_capable_opts opts;
> >>
> >> +        init_security_capable_opts(&opts);
> >>         if (addr < dac_mmap_min_addr) {
> >> -               ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> >> -                                 SECURITY_CAP_AUDIT);
> >> +               ret = cap_capable(current_cred(), &init_user_ns,
> >> +                                               CAP_SYS_RAWIO, &opts);
> >>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
> >>                 if (ret == 0)
> >>                         current->flags |= PF_SUPERPRIV;
> >> diff --git a/security/security.c b/security/security.c
> >> index 04d173eb93f6..bbc400a90c34 100644
> >> --- a/security/security.c
> >> +++ b/security/security.c
> >> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
> >>                                 effective, inheritable, permitted);
> >>  }
> >>
> >> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                    int cap)
> >> +int security_capable(const struct cred *cred,
> >> +                    struct user_namespace *ns,
> >> +                    int cap,
> >> +                    struct security_capable_opts *opts)
> >>  {
> >> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> >> -}
> >> -
> >> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >> -                            int cap)
> >> -{
> >> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
> >>  }
> >>
> >>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> >> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> >> index 7ce683259357..ebd36adc8856 100644
> >> --- a/security/selinux/hooks.c
> >> +++ b/security/selinux/hooks.c
> >> @@ -2316,9 +2316,10 @@ static int selinux_capset(struct cred *new, const struct cred *old,
> >>   */
> >>
> >>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                          int cap, int audit)
> >> +                          int cap, struct security_capable_opts *opts)
> >>  {
> >> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> >> +       return cred_has_capability(cred, cap, opts->log_audit_message,
> >> +                                                       ns == &init_user_ns);
> >>  }
> >>
> >>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> >> @@ -3245,11 +3246,13 @@ static int selinux_inode_getattr(const struct path *path)
> >>  static bool has_cap_mac_admin(bool audit)
> >>  {
> >>         const struct cred *cred = current_cred();
> >> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> >> +       struct security_capable_opts opts;
> >>
> >> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> >> +       init_security_capable_opts(&opts);
> >> +       opts.log_audit_message = audit ? true : false;
> >> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, &opts))
> >>                 return false;
> >> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> >> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts.log_audit_message, true))
> >>                 return false;
> >>         return true;
> >>  }
> >> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> >> index 9a4c0ad46518..eca364b697d7 100644
> >> --- a/security/smack/smack_access.c
> >> +++ b/security/smack/smack_access.c
> >> @@ -639,8 +639,11 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
> >>         struct smack_known *skp = tsp->smk_task;
> >>         struct smack_known_list_elem *sklep;
> >>         int rc;
> >> +       struct security_capable_opts opts;
> >>
> >> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> >> +       init_security_capable_opts(&opts);
> >> +
> >> +       rc = cap_capable(cred, &init_user_ns, cap, &opts);
> >>         if (rc)
> >>                 return false;
> >>
> >> --
> >> 2.19.1.1215.g8438c0b245-goog
> >>
>


* [PATCH v2] LSM: generalize flag passing to security_capable
  2018-12-13 23:09       ` Casey Schaufler
  2018-12-14  0:05         ` Micah Morton
@ 2018-12-18 22:37         ` mortonm
  2019-01-07 17:55           ` Micah Morton
  2019-01-07 23:13           ` [PATCH v2] " Kees Cook
  1 sibling, 2 replies; 88+ messages in thread
From: mortonm @ 2018-12-18 22:37 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the single-purpose 'audit'
argument, which tells security_capable whether to log an audit message
for the given capability check, with a bitmask of options. Generalizing
the argument lets us add a second flag that signals whether
security_capable is being called from a setid syscall, which the
proposed SafeSetID LSM needs.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since the last patch: use a bitmask instead of a struct to
represent the options passed to security_capable.
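
For readers following the thread, here is a minimal user-space sketch of
how a hook might decode the new opts bitmask. The three flag values are
copied from this patch; the mock_decision struct and mock_capable()
function are hypothetical names made up for illustration and are not
part of the kernel. Note that the polarity flips relative to the old
interface: previously SECURITY_CAP_AUDIT (1) requested an audit record,
whereas here the zero value (SECURITY_CAP_DEFAULT) audits and the
SECURITY_CAP_NOAUDIT bit suppresses it.

```c
/*
 * User-space sketch only. Flag values are copied from this patch;
 * everything prefixed mock_ is hypothetical and not kernel code.
 */
#include <assert.h>
#include <stdbool.h>

#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/* Hypothetical record of how a hook interpreted one request. */
struct mock_decision {
	bool audited;	/* an audit record would be written */
	bool in_setid;	/* the request came from a setid-style caller */
};

/* Hypothetical hook body: decode the opts bitmask bit by bit. */
static struct mock_decision mock_capable(unsigned int opts)
{
	struct mock_decision d;

	/* This sketch only understands the two bits defined above. */
	assert((opts & ~(SECURITY_CAP_NOAUDIT | SECURITY_CAP_INSETID)) == 0);

	/* Auditing is the default; the NOAUDIT bit turns it off. */
	d.audited = !(opts & SECURITY_CAP_NOAUDIT);
	/* INSETID is independent of the audit bit and can be OR'd in. */
	d.in_setid = (opts & SECURITY_CAP_INSETID) != 0;
	return d;
}
```

Because the flags occupy separate bits, a caller could in principle pass
SECURITY_CAP_NOAUDIT | SECURITY_CAP_INSETID and have both take effect,
which the earlier 0/1 'audit' argument could not express.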

 include/linux/lsm_hooks.h              |  8 +++++---
 include/linux/security.h               | 28 +++++++++++++-------------
 kernel/capability.c                    | 22 +++++++++++---------
 kernel/seccomp.c                       |  4 ++--
 security/apparmor/capability.c         | 14 ++++++-------
 security/apparmor/include/capability.h |  2 +-
 security/apparmor/ipc.c                |  3 ++-
 security/apparmor/lsm.c                |  4 ++--
 security/commoncap.c                   | 17 ++++++++--------
 security/security.c                    | 14 +++++--------
 security/selinux/hooks.c               | 16 +++++++--------
 security/smack/smack_access.c          |  2 +-
 12 files changed, 69 insertions(+), 65 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index aaeb7fa24dc4..ef955a44a782 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1270,7 +1270,7 @@
  *	@cred contains the credentials to use.
  *	@ns contains the user namespace we want the capability in
  *	@cap contains the capability <include/linux/capability.h>.
- *	@audit contains whether to write an audit message or not
+ *	@opts contains options for the capable check <include/linux/security.h>
  *	Return 0 if the capability is granted for @tsk.
  * @syslog:
  *	Check permission before accessing the kernel message ring or changing
@@ -1446,8 +1446,10 @@ union security_list_options {
 			const kernel_cap_t *effective,
 			const kernel_cap_t *inheritable,
 			const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns,
-			int cap, int audit);
+	int (*capable)(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			unsigned int opts);
 	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
 	int (*quota_on)(struct dentry *dentry);
 	int (*syslog)(int type);
diff --git a/include/linux/security.h b/include/linux/security.h
index d170a5b031f3..038e6779948c 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -54,9 +54,12 @@ struct xattr;
 struct xfrm_sec_ctx;
 struct mm_struct;
 
+/* Default (no) options for the capable function */
+#define SECURITY_CAP_DEFAULT 0x0
 /* If capable should audit the security request */
-#define SECURITY_CAP_NOAUDIT 0
-#define SECURITY_CAP_AUDIT 1
+#define SECURITY_CAP_NOAUDIT 0x01
+/* If capable is being called by a setid function */
+#define SECURITY_CAP_INSETID 0x02
 
 /* LSM Agnostic defines for sb_set_mnt_opts */
 #define SECURITY_LSM_NATIVE_LABELS	1
@@ -72,7 +75,7 @@ enum lsm_event {
 
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
-		       int cap, int audit);
+		       int cap, unsigned int opts);
 extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
 extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
@@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
 		    const kernel_cap_t *effective,
 		    const kernel_cap_t *inheritable,
 		    const kernel_cap_t *permitted);
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-			int cap);
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap);
+int security_capable(const struct cred *cred,
+		       struct user_namespace *ns,
+		       int cap,
+		       unsigned int opts);
 int security_quotactl(int cmds, int type, int id, struct super_block *sb);
 int security_quota_on(struct dentry *dentry);
 int security_syslog(int type);
@@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
 }
 
 static inline int security_capable(const struct cred *cred,
-				   struct user_namespace *ns, int cap)
+				   struct user_namespace *ns,
+				   int cap,
+				   unsigned int opts)
 {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-static inline int security_capable_noaudit(const struct cred *cred,
-					   struct user_namespace *ns, int cap) {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return cap_capable(cred, ns, cap, opts);
 }
 
 static inline int security_quotactl(int cmds, int type, int id,
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..454576743b1b 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
 	int ret;
 
 	rcu_read_lock();
-	ret = security_capable(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
 	int ret;
 
 	rcu_read_lock();
-	ret = security_capable_noaudit(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
 	return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
-static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+static bool ns_capable_common(struct user_namespace *ns,
+			      int cap,
+			      unsigned int opts)
 {
 	int capable;
 
@@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 		BUG();
 	}
 
-	capable = audit ? security_capable(current_cred(), ns, cap) :
-			  security_capable_noaudit(current_cred(), ns, cap);
+	capable = security_capable(current_cred(), ns, cap, opts);
 	if (capable == 0) {
 		current->flags |= PF_SUPERPRIV;
 		return true;
@@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, true);
+	return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
 }
 EXPORT_SYMBOL(ns_capable);
 
@@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
  */
 bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, false);
+	return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
@@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
 bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 		     int cap)
 {
+
 	if (WARN_ON_ONCE(!cap_valid(cap)))
 		return false;
 
-	if (security_capable(file->f_cred, ns, cap) == 0)
+	if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
 		return true;
 
 	return false;
@@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 {
 	int ret = 0;  /* An absent tracer adds no restrictions */
 	const struct cred *cred;
+
 	rcu_read_lock();
 	cred = rcu_dereference(tsk->ptracer_cred);
 	if (cred)
-		ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
+		ret = security_capable(cred, ns, CAP_SYS_PTRACE,
+				       SECURITY_CAP_NOAUDIT);
 	rcu_read_unlock();
 	return (ret == 0);
 }
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f2ae2324c232..ddf615eb1bf7 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	 * behavior of privileged children.
 	 */
 	if (!task_no_new_privs(current) &&
-	    security_capable_noaudit(current_cred(), current_user_ns(),
-				     CAP_SYS_ADMIN) != 0)
+	    security_capable(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
 		return ERR_PTR(-EACCES);
 
 	/* Allocate a new seccomp_filter */
diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 253ef6e9d445..0f6dca54b66e 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
  * profile_capable - test if profile allows use of capability @cap
  * @profile: profile being enforced    (NOT NULL, NOT unconfined)
  * @cap: capability to test if allowed
- * @audit: whether an audit record should be generated
+ * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
  * @sa: audit data (MAY BE NULL indicating no auditing)
  *
  * Returns: 0 if allowed else -EPERM
  */
-static int profile_capable(struct aa_profile *profile, int cap, int audit,
-			   struct common_audit_data *sa)
+static int profile_capable(struct aa_profile *profile, int cap,
+			   unsigned int opts, struct common_audit_data *sa)
 {
 	int error;
 
@@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
 	else
 		error = -EPERM;
 
-	if (audit == SECURITY_CAP_NOAUDIT) {
+	if (opts & SECURITY_CAP_NOAUDIT) {
 		if (!COMPLAIN_MODE(profile))
 			return error;
 		/* audit the cap request in complain mode but note that it
@@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
  * aa_capable - test permission to use capability
  * @label: label being tested for capability (NOT NULL)
  * @cap: capability to be tested
- * @audit: whether an audit record should be generated
+ * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
  *
  * Look up capability in profile capability set.
  *
  * Returns: 0 on success, or else an error code.
  */
-int aa_capable(struct aa_label *label, int cap, int audit)
+int aa_capable(struct aa_label *label, int cap, unsigned int opts)
 {
 	struct aa_profile *profile;
 	int error = 0;
@@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
 
 	sa.u.cap = cap;
 	error = fn_for_each_confined(label, profile,
-			profile_capable(profile, cap, audit, &sa));
+			profile_capable(profile, cap, opts, &sa));
 
 	return error;
 }
diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
index e0304e2aeb7f..1b3663b6ab12 100644
--- a/security/apparmor/include/capability.h
+++ b/security/apparmor/include/capability.h
@@ -40,7 +40,7 @@ struct aa_caps {
 
 extern struct aa_sfs_entry aa_sfs_entry_caps[];
 
-int aa_capable(struct aa_label *label, int cap, int audit);
+int aa_capable(struct aa_label *label, int cap, unsigned int opts);
 
 static inline void aa_free_cap_rules(struct aa_caps *caps)
 {
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index 527ea1557120..4a1da2313162 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
 	aad(sa)->label = &tracer->label;
 	aad(sa)->peer = tracee;
 	aad(sa)->request = 0;
-	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
+	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
+				    SECURITY_CAP_DEFAULT);
 
 	return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
 }
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 42446a216f3b..0bd817084fc1 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
 }
 
 static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
-			    int cap, int audit)
+			    int cap, unsigned int opts)
 {
 	struct aa_label *label;
 	int error = 0;
 
 	label = aa_get_newest_cred_label(cred);
 	if (!unconfined(label))
-		error = aa_capable(label, cap, audit);
+		error = aa_capable(label, cap, opts);
 	aa_put_label(label);
 
 	return error;
diff --git a/security/commoncap.c b/security/commoncap.c
index 232db019f051..3d8609192e17 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
  * kernel's capable() and has_capability() returns 1 for this case.
  */
 int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
-		int cap, int audit)
+		int cap, unsigned int opts)
 {
 	struct user_namespace *ns = targ_ns;
 
@@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
  */
 static inline int cap_inh_is_capped(void)
 {
-
 	/* they are so limited unless the current task has the CAP_SETPCAP
 	 * capability
 	 */
 	if (cap_capable(current_cred(), current_cred()->user_ns,
-			CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
+			CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
 		return 0;
 	return 1;
 }
@@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 		    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))	/*[2]*/
 		    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))	/*[3]*/
 		    || (cap_capable(current_cred(),
-				    current_cred()->user_ns, CAP_SETPCAP,
-				    SECURITY_CAP_AUDIT) != 0)		/*[4]*/
+				    current_cred()->user_ns,
+				    CAP_SETPCAP,
+				    SECURITY_CAP_DEFAULT) != 0)		/*[4]*/
 			/*
 			 * [1] no changing of bits that are locked
 			 * [2] no unlocking of locks
@@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 {
 	int cap_sys_admin = 0;
 
-	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
-			SECURITY_CAP_NOAUDIT) == 0)
+	if (cap_capable(current_cred(), &init_user_ns,
+				CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
 		cap_sys_admin = 1;
+
 	return cap_sys_admin;
 }
 
@@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
 
 	if (addr < dac_mmap_min_addr) {
 		ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
-				  SECURITY_CAP_AUDIT);
+				  SECURITY_CAP_DEFAULT);
 		/* set PF_SUPERPRIV if it turns out we allow the low mmap */
 		if (ret == 0)
 			current->flags |= PF_SUPERPRIV;
diff --git a/security/security.c b/security/security.c
index d670136dda2c..d2334697797a 100644
--- a/security/security.c
+++ b/security/security.c
@@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
 				effective, inheritable, permitted);
 }
 
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-		     int cap)
+int security_capable(const struct cred *cred,
+		     struct user_namespace *ns,
+		     int cap,
+		     unsigned int opts)
 {
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap)
-{
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return call_int_hook(capable, 0, cred, ns, cap, opts);
 }
 
 int security_quotactl(int cmds, int type, int id, struct super_block *sb)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a67459eb62d5..a4b2e49213de 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
 
 /* Check whether a task is allowed to use a capability. */
 static int cred_has_capability(const struct cred *cred,
-			       int cap, int audit, bool initns)
+			       int cap, unsigned int opts, bool initns)
 {
 	struct common_audit_data ad;
 	struct av_decision avd;
@@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
 
 	rc = avc_has_perm_noaudit(&selinux_state,
 				  sid, sid, sclass, av, 0, &avd);
-	if (audit == SECURITY_CAP_AUDIT) {
+	if (!(opts & SECURITY_CAP_NOAUDIT)) {
 		int rc2 = avc_audit(&selinux_state,
 				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
@@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
  */
 
 static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
-			   int cap, int audit)
+			   int cap, unsigned int opts)
 {
-	return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
+	return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
 }
 
 static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
@@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
 static bool has_cap_mac_admin(bool audit)
 {
 	const struct cred *cred = current_cred();
-	int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
+	unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
 
-	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
+	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
 		return false;
-	if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
+	if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
 		return false;
 	return true;
 }
@@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
 	case KDSKBENT:
 	case KDSKBSENT:
 		error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
-					    SECURITY_CAP_AUDIT, true);
+					    SECURITY_CAP_DEFAULT, true);
 		break;
 
 	/* default case assumes that the command will go
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9a4c0ad46518..fac2a21aa7d4 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
 	struct smack_known_list_elem *sklep;
 	int rc;
 
-	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
+	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
 	if (rc)
 		return false;
 
-- 
2.20.0.405.gbc1bbc6f85-goog
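
For readers following the diff, the option encoding introduced by this patch can be exercised outside the kernel. The following is a minimal userspace sketch, not kernel code: the three flag values are copied from the include/linux/security.h hunk, while cap_opts_should_audit(), cap_opts_in_setid() and cap_opts_selfcheck() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in the include/linux/security.h hunk. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/*
 * Hypothetical helper: mirrors the `!(opts & SECURITY_CAP_NOAUDIT)` test
 * that the patch uses in SELinux's cred_has_capability().
 */
static inline bool cap_opts_should_audit(unsigned int opts)
{
	return !(opts & SECURITY_CAP_NOAUDIT);
}

/* Hypothetical helper: true when the caller set the setid bit. */
static inline bool cap_opts_in_setid(unsigned int opts)
{
	return (opts & SECURITY_CAP_INSETID) != 0;
}

/* Hypothetical self-check: the two bits are independent of each other. */
static inline void cap_opts_selfcheck(void)
{
	unsigned int opts = SECURITY_CAP_NOAUDIT | SECURITY_CAP_INSETID;

	assert(!cap_opts_should_audit(opts));
	assert(cap_opts_in_setid(opts));
	assert(cap_opts_should_audit(SECURITY_CAP_INSETID));
}
```

Note that the polarity flips relative to the old constants: auditing used to be requested with SECURITY_CAP_AUDIT (1) and is now the default when no bit is set. Left unchanged, the literal 1 passed to aa_capable() in security/apparmor/ipc.c would read as SECURITY_CAP_NOAUDIT under the new encoding, so the patch replaces it with SECURITY_CAP_DEFAULT.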



* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2018-12-18 22:37         ` [PATCH v2] " mortonm
@ 2019-01-07 17:55           ` Micah Morton
  2019-01-07 18:16             ` Casey Schaufler
  2019-01-07 23:13           ` [PATCH v2] " Kees Cook
  1 sibling, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-07 17:55 UTC (permalink / raw)
  To: jmorris, serge, Kees Cook, casey, sds, linux-security-module

Checking in to see if there are any further comments on this patch now
that the holidays have passed. It seems like a straightforward change
to me, but let me know if there is anything I can clarify that isn't
explained by the commit message.

On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This patch provides a general mechanism for passing flags to the
> security_capable LSM hook. It replaces the specific 'audit' flag that is
> used to tell security_capable whether it should log an audit message for
> the given capability check. The reason for generalizing this flag
> passing is so we can add an additional flag that signifies whether
> security_capable is being called by a setid syscall (which is needed by
> the proposed SafeSetID LSM).
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> Changes since the last patch: Changed the code to use a bitmask instead
> of a struct to represent the options passed to security_capable.
>
>  include/linux/lsm_hooks.h              |  8 +++++---
>  include/linux/security.h               | 28 +++++++++++++-------------
>  kernel/capability.c                    | 22 +++++++++++---------
>  kernel/seccomp.c                       |  4 ++--
>  security/apparmor/capability.c         | 14 ++++++-------
>  security/apparmor/include/capability.h |  2 +-
>  security/apparmor/ipc.c                |  3 ++-
>  security/apparmor/lsm.c                |  4 ++--
>  security/commoncap.c                   | 17 ++++++++--------
>  security/security.c                    | 14 +++++--------
>  security/selinux/hooks.c               | 16 +++++++--------
>  security/smack/smack_access.c          |  2 +-
>  12 files changed, 69 insertions(+), 65 deletions(-)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index aaeb7fa24dc4..ef955a44a782 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1270,7 +1270,7 @@
>   *     @cred contains the credentials to use.
>   *     @ns contains the user namespace we want the capability in
>   *     @cap contains the capability <include/linux/capability.h>.
> - *     @audit contains whether to write an audit message or not
> + *     @opts contains options for the capable check <include/linux/security.h>
>   *     Return 0 if the capability is granted for @tsk.
>   * @syslog:
>   *     Check permission before accessing the kernel message ring or changing
> @@ -1446,8 +1446,10 @@ union security_list_options {
>                         const kernel_cap_t *effective,
>                         const kernel_cap_t *inheritable,
>                         const kernel_cap_t *permitted);
> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> -                       int cap, int audit);
> +       int (*capable)(const struct cred *cred,
> +                       struct user_namespace *ns,
> +                       int cap,
> +                       unsigned int opts);
>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>         int (*quota_on)(struct dentry *dentry);
>         int (*syslog)(int type);
> diff --git a/include/linux/security.h b/include/linux/security.h
> index d170a5b031f3..038e6779948c 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -54,9 +54,12 @@ struct xattr;
>  struct xfrm_sec_ctx;
>  struct mm_struct;
>
> +/* Default (no) options for the capable function */
> +#define SECURITY_CAP_DEFAULT 0x0
>  /* If capable should audit the security request */
> -#define SECURITY_CAP_NOAUDIT 0
> -#define SECURITY_CAP_AUDIT 1
> +#define SECURITY_CAP_NOAUDIT 0x01
> +/* If capable is being called by a setid function */
> +#define SECURITY_CAP_INSETID 0x02
>
>  /* LSM Agnostic defines for sb_set_mnt_opts */
>  #define SECURITY_LSM_NATIVE_LABELS     1
> @@ -72,7 +75,7 @@ enum lsm_event {
>
>  /* These functions are in security/commoncap.c */
>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> -                      int cap, int audit);
> +                      int cap, unsigned int opts);
>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  extern int cap_ptrace_traceme(struct task_struct *parent);
> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
>                     const kernel_cap_t *effective,
>                     const kernel_cap_t *inheritable,
>                     const kernel_cap_t *permitted);
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                       int cap);
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap);
> +int security_capable(const struct cred *cred,
> +                      struct user_namespace *ns,
> +                      int cap,
> +                      unsigned int opts);
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>  int security_quota_on(struct dentry *dentry);
>  int security_syslog(int type);
> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
>  }
>
>  static inline int security_capable(const struct cred *cred,
> -                                  struct user_namespace *ns, int cap)
> +                                  struct user_namespace *ns,
> +                                  int cap,
> +                                  unsigned int opts)
>  {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -static inline int security_capable_noaudit(const struct cred *cred,
> -                                          struct user_namespace *ns, int cap) {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return cap_capable(cred, ns, cap, opts);
>  }
>
>  static inline int security_quotactl(int cmds, int type, int id,
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 1e1c0236f55b..454576743b1b 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>  }
>
> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> +static bool ns_capable_common(struct user_namespace *ns,
> +                             int cap,
> +                             unsigned int opts)
>  {
>         int capable;
>
> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>                 BUG();
>         }
>
> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> -                         security_capable_noaudit(current_cred(), ns, cap);
> +       capable = security_capable(current_cred(), ns, cap, opts);
>         if (capable == 0) {
>                 current->flags |= PF_SUPERPRIV;
>                 return true;
> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>   */
>  bool ns_capable(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, true);
> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
>  }
>  EXPORT_SYMBOL(ns_capable);
>
> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
>   */
>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, false);
> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>                      int cap)
>  {
> +
>         if (WARN_ON_ONCE(!cap_valid(cap)))
>                 return false;
>
> -       if (security_capable(file->f_cred, ns, cap) == 0)
> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
>                 return true;
>
>         return false;
> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>  {
>         int ret = 0;  /* An absent tracer adds no restrictions */
>         const struct cred *cred;
> +
>         rcu_read_lock();
>         cred = rcu_dereference(tsk->ptracer_cred);
>         if (cred)
> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> +                                      SECURITY_CAP_NOAUDIT);
>         rcu_read_unlock();
>         return (ret == 0);
>  }
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f2ae2324c232..ddf615eb1bf7 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>          * behavior of privileged children.
>          */
>         if (!task_no_new_privs(current) &&
> -           security_capable_noaudit(current_cred(), current_user_ns(),
> -                                    CAP_SYS_ADMIN) != 0)
> +           security_capable(current_cred(), current_user_ns(),
> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
>                 return ERR_PTR(-EACCES);
>
>         /* Allocate a new seccomp_filter */
> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> index 253ef6e9d445..0f6dca54b66e 100644
> --- a/security/apparmor/capability.c
> +++ b/security/apparmor/capability.c
> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
>   * profile_capable - test if profile allows use of capability @cap
>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
>   * @cap: capability to test if allowed
> - * @audit: whether an audit record should be generated
> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>   * @sa: audit data (MAY BE NULL indicating no auditing)
>   *
>   * Returns: 0 if allowed else -EPERM
>   */
> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> -                          struct common_audit_data *sa)
> +static int profile_capable(struct aa_profile *profile, int cap,
> +                          unsigned int opts, struct common_audit_data *sa)
>  {
>         int error;
>
> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>         else
>                 error = -EPERM;
>
> -       if (audit == SECURITY_CAP_NOAUDIT) {
> +       if (opts & SECURITY_CAP_NOAUDIT) {
>                 if (!COMPLAIN_MODE(profile))
>                         return error;
>                 /* audit the cap request in complain mode but note that it
> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>   * aa_capable - test permission to use capability
>   * @label: label being tested for capability (NOT NULL)
>   * @cap: capability to be tested
> - * @audit: whether an audit record should be generated
> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>   *
>   * Look up capability in profile capability set.
>   *
>   * Returns: 0 on success, or else an error code.
>   */
> -int aa_capable(struct aa_label *label, int cap, int audit)
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
>  {
>         struct aa_profile *profile;
>         int error = 0;
> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
>
>         sa.u.cap = cap;
>         error = fn_for_each_confined(label, profile,
> -                       profile_capable(profile, cap, audit, &sa));
> +                       profile_capable(profile, cap, opts, &sa));
>
>         return error;
>  }
> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> index e0304e2aeb7f..1b3663b6ab12 100644
> --- a/security/apparmor/include/capability.h
> +++ b/security/apparmor/include/capability.h
> @@ -40,7 +40,7 @@ struct aa_caps {
>
>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
>
> -int aa_capable(struct aa_label *label, int cap, int audit);
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
>
>  static inline void aa_free_cap_rules(struct aa_caps *caps)
>  {
> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> index 527ea1557120..4a1da2313162 100644
> --- a/security/apparmor/ipc.c
> +++ b/security/apparmor/ipc.c
> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
>         aad(sa)->label = &tracer->label;
>         aad(sa)->peer = tracee;
>         aad(sa)->request = 0;
> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> +                                   SECURITY_CAP_DEFAULT);
>
>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
>  }
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 42446a216f3b..0bd817084fc1 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>  }
>
>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> -                           int cap, int audit)
> +                           int cap, unsigned int opts)
>  {
>         struct aa_label *label;
>         int error = 0;
>
>         label = aa_get_newest_cred_label(cred);
>         if (!unconfined(label))
> -               error = aa_capable(label, cap, audit);
> +               error = aa_capable(label, cap, opts);
>         aa_put_label(label);
>
>         return error;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 232db019f051..3d8609192e17 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>   * kernel's capable() and has_capability() returns 1 for this case.
>   */
>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> -               int cap, int audit)
> +               int cap, unsigned int opts)
>  {
>         struct user_namespace *ns = targ_ns;
>
> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>   */
>  static inline int cap_inh_is_capped(void)
>  {
> -
>         /* they are so limited unless the current task has the CAP_SETPCAP
>          * capability
>          */
>         if (cap_capable(current_cred(), current_cred()->user_ns,
> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
>                 return 0;
>         return 1;
>  }
> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>                     || (cap_capable(current_cred(),
> -                                   current_cred()->user_ns, CAP_SETPCAP,
> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> +                                   current_cred()->user_ns,
> +                                   CAP_SETPCAP,
> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
>                         /*
>                          * [1] no changing of bits that are locked
>                          * [2] no unlocking of locks
> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>  {
>         int cap_sys_admin = 0;
>
> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> -                       SECURITY_CAP_NOAUDIT) == 0)
> +       if (cap_capable(current_cred(), &init_user_ns,
> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
>                 cap_sys_admin = 1;
> +
>         return cap_sys_admin;
>  }
>
> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
>
>         if (addr < dac_mmap_min_addr) {
>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> -                                 SECURITY_CAP_AUDIT);
> +                                 SECURITY_CAP_DEFAULT);
>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>                 if (ret == 0)
>                         current->flags |= PF_SUPERPRIV;
> diff --git a/security/security.c b/security/security.c
> index d670136dda2c..d2334697797a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>                                 effective, inheritable, permitted);
>  }
>
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                    int cap)
> +int security_capable(const struct cred *cred,
> +                    struct user_namespace *ns,
> +                    int cap,
> +                    unsigned int opts)
>  {
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap)
> -{
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>  }
>
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index a67459eb62d5..a4b2e49213de 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
>
>  /* Check whether a task is allowed to use a capability. */
>  static int cred_has_capability(const struct cred *cred,
> -                              int cap, int audit, bool initns)
> +                              int cap, unsigned int opts, bool initns)
>  {
>         struct common_audit_data ad;
>         struct av_decision avd;
> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
>
>         rc = avc_has_perm_noaudit(&selinux_state,
>                                   sid, sid, sclass, av, 0, &avd);
> -       if (audit == SECURITY_CAP_AUDIT) {
> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
>                 int rc2 = avc_audit(&selinux_state,
>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
>                 if (rc2)
> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>   */
>
>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> -                          int cap, int audit)
> +                          int cap, unsigned int opts)
>  {
> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
>  }
>
>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
>  static bool has_cap_mac_admin(bool audit)
>  {
>         const struct cred *cred = current_cred();
> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
>
> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
>                 return false;
> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
>                 return false;
>         return true;
>  }
> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
>         case KDSKBENT:
>         case KDSKBSENT:
>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> -                                           SECURITY_CAP_AUDIT, true);
> +                                           SECURITY_CAP_DEFAULT, true);
>                 break;
>
>         /* default case assumes that the command will go
> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index 9a4c0ad46518..fac2a21aa7d4 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>         struct smack_known_list_elem *sklep;
>         int rc;
>
> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
>         if (rc)
>                 return false;
>
> --
> 2.20.0.405.gbc1bbc6f85-goog
>


* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2019-01-07 17:55           ` Micah Morton
@ 2019-01-07 18:16             ` Casey Schaufler
  2019-01-07 18:36               ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2019-01-07 18:16 UTC (permalink / raw)
  To: Micah Morton, jmorris, serge, Kees Cook, sds, linux-security-module

On 1/7/2019 9:55 AM, Micah Morton wrote:
> Checking in to see if there are any further comments on this patch now
> that the holidays have passed. It seems like a straightforward change
> to me, but let me know if there is anything I can clarify that isn't
> explained by the commit message.
>
> On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
>> From: Micah Morton <mortonm@chromium.org>
>>
>> This patch provides a general mechanism for passing flags to the
>> security_capable LSM hook. It replaces the specific 'audit' flag that is
>> used to tell security_capable whether it should log an audit message for
>> the given capability check. The reason for generalizing this flag
>> passing is so we can add an additional flag that signifies whether
>> security_capable is being called by a setid syscall (which is needed by
>> the proposed SafeSetID LSM).
>>
>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>> ---
>> Changes since the last patch: Changed the code to use a bitmask instead
>> of a struct to represent the options passed to security_capable.
>>
>>  include/linux/lsm_hooks.h              |  8 +++++---
>>  include/linux/security.h               | 28 +++++++++++++-------------
>>  kernel/capability.c                    | 22 +++++++++++---------
>>  kernel/seccomp.c                       |  4 ++--
>>  security/apparmor/capability.c         | 14 ++++++-------
>>  security/apparmor/include/capability.h |  2 +-
>>  security/apparmor/ipc.c                |  3 ++-
>>  security/apparmor/lsm.c                |  4 ++--
>>  security/commoncap.c                   | 17 ++++++++--------
>>  security/security.c                    | 14 +++++--------
>>  security/selinux/hooks.c               | 16 +++++++--------
>>  security/smack/smack_access.c          |  2 +-
>>  12 files changed, 69 insertions(+), 65 deletions(-)
>>
>> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
>> index aaeb7fa24dc4..ef955a44a782 100644
>> --- a/include/linux/lsm_hooks.h
>> +++ b/include/linux/lsm_hooks.h
>> @@ -1270,7 +1270,7 @@
>>   *     @cred contains the credentials to use.
>>   *     @ns contains the user namespace we want the capability in
>>   *     @cap contains the capability <include/linux/capability.h>.
>> - *     @audit contains whether to write an audit message or not
>> + *     @opts contains options for the capable check <include/linux/security.h>
>>   *     Return 0 if the capability is granted for @tsk.
>>   * @syslog:
>>   *     Check permission before accessing the kernel message ring or changing
>> @@ -1446,8 +1446,10 @@ union security_list_options {
>>                         const kernel_cap_t *effective,
>>                         const kernel_cap_t *inheritable,
>>                         const kernel_cap_t *permitted);
>> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
>> -                       int cap, int audit);
>> +       int (*capable)(const struct cred *cred,
>> +                       struct user_namespace *ns,
>> +                       int cap,
>> +                       unsigned int opts);
>>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>>         int (*quota_on)(struct dentry *dentry);
>>         int (*syslog)(int type);
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index d170a5b031f3..038e6779948c 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -54,9 +54,12 @@ struct xattr;
>>  struct xfrm_sec_ctx;
>>  struct mm_struct;
>>
>> +/* Default (no) options for the capable function */
>> +#define SECURITY_CAP_DEFAULT 0x0
>>  /* If capable should audit the security request */
>> -#define SECURITY_CAP_NOAUDIT 0
>> -#define SECURITY_CAP_AUDIT 1
>> +#define SECURITY_CAP_NOAUDIT 0x01
>> +/* If capable is being called by a setid function */
>> +#define SECURITY_CAP_INSETID 0x02
>>
>>  /* LSM Agnostic defines for sb_set_mnt_opts */
>>  #define SECURITY_LSM_NATIVE_LABELS     1
>> @@ -72,7 +75,7 @@ enum lsm_event {
>>
>>  /* These functions are in security/commoncap.c */
>>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
>> -                      int cap, int audit);
>> +                      int cap, unsigned int opts);
>>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>>  extern int cap_ptrace_traceme(struct task_struct *parent);
>> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
>>                     const kernel_cap_t *effective,
>>                     const kernel_cap_t *inheritable,
>>                     const kernel_cap_t *permitted);
>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>> -                       int cap);
>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>> -                            int cap);
>> +int security_capable(const struct cred *cred,
>> +                      struct user_namespace *ns,
>> +                      int cap,
>> +                      unsigned int opts);
>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>>  int security_quota_on(struct dentry *dentry);
>>  int security_syslog(int type);
>> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
>>  }
>>
>>  static inline int security_capable(const struct cred *cred,
>> -                                  struct user_namespace *ns, int cap)
>> +                                  struct user_namespace *ns,
>> +                                  int cap,
>> +                                  unsigned int opts)
>>  {
>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
>> -}
>> -
>> -static inline int security_capable_noaudit(const struct cred *cred,
>> -                                          struct user_namespace *ns, int cap) {
>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
>> +       return cap_capable(cred, ns, cap, opts);
>>  }

Why get rid of security_capable_noaudit()?

>>
>>  static inline int security_quotactl(int cmds, int type, int id,
>> diff --git a/kernel/capability.c b/kernel/capability.c
>> index 1e1c0236f55b..454576743b1b 100644
>> --- a/kernel/capability.c
>> +++ b/kernel/capability.c
>> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
>>         int ret;
>>
>>         rcu_read_lock();
>> -       ret = security_capable(__task_cred(t), ns, cap);
>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
>>         rcu_read_unlock();
>>
>>         return (ret == 0);
>> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>>         int ret;
>>
>>         rcu_read_lock();
>> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
>>         rcu_read_unlock();
>>
>>         return (ret == 0);
>> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>>  }
>>
>> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>> +static bool ns_capable_common(struct user_namespace *ns,
>> +                             int cap,
>> +                             unsigned int opts)
>>  {
>>         int capable;
>>
>> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>                 BUG();
>>         }
>>
>> -       capable = audit ? security_capable(current_cred(), ns, cap) :
>> -                         security_capable_noaudit(current_cred(), ns, cap);
>> +       capable = security_capable(current_cred(), ns, cap, opts);
>>         if (capable == 0) {
>>                 current->flags |= PF_SUPERPRIV;
>>                 return true;
>> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>   */
>>  bool ns_capable(struct user_namespace *ns, int cap)
>>  {
>> -       return ns_capable_common(ns, cap, true);
>> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
>>  }
>>  EXPORT_SYMBOL(ns_capable);
>>
>> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
>>   */
>>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>>  {
>> -       return ns_capable_common(ns, cap, false);
>> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
>>  }
>>  EXPORT_SYMBOL(ns_capable_noaudit);
>>
>> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
>>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>>                      int cap)
>>  {
>> +
>>         if (WARN_ON_ONCE(!cap_valid(cap)))
>>                 return false;
>>
>> -       if (security_capable(file->f_cred, ns, cap) == 0)
>> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
>>                 return true;
>>
>>         return false;
>> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>>  {
>>         int ret = 0;  /* An absent tracer adds no restrictions */
>>         const struct cred *cred;
>> +
>>         rcu_read_lock();
>>         cred = rcu_dereference(tsk->ptracer_cred);
>>         if (cred)
>> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
>> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
>> +                                      SECURITY_CAP_NOAUDIT);
>>         rcu_read_unlock();
>>         return (ret == 0);
>>  }
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index f2ae2324c232..ddf615eb1bf7 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>          * behavior of privileged children.
>>          */
>>         if (!task_no_new_privs(current) &&
>> -           security_capable_noaudit(current_cred(), current_user_ns(),
>> -                                    CAP_SYS_ADMIN) != 0)
>> +           security_capable(current_cred(), current_user_ns(),
>> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
>>                 return ERR_PTR(-EACCES);
>>
>>         /* Allocate a new seccomp_filter */
>> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
>> index 253ef6e9d445..0f6dca54b66e 100644
>> --- a/security/apparmor/capability.c
>> +++ b/security/apparmor/capability.c
>> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
>>   * profile_capable - test if profile allows use of capability @cap
>>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
>>   * @cap: capability to test if allowed
>> - * @audit: whether an audit record should be generated
>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>>   * @sa: audit data (MAY BE NULL indicating no auditing)
>>   *
>>   * Returns: 0 if allowed else -EPERM
>>   */
>> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
>> -                          struct common_audit_data *sa)
>> +static int profile_capable(struct aa_profile *profile, int cap,
>> +                          unsigned int opts, struct common_audit_data *sa)
>>  {
>>         int error;
>>
>> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>>         else
>>                 error = -EPERM;
>>
>> -       if (audit == SECURITY_CAP_NOAUDIT) {
>> +       if (opts & SECURITY_CAP_NOAUDIT) {
>>                 if (!COMPLAIN_MODE(profile))
>>                         return error;
>>                 /* audit the cap request in complain mode but note that it
>> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>>   * aa_capable - test permission to use capability
>>   * @label: label being tested for capability (NOT NULL)
>>   * @cap: capability to be tested
>> - * @audit: whether an audit record should be generated
>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>>   *
>>   * Look up capability in profile capability set.
>>   *
>>   * Returns: 0 on success, or else an error code.
>>   */
>> -int aa_capable(struct aa_label *label, int cap, int audit)
>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
>>  {
>>         struct aa_profile *profile;
>>         int error = 0;
>> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
>>
>>         sa.u.cap = cap;
>>         error = fn_for_each_confined(label, profile,
>> -                       profile_capable(profile, cap, audit, &sa));
>> +                       profile_capable(profile, cap, opts, &sa));
>>
>>         return error;
>>  }
>> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
>> index e0304e2aeb7f..1b3663b6ab12 100644
>> --- a/security/apparmor/include/capability.h
>> +++ b/security/apparmor/include/capability.h
>> @@ -40,7 +40,7 @@ struct aa_caps {
>>
>>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
>>
>> -int aa_capable(struct aa_label *label, int cap, int audit);
>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
>>
>>  static inline void aa_free_cap_rules(struct aa_caps *caps)
>>  {
>> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
>> index 527ea1557120..4a1da2313162 100644
>> --- a/security/apparmor/ipc.c
>> +++ b/security/apparmor/ipc.c
>> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
>>         aad(sa)->label = &tracer->label;
>>         aad(sa)->peer = tracee;
>>         aad(sa)->request = 0;
>> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
>> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
>> +                                   SECURITY_CAP_DEFAULT);
>>
>>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
>>  }
>> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
>> index 42446a216f3b..0bd817084fc1 100644
>> --- a/security/apparmor/lsm.c
>> +++ b/security/apparmor/lsm.c
>> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>>  }
>>
>>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
>> -                           int cap, int audit)
>> +                           int cap, unsigned int opts)
>>  {
>>         struct aa_label *label;
>>         int error = 0;
>>
>>         label = aa_get_newest_cred_label(cred);
>>         if (!unconfined(label))
>> -               error = aa_capable(label, cap, audit);
>> +               error = aa_capable(label, cap, opts);
>>         aa_put_label(label);
>>
>>         return error;
>> diff --git a/security/commoncap.c b/security/commoncap.c
>> index 232db019f051..3d8609192e17 100644
>> --- a/security/commoncap.c
>> +++ b/security/commoncap.c
>> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>>   * kernel's capable() and has_capability() returns 1 for this case.
>>   */
>>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
>> -               int cap, int audit)
>> +               int cap, unsigned int opts)
>>  {
>>         struct user_namespace *ns = targ_ns;
>>
>> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>>   */
>>  static inline int cap_inh_is_capped(void)
>>  {
>> -
>>         /* they are so limited unless the current task has the CAP_SETPCAP
>>          * capability
>>          */
>>         if (cap_capable(current_cred(), current_cred()->user_ns,
>> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
>> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
>>                 return 0;
>>         return 1;
>>  }
>> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>>                     || (cap_capable(current_cred(),
>> -                                   current_cred()->user_ns, CAP_SETPCAP,
>> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
>> +                                   current_cred()->user_ns,
>> +                                   CAP_SETPCAP,
>> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
>>                         /*
>>                          * [1] no changing of bits that are locked
>>                          * [2] no unlocking of locks
>> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>>  {
>>         int cap_sys_admin = 0;
>>
>> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
>> -                       SECURITY_CAP_NOAUDIT) == 0)
>> +       if (cap_capable(current_cred(), &init_user_ns,
>> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
>>                 cap_sys_admin = 1;
>> +
>>         return cap_sys_admin;
>>  }
>>
>> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
>>
>>         if (addr < dac_mmap_min_addr) {
>>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
>> -                                 SECURITY_CAP_AUDIT);
>> +                                 SECURITY_CAP_DEFAULT);
>>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>>                 if (ret == 0)
>>                         current->flags |= PF_SUPERPRIV;
>> diff --git a/security/security.c b/security/security.c
>> index d670136dda2c..d2334697797a 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>>                                 effective, inheritable, permitted);
>>  }
>>
>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>> -                    int cap)
>> +int security_capable(const struct cred *cred,
>> +                    struct user_namespace *ns,
>> +                    int cap,
>> +                    unsigned int opts)
>>  {
>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
>> -}
>> -
>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>> -                            int cap)
>> -{
>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
>> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>>  }
>>
>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
>> index a67459eb62d5..a4b2e49213de 100644
>> --- a/security/selinux/hooks.c
>> +++ b/security/selinux/hooks.c
>> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
>>
>>  /* Check whether a task is allowed to use a capability. */
>>  static int cred_has_capability(const struct cred *cred,
>> -                              int cap, int audit, bool initns)
>> +                              int cap, unsigned int opts, bool initns)
>>  {
>>         struct common_audit_data ad;
>>         struct av_decision avd;
>> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
>>
>>         rc = avc_has_perm_noaudit(&selinux_state,
>>                                   sid, sid, sclass, av, 0, &avd);
>> -       if (audit == SECURITY_CAP_AUDIT) {
>> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
>>                 int rc2 = avc_audit(&selinux_state,
>>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
>>                 if (rc2)
>> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>>   */
>>
>>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
>> -                          int cap, int audit)
>> +                          int cap, unsigned int opts)
>>  {
>> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
>> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
>>  }
>>
>>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
>> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
>>  static bool has_cap_mac_admin(bool audit)
>>  {
>>         const struct cred *cred = current_cred();
>> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
>> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
>>
>> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
>> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
>>                 return false;
>> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
>> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
>>                 return false;
>>         return true;
>>  }
>> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
>>         case KDSKBENT:
>>         case KDSKBSENT:
>>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
>> -                                           SECURITY_CAP_AUDIT, true);
>> +                                           SECURITY_CAP_DEFAULT, true);
>>                 break;
>>
>>         /* default case assumes that the command will go
>> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
>> index 9a4c0ad46518..fac2a21aa7d4 100644
>> --- a/security/smack/smack_access.c
>> +++ b/security/smack/smack_access.c
>> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>>         struct smack_known_list_elem *sklep;
>>         int rc;
>>
>> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
>> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
>>         if (rc)
>>                 return false;
>>
>> --
>> 2.20.0.405.gbc1bbc6f85-goog
>>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2019-01-07 18:16             ` Casey Schaufler
@ 2019-01-07 18:36               ` Micah Morton
  2019-01-07 18:46                 ` Casey Schaufler
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-07 18:36 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: jmorris, serge, Kees Cook, sds, linux-security-module

It seems a bit weird to me to keep security_capable_noaudit and not
add the analogous "security_capable_insetid" function (or other
one-off functions if/when people want to pass new flags to
security_capable). Taking away the function doesn't complicate the
callers in any way I can see, and somewhat cleans up the logic in at
least one case (ns_capable_common in kernel/capability.c) since
callers can just modify the last param in security_capable rather than
calling different functions for audit vs. noaudit. I guess my take is
why keep "security_capable_noaudit" when it is easy to just call
"security_capable" with the SECURITY_CAP_NOAUDIT flag? I have no
strong preference here so I'll do whatever seems best.
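
To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) of how a hook could decode the opts bitmask. The three flag
values are the ones from the patch; demo_capable() and struct demo_result
are hypothetical names used only for illustration, and the literal 7 in
the usage note is assumed to stand in for CAP_SETUID.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined by the patch in include/linux/security.h */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/* Hypothetical result record; not part of the patch. */
struct demo_result {
	bool audited;     /* would an audit record be written? */
	bool from_setid;  /* was the check issued from a setid path? */
};

/*
 * Hypothetical stand-in for an LSM capable hook. It only decodes the
 * opts bitmask; it makes no real permission decision.
 */
static struct demo_result demo_capable(int cap, unsigned int opts)
{
	struct demo_result r;

	assert(cap >= 0);
	/* Auditing is the default; the NOAUDIT bit turns it off. */
	r.audited = !(opts & SECURITY_CAP_NOAUDIT);
	/* Independent bit, so it combines freely with NOAUDIT. */
	r.from_setid = (opts & SECURITY_CAP_INSETID) != 0;
	return r;
}
```

With this shape, demo_capable(7, SECURITY_CAP_NOAUDIT | SECURITY_CAP_INSETID)
reports both "no audit" and "in setid" from a single entry point, whereas
one wrapper function per flag would need a new wrapper for every
combination.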

On Mon, Jan 7, 2019 at 10:16 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 1/7/2019 9:55 AM, Micah Morton wrote:
> > Checking in to see if there are any further comments on this patch now
> > that the holidays have passed? It seems like a straightforward change
> > to me, but let me know if there is anything I can clarify that isn't
> > explained by the commit message.
> >
> > On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
> >> From: Micah Morton <mortonm@chromium.org>
> >>
> >> This patch provides a general mechanism for passing flags to the
> >> security_capable LSM hook. It replaces the specific 'audit' flag that is
> >> used to tell security_capable whether it should log an audit message for
> >> the given capability check. The reason for generalizing this flag
> >> passing is so we can add an additional flag that signifies whether
> >> security_capable is being called by a setid syscall (which is needed by
> >> the proposed SafeSetID LSM).
> >>
> >> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >> ---
> >> Changes since the last patch: Changed the code to use a bitmask instead
> >> of a struct to represent the options passed to security_capable.
> >>
> >>  include/linux/lsm_hooks.h              |  8 +++++---
> >>  include/linux/security.h               | 28 +++++++++++++-------------
> >>  kernel/capability.c                    | 22 +++++++++++---------
> >>  kernel/seccomp.c                       |  4 ++--
> >>  security/apparmor/capability.c         | 14 ++++++-------
> >>  security/apparmor/include/capability.h |  2 +-
> >>  security/apparmor/ipc.c                |  3 ++-
> >>  security/apparmor/lsm.c                |  4 ++--
> >>  security/commoncap.c                   | 17 ++++++++--------
> >>  security/security.c                    | 14 +++++--------
> >>  security/selinux/hooks.c               | 16 +++++++--------
> >>  security/smack/smack_access.c          |  2 +-
> >>  12 files changed, 69 insertions(+), 65 deletions(-)
> >>
> >> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> >> index aaeb7fa24dc4..ef955a44a782 100644
> >> --- a/include/linux/lsm_hooks.h
> >> +++ b/include/linux/lsm_hooks.h
> >> @@ -1270,7 +1270,7 @@
> >>   *     @cred contains the credentials to use.
> >>   *     @ns contains the user namespace we want the capability in
> >>   *     @cap contains the capability <include/linux/capability.h>.
> >> - *     @audit contains whether to write an audit message or not
> >> + *     @opts contains options for the capable check <include/linux/security.h>
> >>   *     Return 0 if the capability is granted for @tsk.
> >>   * @syslog:
> >>   *     Check permission before accessing the kernel message ring or changing
> >> @@ -1446,8 +1446,10 @@ union security_list_options {
> >>                         const kernel_cap_t *effective,
> >>                         const kernel_cap_t *inheritable,
> >>                         const kernel_cap_t *permitted);
> >> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> >> -                       int cap, int audit);
> >> +       int (*capable)(const struct cred *cred,
> >> +                       struct user_namespace *ns,
> >> +                       int cap,
> >> +                       unsigned int opts);
> >>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
> >>         int (*quota_on)(struct dentry *dentry);
> >>         int (*syslog)(int type);
> >> diff --git a/include/linux/security.h b/include/linux/security.h
> >> index d170a5b031f3..038e6779948c 100644
> >> --- a/include/linux/security.h
> >> +++ b/include/linux/security.h
> >> @@ -54,9 +54,12 @@ struct xattr;
> >>  struct xfrm_sec_ctx;
> >>  struct mm_struct;
> >>
> >> +/* Default (no) options for the capable function */
> >> +#define SECURITY_CAP_DEFAULT 0x0
> >>  /* If capable should audit the security request */
> >> -#define SECURITY_CAP_NOAUDIT 0
> >> -#define SECURITY_CAP_AUDIT 1
> >> +#define SECURITY_CAP_NOAUDIT 0x01
> >> +/* If capable is being called by a setid function */
> >> +#define SECURITY_CAP_INSETID 0x02
> >>
> >>  /* LSM Agnostic defines for sb_set_mnt_opts */
> >>  #define SECURITY_LSM_NATIVE_LABELS     1
> >> @@ -72,7 +75,7 @@ enum lsm_event {
> >>
> >>  /* These functions are in security/commoncap.c */
> >>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                      int cap, int audit);
> >> +                      int cap, unsigned int opts);
> >>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
> >>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
> >>  extern int cap_ptrace_traceme(struct task_struct *parent);
> >> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
> >>                     const kernel_cap_t *effective,
> >>                     const kernel_cap_t *inheritable,
> >>                     const kernel_cap_t *permitted);
> >> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                       int cap);
> >> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >> -                            int cap);
> >> +int security_capable(const struct cred *cred,
> >> +                      struct user_namespace *ns,
> >> +                      int cap,
> >> +                      unsigned int opts);
> >>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
> >>  int security_quota_on(struct dentry *dentry);
> >>  int security_syslog(int type);
> >> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
> >>  }
> >>
> >>  static inline int security_capable(const struct cred *cred,
> >> -                                  struct user_namespace *ns, int cap)
> >> +                                  struct user_namespace *ns,
> >> +                                  int cap,
> >> +                                  unsigned int opts)
> >>  {
> >> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> >> -}
> >> -
> >> -static inline int security_capable_noaudit(const struct cred *cred,
> >> -                                          struct user_namespace *ns, int cap) {
> >> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >> +       return cap_capable(cred, ns, cap, opts);
> >>  }
>
> Why get rid of security_capable_noaudit()?
>
> >>
> >>  static inline int security_quotactl(int cmds, int type, int id,
> >> diff --git a/kernel/capability.c b/kernel/capability.c
> >> index 1e1c0236f55b..454576743b1b 100644
> >> --- a/kernel/capability.c
> >> +++ b/kernel/capability.c
> >> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
> >>         int ret;
> >>
> >>         rcu_read_lock();
> >> -       ret = security_capable(__task_cred(t), ns, cap);
> >> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
> >>         rcu_read_unlock();
> >>
> >>         return (ret == 0);
> >> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
> >>         int ret;
> >>
> >>         rcu_read_lock();
> >> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> >> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
> >>         rcu_read_unlock();
> >>
> >>         return (ret == 0);
> >> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
> >>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
> >>  }
> >>
> >> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >> +static bool ns_capable_common(struct user_namespace *ns,
> >> +                             int cap,
> >> +                             unsigned int opts)
> >>  {
> >>         int capable;
> >>
> >> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>                 BUG();
> >>         }
> >>
> >> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> >> -                         security_capable_noaudit(current_cred(), ns, cap);
> >> +       capable = security_capable(current_cred(), ns, cap, opts);
> >>         if (capable == 0) {
> >>                 current->flags |= PF_SUPERPRIV;
> >>                 return true;
> >> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>   */
> >>  bool ns_capable(struct user_namespace *ns, int cap)
> >>  {
> >> -       return ns_capable_common(ns, cap, true);
> >> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
> >>  }
> >>  EXPORT_SYMBOL(ns_capable);
> >>
> >> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
> >>   */
> >>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> >>  {
> >> -       return ns_capable_common(ns, cap, false);
> >> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
> >>  }
> >>  EXPORT_SYMBOL(ns_capable_noaudit);
> >>
> >> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
> >>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> >>                      int cap)
> >>  {
> >> +
> >>         if (WARN_ON_ONCE(!cap_valid(cap)))
> >>                 return false;
> >>
> >> -       if (security_capable(file->f_cred, ns, cap) == 0)
> >> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
> >>                 return true;
> >>
> >>         return false;
> >> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
> >>  {
> >>         int ret = 0;  /* An absent tracer adds no restrictions */
> >>         const struct cred *cred;
> >> +
> >>         rcu_read_lock();
> >>         cred = rcu_dereference(tsk->ptracer_cred);
> >>         if (cred)
> >> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> >> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> >> +                                      SECURITY_CAP_NOAUDIT);
> >>         rcu_read_unlock();
> >>         return (ret == 0);
> >>  }
> >> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >> index f2ae2324c232..ddf615eb1bf7 100644
> >> --- a/kernel/seccomp.c
> >> +++ b/kernel/seccomp.c
> >> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >>          * behavior of privileged children.
> >>          */
> >>         if (!task_no_new_privs(current) &&
> >> -           security_capable_noaudit(current_cred(), current_user_ns(),
> >> -                                    CAP_SYS_ADMIN) != 0)
> >> +           security_capable(current_cred(), current_user_ns(),
> >> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
> >>                 return ERR_PTR(-EACCES);
> >>
> >>         /* Allocate a new seccomp_filter */
> >> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> >> index 253ef6e9d445..0f6dca54b66e 100644
> >> --- a/security/apparmor/capability.c
> >> +++ b/security/apparmor/capability.c
> >> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
> >>   * profile_capable - test if profile allows use of capability @cap
> >>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
> >>   * @cap: capability to test if allowed
> >> - * @audit: whether an audit record should be generated
> >> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >>   * @sa: audit data (MAY BE NULL indicating no auditing)
> >>   *
> >>   * Returns: 0 if allowed else -EPERM
> >>   */
> >> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >> -                          struct common_audit_data *sa)
> >> +static int profile_capable(struct aa_profile *profile, int cap,
> >> +                          unsigned int opts, struct common_audit_data *sa)
> >>  {
> >>         int error;
> >>
> >> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >>         else
> >>                 error = -EPERM;
> >>
> >> -       if (audit == SECURITY_CAP_NOAUDIT) {
> >> +       if (opts & SECURITY_CAP_NOAUDIT) {
> >>                 if (!COMPLAIN_MODE(profile))
> >>                         return error;
> >>                 /* audit the cap request in complain mode but note that it
> >> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >>   * aa_capable - test permission to use capability
> >>   * @label: label being tested for capability (NOT NULL)
> >>   * @cap: capability to be tested
> >> - * @audit: whether an audit record should be generated
> >> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >>   *
> >>   * Look up capability in profile capability set.
> >>   *
> >>   * Returns: 0 on success, or else an error code.
> >>   */
> >> -int aa_capable(struct aa_label *label, int cap, int audit)
> >> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
> >>  {
> >>         struct aa_profile *profile;
> >>         int error = 0;
> >> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
> >>
> >>         sa.u.cap = cap;
> >>         error = fn_for_each_confined(label, profile,
> >> -                       profile_capable(profile, cap, audit, &sa));
> >> +                       profile_capable(profile, cap, opts, &sa));
> >>
> >>         return error;
> >>  }
> >> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> >> index e0304e2aeb7f..1b3663b6ab12 100644
> >> --- a/security/apparmor/include/capability.h
> >> +++ b/security/apparmor/include/capability.h
> >> @@ -40,7 +40,7 @@ struct aa_caps {
> >>
> >>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
> >>
> >> -int aa_capable(struct aa_label *label, int cap, int audit);
> >> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
> >>
> >>  static inline void aa_free_cap_rules(struct aa_caps *caps)
> >>  {
> >> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> >> index 527ea1557120..4a1da2313162 100644
> >> --- a/security/apparmor/ipc.c
> >> +++ b/security/apparmor/ipc.c
> >> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
> >>         aad(sa)->label = &tracer->label;
> >>         aad(sa)->peer = tracee;
> >>         aad(sa)->request = 0;
> >> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> >> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> >> +                                   SECURITY_CAP_DEFAULT);
> >>
> >>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
> >>  }
> >> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> >> index 42446a216f3b..0bd817084fc1 100644
> >> --- a/security/apparmor/lsm.c
> >> +++ b/security/apparmor/lsm.c
> >> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
> >>  }
> >>
> >>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                           int cap, int audit)
> >> +                           int cap, unsigned int opts)
> >>  {
> >>         struct aa_label *label;
> >>         int error = 0;
> >>
> >>         label = aa_get_newest_cred_label(cred);
> >>         if (!unconfined(label))
> >> -               error = aa_capable(label, cap, audit);
> >> +               error = aa_capable(label, cap, opts);
> >>         aa_put_label(label);
> >>
> >>         return error;
> >> diff --git a/security/commoncap.c b/security/commoncap.c
> >> index 232db019f051..3d8609192e17 100644
> >> --- a/security/commoncap.c
> >> +++ b/security/commoncap.c
> >> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
> >>   * kernel's capable() and has_capability() returns 1 for this case.
> >>   */
> >>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> >> -               int cap, int audit)
> >> +               int cap, unsigned int opts)
> >>  {
> >>         struct user_namespace *ns = targ_ns;
> >>
> >> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
> >>   */
> >>  static inline int cap_inh_is_capped(void)
> >>  {
> >> -
> >>         /* they are so limited unless the current task has the CAP_SETPCAP
> >>          * capability
> >>          */
> >>         if (cap_capable(current_cred(), current_cred()->user_ns,
> >> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> >> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
> >>                 return 0;
> >>         return 1;
> >>  }
> >> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
> >>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
> >>                     || (cap_capable(current_cred(),
> >> -                                   current_cred()->user_ns, CAP_SETPCAP,
> >> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> >> +                                   current_cred()->user_ns,
> >> +                                   CAP_SETPCAP,
> >> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
> >>                         /*
> >>                          * [1] no changing of bits that are locked
> >>                          * [2] no unlocking of locks
> >> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >>  {
> >>         int cap_sys_admin = 0;
> >>
> >> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> >> -                       SECURITY_CAP_NOAUDIT) == 0)
> >> +       if (cap_capable(current_cred(), &init_user_ns,
> >> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
> >>                 cap_sys_admin = 1;
> >> +
> >>         return cap_sys_admin;
> >>  }
> >>
> >> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
> >>
> >>         if (addr < dac_mmap_min_addr) {
> >>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> >> -                                 SECURITY_CAP_AUDIT);
> >> +                                 SECURITY_CAP_DEFAULT);
> >>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
> >>                 if (ret == 0)
> >>                         current->flags |= PF_SUPERPRIV;
> >> diff --git a/security/security.c b/security/security.c
> >> index d670136dda2c..d2334697797a 100644
> >> --- a/security/security.c
> >> +++ b/security/security.c
> >> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
> >>                                 effective, inheritable, permitted);
> >>  }
> >>
> >> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                    int cap)
> >> +int security_capable(const struct cred *cred,
> >> +                    struct user_namespace *ns,
> >> +                    int cap,
> >> +                    unsigned int opts)
> >>  {
> >> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> >> -}
> >> -
> >> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >> -                            int cap)
> >> -{
> >> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
> >>  }
> >>
> >>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> >> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> >> index a67459eb62d5..a4b2e49213de 100644
> >> --- a/security/selinux/hooks.c
> >> +++ b/security/selinux/hooks.c
> >> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
> >>
> >>  /* Check whether a task is allowed to use a capability. */
> >>  static int cred_has_capability(const struct cred *cred,
> >> -                              int cap, int audit, bool initns)
> >> +                              int cap, unsigned int opts, bool initns)
> >>  {
> >>         struct common_audit_data ad;
> >>         struct av_decision avd;
> >> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
> >>
> >>         rc = avc_has_perm_noaudit(&selinux_state,
> >>                                   sid, sid, sclass, av, 0, &avd);
> >> -       if (audit == SECURITY_CAP_AUDIT) {
> >> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
> >>                 int rc2 = avc_audit(&selinux_state,
> >>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
> >>                 if (rc2)
> >> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
> >>   */
> >>
> >>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> >> -                          int cap, int audit)
> >> +                          int cap, unsigned int opts)
> >>  {
> >> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> >> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
> >>  }
> >>
> >>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> >> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
> >>  static bool has_cap_mac_admin(bool audit)
> >>  {
> >>         const struct cred *cred = current_cred();
> >> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> >> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
> >>
> >> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> >> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
> >>                 return false;
> >> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> >> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
> >>                 return false;
> >>         return true;
> >>  }
> >> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
> >>         case KDSKBENT:
> >>         case KDSKBSENT:
> >>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> >> -                                           SECURITY_CAP_AUDIT, true);
> >> +                                           SECURITY_CAP_DEFAULT, true);
> >>                 break;
> >>
> >>         /* default case assumes that the command will go
> >> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> >> index 9a4c0ad46518..fac2a21aa7d4 100644
> >> --- a/security/smack/smack_access.c
> >> +++ b/security/smack/smack_access.c
> >> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
> >>         struct smack_known_list_elem *sklep;
> >>         int rc;
> >>
> >> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> >> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
> >>         if (rc)
> >>                 return false;
> >>
> >> --
> >> 2.20.0.405.gbc1bbc6f85-goog
> >>
>

* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2019-01-07 18:36               ` Micah Morton
@ 2019-01-07 18:46                 ` Casey Schaufler
  2019-01-07 19:02                   ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2019-01-07 18:46 UTC (permalink / raw)
  To: Micah Morton; +Cc: jmorris, serge, Kees Cook, sds, linux-security-module

On 1/7/2019 10:36 AM, Micah Morton wrote:
> It seems a bit weird to me to keep security_capable_noaudit and not
> add the analogous "security_capable_insetid" function (or other
> one-off functions if/when people want to pass new flags to
> security_capable). Taking away the function doesn't complicate the
> callers in any way I can see, and somewhat cleans up the logic in at
> least one case (ns_capable_common in kernel/capability.c) since
> callers can just modify the last param in security_capable rather than
> calling different functions for audit vs. noaudit. I guess my take is
> why keep "security_capable_noaudit" when it is easy to just call
> "security_capable" with the SECURITY_CAP_NOAUDIT flag? I have no
> strong preference here so I'll do whatever seems best.

My only reason to suggest keeping the function is to reduce
code churn. I would think that whoever introduced the noaudit
version had a reason to do that. It probably isn't a big deal.
I don't have a lot of energy on the issue, but it would make
your patch a bit smaller, and impact a lot fewer files.
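
For reference, the two options being weighed can be sketched in
standalone C. This is only an illustration under stated assumptions,
not kernel code: the flag values are copied from the
include/linux/security.h hunk of the patch, while every demo_* name
and the audit counter are hypothetical stand-ins made up for the
example.

```c
/* Flag values as defined in the patch's include/linux/security.h hunk. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/* Hypothetical stand-in for the audit log: counts records written. */
static int demo_audit_records;

/* Hypothetical stand-in for an LSM capable hook; 0 means "granted". */
static int demo_capable_hook(int cap, unsigned int opts)
{
	(void)cap;
	/*
	 * Test the bit rather than comparing for equality, so that
	 * setting another flag (e.g. INSETID) does not change the
	 * audit decision.
	 */
	if (!(opts & SECURITY_CAP_NOAUDIT))
		demo_audit_records++;
	return 0;
}

/* Single entry point taking a flag mask, as the patch proposes. */
static int demo_security_capable(int cap, unsigned int opts)
{
	return demo_capable_hook(cap, opts);
}

/*
 * Thin wrapper (assumption: one way the noaudit variant could be
 * kept so that existing callers need no changes).
 */
static int demo_security_capable_noaudit(int cap)
{
	return demo_security_capable(cap, SECURITY_CAP_NOAUDIT);
}
```

With this shape, a caller like ns_capable_common can pass its mask
straight through, and callers of the noaudit variant would stay
untouched if the wrapper were kept.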

>
> On Mon, Jan 7, 2019 at 10:16 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 1/7/2019 9:55 AM, Micah Morton wrote:
>>> Checking in to see if there are any further comments on this patch now
>>> that the holidays have passed? It seems like a straightforward change
>>> to me, but let me know if there is anything I can clarify that isn't
>>> explained by the commit message.
>>>
>>> On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
>>>> From: Micah Morton <mortonm@chromium.org>
>>>>
>>>> This patch provides a general mechanism for passing flags to the
>>>> security_capable LSM hook. It replaces the specific 'audit' flag that is
>>>> used to tell security_capable whether it should log an audit message for
>>>> the given capability check. The reason for generalizing this flag
>>>> passing is so we can add an additional flag that signifies whether
>>>> security_capable is being called by a setid syscall (which is needed by
>>>> the proposed SafeSetID LSM).
>>>>
>>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
>>>> ---
>>>> Changes since the last patch: Changed the code to use a bitmask instead
>>>> of a struct to represent the options passed to security_capable.
>>>>
>>>>  include/linux/lsm_hooks.h              |  8 +++++---
>>>>  include/linux/security.h               | 28 +++++++++++++-------------
>>>>  kernel/capability.c                    | 22 +++++++++++---------
>>>>  kernel/seccomp.c                       |  4 ++--
>>>>  security/apparmor/capability.c         | 14 ++++++-------
>>>>  security/apparmor/include/capability.h |  2 +-
>>>>  security/apparmor/ipc.c                |  3 ++-
>>>>  security/apparmor/lsm.c                |  4 ++--
>>>>  security/commoncap.c                   | 17 ++++++++--------
>>>>  security/security.c                    | 14 +++++--------
>>>>  security/selinux/hooks.c               | 16 +++++++--------
>>>>  security/smack/smack_access.c          |  2 +-
>>>>  12 files changed, 69 insertions(+), 65 deletions(-)
>>>>
>>>> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
>>>> index aaeb7fa24dc4..ef955a44a782 100644
>>>> --- a/include/linux/lsm_hooks.h
>>>> +++ b/include/linux/lsm_hooks.h
>>>> @@ -1270,7 +1270,7 @@
>>>>   *     @cred contains the credentials to use.
>>>>   *     @ns contains the user namespace we want the capability in
>>>>   *     @cap contains the capability <include/linux/capability.h>.
>>>> - *     @audit contains whether to write an audit message or not
>>>> + *     @opts contains options for the capable check <include/linux/security.h>
>>>>   *     Return 0 if the capability is granted for @tsk.
>>>>   * @syslog:
>>>>   *     Check permission before accessing the kernel message ring or changing
>>>> @@ -1446,8 +1446,10 @@ union security_list_options {
>>>>                         const kernel_cap_t *effective,
>>>>                         const kernel_cap_t *inheritable,
>>>>                         const kernel_cap_t *permitted);
>>>> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
>>>> -                       int cap, int audit);
>>>> +       int (*capable)(const struct cred *cred,
>>>> +                       struct user_namespace *ns,
>>>> +                       int cap,
>>>> +                       unsigned int opts);
>>>>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>>>>         int (*quota_on)(struct dentry *dentry);
>>>>         int (*syslog)(int type);
>>>> diff --git a/include/linux/security.h b/include/linux/security.h
>>>> index d170a5b031f3..038e6779948c 100644
>>>> --- a/include/linux/security.h
>>>> +++ b/include/linux/security.h
>>>> @@ -54,9 +54,12 @@ struct xattr;
>>>>  struct xfrm_sec_ctx;
>>>>  struct mm_struct;
>>>>
>>>> +/* Default (no) options for the capable function */
>>>> +#define SECURITY_CAP_DEFAULT 0x0
>>>>  /* If capable should audit the security request */
>>>> -#define SECURITY_CAP_NOAUDIT 0
>>>> -#define SECURITY_CAP_AUDIT 1
>>>> +#define SECURITY_CAP_NOAUDIT 0x01
>>>> +/* If capable is being called by a setid function */
>>>> +#define SECURITY_CAP_INSETID 0x02
>>>>
>>>>  /* LSM Agnostic defines for sb_set_mnt_opts */
>>>>  #define SECURITY_LSM_NATIVE_LABELS     1
>>>> @@ -72,7 +75,7 @@ enum lsm_event {
>>>>
>>>>  /* These functions are in security/commoncap.c */
>>>>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
>>>> -                      int cap, int audit);
>>>> +                      int cap, unsigned int opts);
>>>>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>>>>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>>>>  extern int cap_ptrace_traceme(struct task_struct *parent);
>>>> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
>>>>                     const kernel_cap_t *effective,
>>>>                     const kernel_cap_t *inheritable,
>>>>                     const kernel_cap_t *permitted);
>>>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>>>> -                       int cap);
>>>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>>>> -                            int cap);
>>>> +int security_capable(const struct cred *cred,
>>>> +                      struct user_namespace *ns,
>>>> +                      int cap,
>>>> +                      unsigned int opts);
>>>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>>>>  int security_quota_on(struct dentry *dentry);
>>>>  int security_syslog(int type);
>>>> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
>>>>  }
>>>>
>>>>  static inline int security_capable(const struct cred *cred,
>>>> -                                  struct user_namespace *ns, int cap)
>>>> +                                  struct user_namespace *ns,
>>>> +                                  int cap,
>>>> +                                  unsigned int opts)
>>>>  {
>>>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
>>>> -}
>>>> -
>>>> -static inline int security_capable_noaudit(const struct cred *cred,
>>>> -                                          struct user_namespace *ns, int cap) {
>>>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
>>>> +       return cap_capable(cred, ns, cap, opts);
>>>>  }
>> Why get rid of security_capable_noaudit()?
>>
>>>>  static inline int security_quotactl(int cmds, int type, int id,
>>>> diff --git a/kernel/capability.c b/kernel/capability.c
>>>> index 1e1c0236f55b..454576743b1b 100644
>>>> --- a/kernel/capability.c
>>>> +++ b/kernel/capability.c
>>>> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
>>>>         int ret;
>>>>
>>>>         rcu_read_lock();
>>>> -       ret = security_capable(__task_cred(t), ns, cap);
>>>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
>>>>         rcu_read_unlock();
>>>>
>>>>         return (ret == 0);
>>>> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>>>>         int ret;
>>>>
>>>>         rcu_read_lock();
>>>> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
>>>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
>>>>         rcu_read_unlock();
>>>>
>>>>         return (ret == 0);
>>>> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>>>>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>>>>  }
>>>>
>>>> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>>> +static bool ns_capable_common(struct user_namespace *ns,
>>>> +                             int cap,
>>>> +                             unsigned int opts)
>>>>  {
>>>>         int capable;
>>>>
>>>> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>>>                 BUG();
>>>>         }
>>>>
>>>> -       capable = audit ? security_capable(current_cred(), ns, cap) :
>>>> -                         security_capable_noaudit(current_cred(), ns, cap);
>>>> +       capable = security_capable(current_cred(), ns, cap, opts);
>>>>         if (capable == 0) {
>>>>                 current->flags |= PF_SUPERPRIV;
>>>>                 return true;
>>>> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>>>>   */
>>>>  bool ns_capable(struct user_namespace *ns, int cap)
>>>>  {
>>>> -       return ns_capable_common(ns, cap, true);
>>>> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
>>>>  }
>>>>  EXPORT_SYMBOL(ns_capable);
>>>>
>>>> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
>>>>   */
>>>>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>>>>  {
>>>> -       return ns_capable_common(ns, cap, false);
>>>> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
>>>>  }
>>>>  EXPORT_SYMBOL(ns_capable_noaudit);
>>>>
>>>> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
>>>>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>>>>                      int cap)
>>>>  {
>>>> +
>>>>         if (WARN_ON_ONCE(!cap_valid(cap)))
>>>>                 return false;
>>>>
>>>> -       if (security_capable(file->f_cred, ns, cap) == 0)
>>>> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
>>>>                 return true;
>>>>
>>>>         return false;
>>>> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>>>>  {
>>>>         int ret = 0;  /* An absent tracer adds no restrictions */
>>>>         const struct cred *cred;
>>>> +
>>>>         rcu_read_lock();
>>>>         cred = rcu_dereference(tsk->ptracer_cred);
>>>>         if (cred)
>>>> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
>>>> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
>>>> +                                      SECURITY_CAP_NOAUDIT);
>>>>         rcu_read_unlock();
>>>>         return (ret == 0);
>>>>  }
>>>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>>> index f2ae2324c232..ddf615eb1bf7 100644
>>>> --- a/kernel/seccomp.c
>>>> +++ b/kernel/seccomp.c
>>>> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>>>          * behavior of privileged children.
>>>>          */
>>>>         if (!task_no_new_privs(current) &&
>>>> -           security_capable_noaudit(current_cred(), current_user_ns(),
>>>> -                                    CAP_SYS_ADMIN) != 0)
>>>> +           security_capable(current_cred(), current_user_ns(),
>>>> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
>>>>                 return ERR_PTR(-EACCES);
>>>>
>>>>         /* Allocate a new seccomp_filter */
>>>> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
>>>> index 253ef6e9d445..0f6dca54b66e 100644
>>>> --- a/security/apparmor/capability.c
>>>> +++ b/security/apparmor/capability.c
>>>> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
>>>>   * profile_capable - test if profile allows use of capability @cap
>>>>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
>>>>   * @cap: capability to test if allowed
>>>> - * @audit: whether an audit record should be generated
>>>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>>>>   * @sa: audit data (MAY BE NULL indicating no auditing)
>>>>   *
>>>>   * Returns: 0 if allowed else -EPERM
>>>>   */
>>>> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
>>>> -                          struct common_audit_data *sa)
>>>> +static int profile_capable(struct aa_profile *profile, int cap,
>>>> +                          unsigned int opts, struct common_audit_data *sa)
>>>>  {
>>>>         int error;
>>>>
>>>> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>>>>         else
>>>>                 error = -EPERM;
>>>>
>>>> -       if (audit == SECURITY_CAP_NOAUDIT) {
>>>> +       if (opts & SECURITY_CAP_NOAUDIT) {
>>>>                 if (!COMPLAIN_MODE(profile))
>>>>                         return error;
>>>>                 /* audit the cap request in complain mode but note that it
>>>> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>>>>   * aa_capable - test permission to use capability
>>>>   * @label: label being tested for capability (NOT NULL)
>>>>   * @cap: capability to be tested
>>>> - * @audit: whether an audit record should be generated
>>>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>>>>   *
>>>>   * Look up capability in profile capability set.
>>>>   *
>>>>   * Returns: 0 on success, or else an error code.
>>>>   */
>>>> -int aa_capable(struct aa_label *label, int cap, int audit)
>>>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
>>>>  {
>>>>         struct aa_profile *profile;
>>>>         int error = 0;
>>>> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
>>>>
>>>>         sa.u.cap = cap;
>>>>         error = fn_for_each_confined(label, profile,
>>>> -                       profile_capable(profile, cap, audit, &sa));
>>>> +                       profile_capable(profile, cap, opts, &sa));
>>>>
>>>>         return error;
>>>>  }
>>>> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
>>>> index e0304e2aeb7f..1b3663b6ab12 100644
>>>> --- a/security/apparmor/include/capability.h
>>>> +++ b/security/apparmor/include/capability.h
>>>> @@ -40,7 +40,7 @@ struct aa_caps {
>>>>
>>>>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
>>>>
>>>> -int aa_capable(struct aa_label *label, int cap, int audit);
>>>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
>>>>
>>>>  static inline void aa_free_cap_rules(struct aa_caps *caps)
>>>>  {
>>>> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
>>>> index 527ea1557120..4a1da2313162 100644
>>>> --- a/security/apparmor/ipc.c
>>>> +++ b/security/apparmor/ipc.c
>>>> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
>>>>         aad(sa)->label = &tracer->label;
>>>>         aad(sa)->peer = tracee;
>>>>         aad(sa)->request = 0;
>>>> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
>>>> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
>>>> +                                   SECURITY_CAP_DEFAULT);
>>>>
>>>>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
>>>>  }
>>>> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
>>>> index 42446a216f3b..0bd817084fc1 100644
>>>> --- a/security/apparmor/lsm.c
>>>> +++ b/security/apparmor/lsm.c
>>>> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>>>>  }
>>>>
>>>>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
>>>> -                           int cap, int audit)
>>>> +                           int cap, unsigned int opts)
>>>>  {
>>>>         struct aa_label *label;
>>>>         int error = 0;
>>>>
>>>>         label = aa_get_newest_cred_label(cred);
>>>>         if (!unconfined(label))
>>>> -               error = aa_capable(label, cap, audit);
>>>> +               error = aa_capable(label, cap, opts);
>>>>         aa_put_label(label);
>>>>
>>>>         return error;
>>>> diff --git a/security/commoncap.c b/security/commoncap.c
>>>> index 232db019f051..3d8609192e17 100644
>>>> --- a/security/commoncap.c
>>>> +++ b/security/commoncap.c
>>>> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>>>>   * kernel's capable() and has_capability() returns 1 for this case.
>>>>   */
>>>>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
>>>> -               int cap, int audit)
>>>> +               int cap, unsigned int opts)
>>>>  {
>>>>         struct user_namespace *ns = targ_ns;
>>>>
>>>> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>>>>   */
>>>>  static inline int cap_inh_is_capped(void)
>>>>  {
>>>> -
>>>>         /* they are so limited unless the current task has the CAP_SETPCAP
>>>>          * capability
>>>>          */
>>>>         if (cap_capable(current_cred(), current_cred()->user_ns,
>>>> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
>>>> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
>>>>                 return 0;
>>>>         return 1;
>>>>  }
>>>> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>>>>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>>>>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>>>>                     || (cap_capable(current_cred(),
>>>> -                                   current_cred()->user_ns, CAP_SETPCAP,
>>>> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
>>>> +                                   current_cred()->user_ns,
>>>> +                                   CAP_SETPCAP,
>>>> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
>>>>                         /*
>>>>                          * [1] no changing of bits that are locked
>>>>                          * [2] no unlocking of locks
>>>> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>>>>  {
>>>>         int cap_sys_admin = 0;
>>>>
>>>> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
>>>> -                       SECURITY_CAP_NOAUDIT) == 0)
>>>> +       if (cap_capable(current_cred(), &init_user_ns,
>>>> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
>>>>                 cap_sys_admin = 1;
>>>> +
>>>>         return cap_sys_admin;
>>>>  }
>>>>
>>>> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
>>>>
>>>>         if (addr < dac_mmap_min_addr) {
>>>>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
>>>> -                                 SECURITY_CAP_AUDIT);
>>>> +                                 SECURITY_CAP_DEFAULT);
>>>>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>>>>                 if (ret == 0)
>>>>                         current->flags |= PF_SUPERPRIV;
>>>> diff --git a/security/security.c b/security/security.c
>>>> index d670136dda2c..d2334697797a 100644
>>>> --- a/security/security.c
>>>> +++ b/security/security.c
>>>> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>>>>                                 effective, inheritable, permitted);
>>>>  }
>>>>
>>>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
>>>> -                    int cap)
>>>> +int security_capable(const struct cred *cred,
>>>> +                    struct user_namespace *ns,
>>>> +                    int cap,
>>>> +                    unsigned int opts)
>>>>  {
>>>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
>>>> -}
>>>> -
>>>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
>>>> -                            int cap)
>>>> -{
>>>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
>>>> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>>>>  }
>>>>
>>>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
>>>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
>>>> index a67459eb62d5..a4b2e49213de 100644
>>>> --- a/security/selinux/hooks.c
>>>> +++ b/security/selinux/hooks.c
>>>> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
>>>>
>>>>  /* Check whether a task is allowed to use a capability. */
>>>>  static int cred_has_capability(const struct cred *cred,
>>>> -                              int cap, int audit, bool initns)
>>>> +                              int cap, unsigned int opts, bool initns)
>>>>  {
>>>>         struct common_audit_data ad;
>>>>         struct av_decision avd;
>>>> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
>>>>
>>>>         rc = avc_has_perm_noaudit(&selinux_state,
>>>>                                   sid, sid, sclass, av, 0, &avd);
>>>> -       if (audit == SECURITY_CAP_AUDIT) {
>>>> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
>>>>                 int rc2 = avc_audit(&selinux_state,
>>>>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
>>>>                 if (rc2)
>>>> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>>>>   */
>>>>
>>>>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
>>>> -                          int cap, int audit)
>>>> +                          int cap, unsigned int opts)
>>>>  {
>>>> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
>>>> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
>>>>  }
>>>>
>>>>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
>>>> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
>>>>  static bool has_cap_mac_admin(bool audit)
>>>>  {
>>>>         const struct cred *cred = current_cred();
>>>> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
>>>> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
>>>>
>>>> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
>>>> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
>>>>                 return false;
>>>> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
>>>> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
>>>>                 return false;
>>>>         return true;
>>>>  }
>>>> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
>>>>         case KDSKBENT:
>>>>         case KDSKBSENT:
>>>>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
>>>> -                                           SECURITY_CAP_AUDIT, true);
>>>> +                                           SECURITY_CAP_DEFAULT, true);
>>>>                 break;
>>>>
>>>>         /* default case assumes that the command will go
>>>> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
>>>> index 9a4c0ad46518..fac2a21aa7d4 100644
>>>> --- a/security/smack/smack_access.c
>>>> +++ b/security/smack/smack_access.c
>>>> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>>>>         struct smack_known_list_elem *sklep;
>>>>         int rc;
>>>>
>>>> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
>>>> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
>>>>         if (rc)
>>>>                 return false;
>>>>
>>>> --
>>>> 2.20.0.405.gbc1bbc6f85-goog
>>>>



* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2019-01-07 18:46                 ` Casey Schaufler
@ 2019-01-07 19:02                   ` Micah Morton
  2019-01-07 22:57                     ` [PATCH v3] " mortonm
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-07 19:02 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: jmorris, serge, Kees Cook, sds, linux-security-module

Looks like kernel/seccomp.c is the only file that would escape
modification if we kept security_capable_noaudit, since the other
files where we modify security_capable_noaudit require changes to
security_capable as well to pass the flag -- so we'll be changing them
anyway.
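
To illustrate why keeping security_capable_noaudit saves so little, here
is a userspace model (not kernel code) in which the retained function
reduces to a one-line forwarder. The names model_security_capable,
model_security_capable_noaudit, has_cap and audit_records are
hypothetical stand-ins; only the SECURITY_CAP_* values come from the v2
patch, and the "always audit unless NOAUDIT is set" behavior is a
simplifying assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Flag values as defined in the v2 patch. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01

/* Hypothetical counter standing in for emitted audit records. */
static int audit_records;

/*
 * Hypothetical model of the generalized hook entry point: the caller
 * chooses audit behavior through the opts bitmask.
 */
static int model_security_capable(bool has_cap, unsigned int opts)
{
	int rc = has_cap ? 0 : -EPERM;

	/* Assumption: a record is emitted unless NOAUDIT is set. */
	if (!(opts & SECURITY_CAP_NOAUDIT))
		audit_records++;

	return rc;
}

/*
 * If the noaudit variant is kept, it only forwards with the flag set,
 * so a call site can pass SECURITY_CAP_NOAUDIT directly instead.
 */
static int model_security_capable_noaudit(bool has_cap)
{
	return model_security_capable(has_cap, SECURITY_CAP_NOAUDIT);
}
```

In this model a caller gets the same result and the same audit behavior
from either spelling, which is the sense in which the wrapper is
optional.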

On Mon, Jan 7, 2019 at 10:46 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 1/7/2019 10:36 AM, Micah Morton wrote:
> > It seems a bit weird to me to keep security_capable_noaudit and not
> > add the analogous "security_capable_insetid" function (or other
> > one-off functions if/when people want to pass new flags to
> > security_capable). Taking away the function doesn't complicate the
> > callers in any way I can see, and somewhat cleans up the logic in at
> > least one case (ns_capable_common in kernel/capability.c) since
> > callers can just modify the last param in security_capable rather than
> > calling different functions for audit vs. noaudit. I guess my take is
> > why keep "security_capable_noaudit" when it is easy to just call
> > "security_capable" with the SECURITY_CAP_NOAUDIT flag? I have no
> > strong preference here so I'll do whatever seems best.
>
> My only reason to suggest keeping the function is to reduce
> code churn. I would think that whoever introduced the noaudit
> version had a reason to do that. It probably isn't a big deal.
> I don't have a lot of energy on the issue, but it would make
> your patch a bit smaller, and impact a lot fewer files.
>
> >
> > On Mon, Jan 7, 2019 at 10:16 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 1/7/2019 9:55 AM, Micah Morton wrote:
> >>> Checking in to see if there are any further comments on this patch now
> >>> that the holidays have passed? It seems like a straightforward change
> >>> to me, but let me know if there is anything I can clarify that isn't
> >>> explained by the commit message.
> >>>
> >>> On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
> >>>> From: Micah Morton <mortonm@chromium.org>
> >>>>
> >>>> This patch provides a general mechanism for passing flags to the
> >>>> security_capable LSM hook. It replaces the specific 'audit' flag that is
> >>>> used to tell security_capable whether it should log an audit message for
> >>>> the given capability check. The reason for generalizing this flag
> >>>> passing is so we can add an additional flag that signifies whether
> >>>> security_capable is being called by a setid syscall (which is needed by
> >>>> the proposed SafeSetID LSM).
> >>>>
> >>>> Signed-off-by: Micah Morton <mortonm@chromium.org>
> >>>> ---
> >>>> Changes since the last patch: Changed the code to use a bitmask instead
> >>>> of a struct to represent the options passed to security_capable.
> >>>>
> >>>>  include/linux/lsm_hooks.h              |  8 +++++---
> >>>>  include/linux/security.h               | 28 +++++++++++++-------------
> >>>>  kernel/capability.c                    | 22 +++++++++++---------
> >>>>  kernel/seccomp.c                       |  4 ++--
> >>>>  security/apparmor/capability.c         | 14 ++++++-------
> >>>>  security/apparmor/include/capability.h |  2 +-
> >>>>  security/apparmor/ipc.c                |  3 ++-
> >>>>  security/apparmor/lsm.c                |  4 ++--
> >>>>  security/commoncap.c                   | 17 ++++++++--------
> >>>>  security/security.c                    | 14 +++++--------
> >>>>  security/selinux/hooks.c               | 16 +++++++--------
> >>>>  security/smack/smack_access.c          |  2 +-
> >>>>  12 files changed, 69 insertions(+), 65 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> >>>> index aaeb7fa24dc4..ef955a44a782 100644
> >>>> --- a/include/linux/lsm_hooks.h
> >>>> +++ b/include/linux/lsm_hooks.h
> >>>> @@ -1270,7 +1270,7 @@
> >>>>   *     @cred contains the credentials to use.
> >>>>   *     @ns contains the user namespace we want the capability in
> >>>>   *     @cap contains the capability <include/linux/capability.h>.
> >>>> - *     @audit contains whether to write an audit message or not
> >>>> + *     @opts contains options for the capable check <include/linux/security.h>
> >>>>   *     Return 0 if the capability is granted for @tsk.
> >>>>   * @syslog:
> >>>>   *     Check permission before accessing the kernel message ring or changing
> >>>> @@ -1446,8 +1446,10 @@ union security_list_options {
> >>>>                         const kernel_cap_t *effective,
> >>>>                         const kernel_cap_t *inheritable,
> >>>>                         const kernel_cap_t *permitted);
> >>>> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> >>>> -                       int cap, int audit);
> >>>> +       int (*capable)(const struct cred *cred,
> >>>> +                       struct user_namespace *ns,
> >>>> +                       int cap,
> >>>> +                       unsigned int opts);
> >>>>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
> >>>>         int (*quota_on)(struct dentry *dentry);
> >>>>         int (*syslog)(int type);
> >>>> diff --git a/include/linux/security.h b/include/linux/security.h
> >>>> index d170a5b031f3..038e6779948c 100644
> >>>> --- a/include/linux/security.h
> >>>> +++ b/include/linux/security.h
> >>>> @@ -54,9 +54,12 @@ struct xattr;
> >>>>  struct xfrm_sec_ctx;
> >>>>  struct mm_struct;
> >>>>
> >>>> +/* Default (no) options for the capable function */
> >>>> +#define SECURITY_CAP_DEFAULT 0x0
> >>>>  /* If capable should audit the security request */
> >>>> -#define SECURITY_CAP_NOAUDIT 0
> >>>> -#define SECURITY_CAP_AUDIT 1
> >>>> +#define SECURITY_CAP_NOAUDIT 0x01
> >>>> +/* If capable is being called by a setid function */
> >>>> +#define SECURITY_CAP_INSETID 0x02
> >>>>
> >>>>  /* LSM Agnostic defines for sb_set_mnt_opts */
> >>>>  #define SECURITY_LSM_NATIVE_LABELS     1
> >>>> @@ -72,7 +75,7 @@ enum lsm_event {
> >>>>
> >>>>  /* These functions are in security/commoncap.c */
> >>>>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> >>>> -                      int cap, int audit);
> >>>> +                      int cap, unsigned int opts);
> >>>>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
> >>>>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
> >>>>  extern int cap_ptrace_traceme(struct task_struct *parent);
> >>>> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
> >>>>                     const kernel_cap_t *effective,
> >>>>                     const kernel_cap_t *inheritable,
> >>>>                     const kernel_cap_t *permitted);
> >>>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >>>> -                       int cap);
> >>>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >>>> -                            int cap);
> >>>> +int security_capable(const struct cred *cred,
> >>>> +                      struct user_namespace *ns,
> >>>> +                      int cap,
> >>>> +                      unsigned int opts);
> >>>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
> >>>>  int security_quota_on(struct dentry *dentry);
> >>>>  int security_syslog(int type);
> >>>> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
> >>>>  }
> >>>>
> >>>>  static inline int security_capable(const struct cred *cred,
> >>>> -                                  struct user_namespace *ns, int cap)
> >>>> +                                  struct user_namespace *ns,
> >>>> +                                  int cap,
> >>>> +                                  unsigned int opts)
> >>>>  {
> >>>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> >>>> -}
> >>>> -
> >>>> -static inline int security_capable_noaudit(const struct cred *cred,
> >>>> -                                          struct user_namespace *ns, int cap) {
> >>>> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >>>> +       return cap_capable(cred, ns, cap, opts);
> >>>>  }
> >> Why get rid of security_capable_noaudit()?
> >>
> >>>>  static inline int security_quotactl(int cmds, int type, int id,
> >>>> diff --git a/kernel/capability.c b/kernel/capability.c
> >>>> index 1e1c0236f55b..454576743b1b 100644
> >>>> --- a/kernel/capability.c
> >>>> +++ b/kernel/capability.c
> >>>> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
> >>>>         int ret;
> >>>>
> >>>>         rcu_read_lock();
> >>>> -       ret = security_capable(__task_cred(t), ns, cap);
> >>>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
> >>>>         rcu_read_unlock();
> >>>>
> >>>>         return (ret == 0);
> >>>> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
> >>>>         int ret;
> >>>>
> >>>>         rcu_read_lock();
> >>>> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> >>>> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
> >>>>         rcu_read_unlock();
> >>>>
> >>>>         return (ret == 0);
> >>>> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
> >>>>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
> >>>>  }
> >>>>
> >>>> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>>> +static bool ns_capable_common(struct user_namespace *ns,
> >>>> +                             int cap,
> >>>> +                             unsigned int opts)
> >>>>  {
> >>>>         int capable;
> >>>>
> >>>> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>>>                 BUG();
> >>>>         }
> >>>>
> >>>> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> >>>> -                         security_capable_noaudit(current_cred(), ns, cap);
> >>>> +       capable = security_capable(current_cred(), ns, cap, opts);
> >>>>         if (capable == 0) {
> >>>>                 current->flags |= PF_SUPERPRIV;
> >>>>                 return true;
> >>>> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >>>>   */
> >>>>  bool ns_capable(struct user_namespace *ns, int cap)
> >>>>  {
> >>>> -       return ns_capable_common(ns, cap, true);
> >>>> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
> >>>>  }
> >>>>  EXPORT_SYMBOL(ns_capable);
> >>>>
> >>>> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
> >>>>   */
> >>>>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> >>>>  {
> >>>> -       return ns_capable_common(ns, cap, false);
> >>>> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
> >>>>  }
> >>>>  EXPORT_SYMBOL(ns_capable_noaudit);
> >>>>
> >>>> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
> >>>>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> >>>>                      int cap)
> >>>>  {
> >>>> +
> >>>>         if (WARN_ON_ONCE(!cap_valid(cap)))
> >>>>                 return false;
> >>>>
> >>>> -       if (security_capable(file->f_cred, ns, cap) == 0)
> >>>> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
> >>>>                 return true;
> >>>>
> >>>>         return false;
> >>>> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
> >>>>  {
> >>>>         int ret = 0;  /* An absent tracer adds no restrictions */
> >>>>         const struct cred *cred;
> >>>> +
> >>>>         rcu_read_lock();
> >>>>         cred = rcu_dereference(tsk->ptracer_cred);
> >>>>         if (cred)
> >>>> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> >>>> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> >>>> +                                      SECURITY_CAP_NOAUDIT);
> >>>>         rcu_read_unlock();
> >>>>         return (ret == 0);
> >>>>  }
> >>>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >>>> index f2ae2324c232..ddf615eb1bf7 100644
> >>>> --- a/kernel/seccomp.c
> >>>> +++ b/kernel/seccomp.c
> >>>> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >>>>          * behavior of privileged children.
> >>>>          */
> >>>>         if (!task_no_new_privs(current) &&
> >>>> -           security_capable_noaudit(current_cred(), current_user_ns(),
> >>>> -                                    CAP_SYS_ADMIN) != 0)
> >>>> +           security_capable(current_cred(), current_user_ns(),
> >>>> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
> >>>>                 return ERR_PTR(-EACCES);
> >>>>
> >>>>         /* Allocate a new seccomp_filter */
> >>>> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> >>>> index 253ef6e9d445..0f6dca54b66e 100644
> >>>> --- a/security/apparmor/capability.c
> >>>> +++ b/security/apparmor/capability.c
> >>>> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
> >>>>   * profile_capable - test if profile allows use of capability @cap
> >>>>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
> >>>>   * @cap: capability to test if allowed
> >>>> - * @audit: whether an audit record should be generated
> >>>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >>>>   * @sa: audit data (MAY BE NULL indicating no auditing)
> >>>>   *
> >>>>   * Returns: 0 if allowed else -EPERM
> >>>>   */
> >>>> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >>>> -                          struct common_audit_data *sa)
> >>>> +static int profile_capable(struct aa_profile *profile, int cap,
> >>>> +                          unsigned int opts, struct common_audit_data *sa)
> >>>>  {
> >>>>         int error;
> >>>>
> >>>> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >>>>         else
> >>>>                 error = -EPERM;
> >>>>
> >>>> -       if (audit == SECURITY_CAP_NOAUDIT) {
> >>>> +       if (opts & SECURITY_CAP_NOAUDIT) {
> >>>>                 if (!COMPLAIN_MODE(profile))
> >>>>                         return error;
> >>>>                 /* audit the cap request in complain mode but note that it
> >>>> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >>>>   * aa_capable - test permission to use capability
> >>>>   * @label: label being tested for capability (NOT NULL)
> >>>>   * @cap: capability to be tested
> >>>> - * @audit: whether an audit record should be generated
> >>>> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >>>>   *
> >>>>   * Look up capability in profile capability set.
> >>>>   *
> >>>>   * Returns: 0 on success, or else an error code.
> >>>>   */
> >>>> -int aa_capable(struct aa_label *label, int cap, int audit)
> >>>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
> >>>>  {
> >>>>         struct aa_profile *profile;
> >>>>         int error = 0;
> >>>> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
> >>>>
> >>>>         sa.u.cap = cap;
> >>>>         error = fn_for_each_confined(label, profile,
> >>>> -                       profile_capable(profile, cap, audit, &sa));
> >>>> +                       profile_capable(profile, cap, opts, &sa));
> >>>>
> >>>>         return error;
> >>>>  }
> >>>> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> >>>> index e0304e2aeb7f..1b3663b6ab12 100644
> >>>> --- a/security/apparmor/include/capability.h
> >>>> +++ b/security/apparmor/include/capability.h
> >>>> @@ -40,7 +40,7 @@ struct aa_caps {
> >>>>
> >>>>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
> >>>>
> >>>> -int aa_capable(struct aa_label *label, int cap, int audit);
> >>>> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
> >>>>
> >>>>  static inline void aa_free_cap_rules(struct aa_caps *caps)
> >>>>  {
> >>>> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> >>>> index 527ea1557120..4a1da2313162 100644
> >>>> --- a/security/apparmor/ipc.c
> >>>> +++ b/security/apparmor/ipc.c
> >>>> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
> >>>>         aad(sa)->label = &tracer->label;
> >>>>         aad(sa)->peer = tracee;
> >>>>         aad(sa)->request = 0;
> >>>> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> >>>> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> >>>> +                                   SECURITY_CAP_DEFAULT);
> >>>>
> >>>>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
> >>>>  }
> >>>> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> >>>> index 42446a216f3b..0bd817084fc1 100644
> >>>> --- a/security/apparmor/lsm.c
> >>>> +++ b/security/apparmor/lsm.c
> >>>> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
> >>>>  }
> >>>>
> >>>>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> >>>> -                           int cap, int audit)
> >>>> +                           int cap, unsigned int opts)
> >>>>  {
> >>>>         struct aa_label *label;
> >>>>         int error = 0;
> >>>>
> >>>>         label = aa_get_newest_cred_label(cred);
> >>>>         if (!unconfined(label))
> >>>> -               error = aa_capable(label, cap, audit);
> >>>> +               error = aa_capable(label, cap, opts);
> >>>>         aa_put_label(label);
> >>>>
> >>>>         return error;
> >>>> diff --git a/security/commoncap.c b/security/commoncap.c
> >>>> index 232db019f051..3d8609192e17 100644
> >>>> --- a/security/commoncap.c
> >>>> +++ b/security/commoncap.c
> >>>> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
> >>>>   * kernel's capable() and has_capability() returns 1 for this case.
> >>>>   */
> >>>>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> >>>> -               int cap, int audit)
> >>>> +               int cap, unsigned int opts)
> >>>>  {
> >>>>         struct user_namespace *ns = targ_ns;
> >>>>
> >>>> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
> >>>>   */
> >>>>  static inline int cap_inh_is_capped(void)
> >>>>  {
> >>>> -
> >>>>         /* they are so limited unless the current task has the CAP_SETPCAP
> >>>>          * capability
> >>>>          */
> >>>>         if (cap_capable(current_cred(), current_cred()->user_ns,
> >>>> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> >>>> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
> >>>>                 return 0;
> >>>>         return 1;
> >>>>  }
> >>>> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >>>>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
> >>>>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
> >>>>                     || (cap_capable(current_cred(),
> >>>> -                                   current_cred()->user_ns, CAP_SETPCAP,
> >>>> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> >>>> +                                   current_cred()->user_ns,
> >>>> +                                   CAP_SETPCAP,
> >>>> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
> >>>>                         /*
> >>>>                          * [1] no changing of bits that are locked
> >>>>                          * [2] no unlocking of locks
> >>>> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >>>>  {
> >>>>         int cap_sys_admin = 0;
> >>>>
> >>>> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> >>>> -                       SECURITY_CAP_NOAUDIT) == 0)
> >>>> +       if (cap_capable(current_cred(), &init_user_ns,
> >>>> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
> >>>>                 cap_sys_admin = 1;
> >>>> +
> >>>>         return cap_sys_admin;
> >>>>  }
> >>>>
> >>>> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
> >>>>
> >>>>         if (addr < dac_mmap_min_addr) {
> >>>>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> >>>> -                                 SECURITY_CAP_AUDIT);
> >>>> +                                 SECURITY_CAP_DEFAULT);
> >>>>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
> >>>>                 if (ret == 0)
> >>>>                         current->flags |= PF_SUPERPRIV;
> >>>> diff --git a/security/security.c b/security/security.c
> >>>> index d670136dda2c..d2334697797a 100644
> >>>> --- a/security/security.c
> >>>> +++ b/security/security.c
> >>>> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
> >>>>                                 effective, inheritable, permitted);
> >>>>  }
> >>>>
> >>>> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> >>>> -                    int cap)
> >>>> +int security_capable(const struct cred *cred,
> >>>> +                    struct user_namespace *ns,
> >>>> +                    int cap,
> >>>> +                    unsigned int opts)
> >>>>  {
> >>>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> >>>> -}
> >>>> -
> >>>> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> >>>> -                            int cap)
> >>>> -{
> >>>> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> >>>> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
> >>>>  }
> >>>>
> >>>>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> >>>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> >>>> index a67459eb62d5..a4b2e49213de 100644
> >>>> --- a/security/selinux/hooks.c
> >>>> +++ b/security/selinux/hooks.c
> >>>> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
> >>>>
> >>>>  /* Check whether a task is allowed to use a capability. */
> >>>>  static int cred_has_capability(const struct cred *cred,
> >>>> -                              int cap, int audit, bool initns)
> >>>> +                              int cap, unsigned int opts, bool initns)
> >>>>  {
> >>>>         struct common_audit_data ad;
> >>>>         struct av_decision avd;
> >>>> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
> >>>>
> >>>>         rc = avc_has_perm_noaudit(&selinux_state,
> >>>>                                   sid, sid, sclass, av, 0, &avd);
> >>>> -       if (audit == SECURITY_CAP_AUDIT) {
> >>>> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
> >>>>                 int rc2 = avc_audit(&selinux_state,
> >>>>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
> >>>>                 if (rc2)
> >>>> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
> >>>>   */
> >>>>
> >>>>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> >>>> -                          int cap, int audit)
> >>>> +                          int cap, unsigned int opts)
> >>>>  {
> >>>> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> >>>> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
> >>>>  }
> >>>>
> >>>>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> >>>> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
> >>>>  static bool has_cap_mac_admin(bool audit)
> >>>>  {
> >>>>         const struct cred *cred = current_cred();
> >>>> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> >>>> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
> >>>>
> >>>> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> >>>> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
> >>>>                 return false;
> >>>> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> >>>> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
> >>>>                 return false;
> >>>>         return true;
> >>>>  }
> >>>> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
> >>>>         case KDSKBENT:
> >>>>         case KDSKBSENT:
> >>>>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> >>>> -                                           SECURITY_CAP_AUDIT, true);
> >>>> +                                           SECURITY_CAP_DEFAULT, true);
> >>>>                 break;
> >>>>
> >>>>         /* default case assumes that the command will go
> >>>> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> >>>> index 9a4c0ad46518..fac2a21aa7d4 100644
> >>>> --- a/security/smack/smack_access.c
> >>>> +++ b/security/smack/smack_access.c
> >>>> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
> >>>>         struct smack_known_list_elem *sklep;
> >>>>         int rc;
> >>>>
> >>>> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> >>>> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
> >>>>         if (rc)
> >>>>                 return false;
> >>>>
> >>>> --
> >>>> 2.20.0.405.gbc1bbc6f85-goog
> >>>>
>


* [PATCH v3] LSM: generalize flag passing to security_capable
  2019-01-07 19:02                   ` Micah Morton
@ 2019-01-07 22:57                     ` mortonm
  0 siblings, 0 replies; 88+ messages in thread
From: mortonm @ 2019-01-07 22:57 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM).

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since the last patch: This version keeps the
security_capable_noaudit function instead of removing it. To be clear, I
prefer the v2 patch (which removes the function), since it is
straightforward to call security_capable with the SECURITY_CAP_NOAUDIT
flag rather than calling security_capable_noaudit, and we don't really
save any code churn by keeping the function. I'm including this version
for completeness in case people prefer leaving the function in the code.
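
As a rough illustration of the bitmask semantics described above (a
standalone userspace sketch, not part of the patch): the helper names
cap_opts_audit and cap_opts_insetid are hypothetical, and only the
SECURITY_CAP_* values are taken from the diff below.

```c
#include <stdbool.h>

/* Flag values as introduced by this patch. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/* Hypothetical helper: auditing is requested unless NOAUDIT is set. */
static bool cap_opts_audit(unsigned int opts)
{
	return !(opts & SECURITY_CAP_NOAUDIT);
}

/* Hypothetical helper: true when the check originates from a setid path. */
static bool cap_opts_insetid(unsigned int opts)
{
	return (opts & SECURITY_CAP_INSETID) != 0;
}
```

Because options can be combined (for example SECURITY_CAP_NOAUDIT |
SECURITY_CAP_INSETID), hooks need to test individual bits rather than
compare the whole value, which is why the SELinux and AppArmor hunks
switch from equality checks to mask tests.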

 include/linux/lsm_hooks.h              |  8 +++++---
 include/linux/security.h               | 21 ++++++++++++++-------
 kernel/capability.c                    | 17 ++++++++++-------
 security/apparmor/capability.c         | 14 +++++++-------
 security/apparmor/include/capability.h |  2 +-
 security/apparmor/ipc.c                |  3 ++-
 security/apparmor/lsm.c                |  4 ++--
 security/commoncap.c                   | 17 +++++++++--------
 security/security.c                    |  8 +++++---
 security/selinux/hooks.c               | 16 ++++++++--------
 security/smack/smack_access.c          |  2 +-
 11 files changed, 64 insertions(+), 48 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index aaeb7fa24dc4..ef955a44a782 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1270,7 +1270,7 @@
  *	@cred contains the credentials to use.
  *	@ns contains the user namespace we want the capability in
  *	@cap contains the capability <include/linux/capability.h>.
- *	@audit contains whether to write an audit message or not
+ *	@opts contains options for the capable check <include/linux/security.h>
  *	Return 0 if the capability is granted for @tsk.
  * @syslog:
  *	Check permission before accessing the kernel message ring or changing
@@ -1446,8 +1446,10 @@ union security_list_options {
 			const kernel_cap_t *effective,
 			const kernel_cap_t *inheritable,
 			const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns,
-			int cap, int audit);
+	int (*capable)(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			unsigned int opts);
 	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
 	int (*quota_on)(struct dentry *dentry);
 	int (*syslog)(int type);
diff --git a/include/linux/security.h b/include/linux/security.h
index d170a5b031f3..468cdbf30a23 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -54,9 +54,12 @@ struct xattr;
 struct xfrm_sec_ctx;
 struct mm_struct;
 
+/* Default (no) options for the capable function */
+#define SECURITY_CAP_DEFAULT 0x0
 /* If capable should audit the security request */
-#define SECURITY_CAP_NOAUDIT 0
-#define SECURITY_CAP_AUDIT 1
+#define SECURITY_CAP_NOAUDIT 0x01
+/* If capable is being called by a setid function */
+#define SECURITY_CAP_INSETID 0x02
 
 /* LSM Agnostic defines for sb_set_mnt_opts */
 #define SECURITY_LSM_NATIVE_LABELS	1
@@ -72,7 +75,7 @@ enum lsm_event {
 
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
-		       int cap, int audit);
+		       int cap, unsigned int opts);
 extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
 extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
@@ -233,8 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
 		    const kernel_cap_t *effective,
 		    const kernel_cap_t *inheritable,
 		    const kernel_cap_t *permitted);
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-			int cap);
+int security_capable(const struct cred *cred,
+		       struct user_namespace *ns,
+		       int cap,
+		       unsigned int opts);
 int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
 			     int cap);
 int security_quotactl(int cmds, int type, int id, struct super_block *sb);
@@ -492,9 +497,11 @@ static inline int security_capset(struct cred *new,
 }
 
 static inline int security_capable(const struct cred *cred,
-				   struct user_namespace *ns, int cap)
+				   struct user_namespace *ns,
+				   int cap,
+				   unsigned int opts)
 {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
+	return cap_capable(cred, ns, cap, opts);
 }
 
 static inline int security_capable_noaudit(const struct cred *cred,
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..e697579ade8c 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
 	int ret;
 
 	rcu_read_lock();
-	ret = security_capable(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
 	return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
-static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+static bool ns_capable_common(struct user_namespace *ns,
+			      int cap,
+			      unsigned int opts)
 {
 	int capable;
 
@@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 		BUG();
 	}
 
-	capable = audit ? security_capable(current_cred(), ns, cap) :
-			  security_capable_noaudit(current_cred(), ns, cap);
+	capable = security_capable(current_cred(), ns, cap, opts);
 	if (capable == 0) {
 		current->flags |= PF_SUPERPRIV;
 		return true;
@@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, true);
+	return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
 }
 EXPORT_SYMBOL(ns_capable);
 
@@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
  */
 bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, false);
+	return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
@@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
 bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 		     int cap)
 {
+
 	if (WARN_ON_ONCE(!cap_valid(cap)))
 		return false;
 
-	if (security_capable(file->f_cred, ns, cap) == 0)
+	if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
 		return true;
 
 	return false;
@@ -500,6 +502,7 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 {
 	int ret = 0;  /* An absent tracer adds no restrictions */
 	const struct cred *cred;
+
 	rcu_read_lock();
 	cred = rcu_dereference(tsk->ptracer_cred);
 	if (cred)
diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 253ef6e9d445..0f6dca54b66e 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
  * profile_capable - test if profile allows use of capability @cap
  * @profile: profile being enforced    (NOT NULL, NOT unconfined)
  * @cap: capability to test if allowed
- * @audit: whether an audit record should be generated
+ * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
  * @sa: audit data (MAY BE NULL indicating no auditing)
  *
  * Returns: 0 if allowed else -EPERM
  */
-static int profile_capable(struct aa_profile *profile, int cap, int audit,
-			   struct common_audit_data *sa)
+static int profile_capable(struct aa_profile *profile, int cap,
+			   unsigned int opts, struct common_audit_data *sa)
 {
 	int error;
 
@@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
 	else
 		error = -EPERM;
 
-	if (audit == SECURITY_CAP_NOAUDIT) {
+	if (opts & SECURITY_CAP_NOAUDIT) {
 		if (!COMPLAIN_MODE(profile))
 			return error;
 		/* audit the cap request in complain mode but note that it
@@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
  * aa_capable - test permission to use capability
  * @label: label being tested for capability (NOT NULL)
  * @cap: capability to be tested
- * @audit: whether an audit record should be generated
+ * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
  *
  * Look up capability in profile capability set.
  *
  * Returns: 0 on success, or else an error code.
  */
-int aa_capable(struct aa_label *label, int cap, int audit)
+int aa_capable(struct aa_label *label, int cap, unsigned int opts)
 {
 	struct aa_profile *profile;
 	int error = 0;
@@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
 
 	sa.u.cap = cap;
 	error = fn_for_each_confined(label, profile,
-			profile_capable(profile, cap, audit, &sa));
+			profile_capable(profile, cap, opts, &sa));
 
 	return error;
 }
diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
index e0304e2aeb7f..1b3663b6ab12 100644
--- a/security/apparmor/include/capability.h
+++ b/security/apparmor/include/capability.h
@@ -40,7 +40,7 @@ struct aa_caps {
 
 extern struct aa_sfs_entry aa_sfs_entry_caps[];
 
-int aa_capable(struct aa_label *label, int cap, int audit);
+int aa_capable(struct aa_label *label, int cap, unsigned int opts);
 
 static inline void aa_free_cap_rules(struct aa_caps *caps)
 {
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index 527ea1557120..4a1da2313162 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
 	aad(sa)->label = &tracer->label;
 	aad(sa)->peer = tracee;
 	aad(sa)->request = 0;
-	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
+	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
+				    SECURITY_CAP_DEFAULT);
 
 	return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
 }
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 42446a216f3b..0bd817084fc1 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
 }
 
 static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
-			    int cap, int audit)
+			    int cap, unsigned int opts)
 {
 	struct aa_label *label;
 	int error = 0;
 
 	label = aa_get_newest_cred_label(cred);
 	if (!unconfined(label))
-		error = aa_capable(label, cap, audit);
+		error = aa_capable(label, cap, opts);
 	aa_put_label(label);
 
 	return error;
diff --git a/security/commoncap.c b/security/commoncap.c
index 232db019f051..3d8609192e17 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
  * kernel's capable() and has_capability() returns 1 for this case.
  */
 int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
-		int cap, int audit)
+		int cap, unsigned int opts)
 {
 	struct user_namespace *ns = targ_ns;
 
@@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
  */
 static inline int cap_inh_is_capped(void)
 {
-
 	/* they are so limited unless the current task has the CAP_SETPCAP
 	 * capability
 	 */
 	if (cap_capable(current_cred(), current_cred()->user_ns,
-			CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
+			CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
 		return 0;
 	return 1;
 }
@@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 		    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))	/*[2]*/
 		    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))	/*[3]*/
 		    || (cap_capable(current_cred(),
-				    current_cred()->user_ns, CAP_SETPCAP,
-				    SECURITY_CAP_AUDIT) != 0)		/*[4]*/
+				    current_cred()->user_ns,
+				    CAP_SETPCAP,
+				    SECURITY_CAP_DEFAULT) != 0)		/*[4]*/
 			/*
 			 * [1] no changing of bits that are locked
 			 * [2] no unlocking of locks
@@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 {
 	int cap_sys_admin = 0;
 
-	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
-			SECURITY_CAP_NOAUDIT) == 0)
+	if (cap_capable(current_cred(), &init_user_ns,
+				CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
 		cap_sys_admin = 1;
+
 	return cap_sys_admin;
 }
 
@@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
 
 	if (addr < dac_mmap_min_addr) {
 		ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
-				  SECURITY_CAP_AUDIT);
+				  SECURITY_CAP_DEFAULT);
 		/* set PF_SUPERPRIV if it turns out we allow the low mmap */
 		if (ret == 0)
 			current->flags |= PF_SUPERPRIV;
diff --git a/security/security.c b/security/security.c
index d670136dda2c..050351cec339 100644
--- a/security/security.c
+++ b/security/security.c
@@ -294,10 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
 				effective, inheritable, permitted);
 }
 
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-		     int cap)
+int security_capable(const struct cred *cred,
+		     struct user_namespace *ns,
+		     int cap,
+		     unsigned int opts)
 {
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
+	return call_int_hook(capable, 0, cred, ns, cap, opts);
 }
 
 int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a67459eb62d5..a4b2e49213de 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
 
 /* Check whether a task is allowed to use a capability. */
 static int cred_has_capability(const struct cred *cred,
-			       int cap, int audit, bool initns)
+			       int cap, unsigned int opts, bool initns)
 {
 	struct common_audit_data ad;
 	struct av_decision avd;
@@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
 
 	rc = avc_has_perm_noaudit(&selinux_state,
 				  sid, sid, sclass, av, 0, &avd);
-	if (audit == SECURITY_CAP_AUDIT) {
+	if (!(opts & SECURITY_CAP_NOAUDIT)) {
 		int rc2 = avc_audit(&selinux_state,
 				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
@@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
  */
 
 static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
-			   int cap, int audit)
+			   int cap, unsigned int opts)
 {
-	return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
+	return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
 }
 
 static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
@@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
 static bool has_cap_mac_admin(bool audit)
 {
 	const struct cred *cred = current_cred();
-	int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
+	unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
 
-	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
+	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
 		return false;
-	if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
+	if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
 		return false;
 	return true;
 }
@@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
 	case KDSKBENT:
 	case KDSKBSENT:
 		error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
-					    SECURITY_CAP_AUDIT, true);
+					    SECURITY_CAP_DEFAULT, true);
 		break;
 
 	/* default case assumes that the command will go
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9a4c0ad46518..fac2a21aa7d4 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
 	struct smack_known_list_elem *sklep;
 	int rc;
 
-	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
+	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
 	if (rc)
 		return false;
 
-- 
2.20.1.97.g81188d93c3-goog
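
One detail of the conversion above is worth spelling out: once the last
argument can carry more than one flag, callees have to test the NOAUDIT bit
with a mask rather than compare the whole value, which is why the patch turns
"audit == SECURITY_CAP_AUDIT" into "!(opts & SECURITY_CAP_NOAUDIT)". A minimal
userspace sketch of the difference, using the flag values from the patch; the
helper names are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in the patch. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01
#define SECURITY_CAP_INSETID 0x02

/* Hypothetical helper: whole-value comparison, in the style of the old
 * "audit == ..." checks. It only recognizes NOAUDIT when no other bit is set. */
static bool wants_audit_by_equality(unsigned int opts)
{
	return opts != SECURITY_CAP_NOAUDIT;
}

/* Hypothetical helper: mask test, as used by the converted callees. */
static bool wants_audit_by_mask(unsigned int opts)
{
	return !(opts & SECURITY_CAP_NOAUDIT);
}

/* Small usage example. */
static void demo_mask_vs_equality(void)
{
	unsigned int opts = SECURITY_CAP_NOAUDIT | SECURITY_CAP_INSETID;

	/* Both agree while only one flag is in play. */
	assert(!wants_audit_by_equality(SECURITY_CAP_NOAUDIT));
	assert(!wants_audit_by_mask(SECURITY_CAP_NOAUDIT));

	/* With a second flag set, only the mask test still honors NOAUDIT. */
	assert(wants_audit_by_equality(opts));
	assert(!wants_audit_by_mask(opts));
}
```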



* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2018-12-18 22:37         ` [PATCH v2] " mortonm
  2019-01-07 17:55           ` Micah Morton
@ 2019-01-07 23:13           ` Kees Cook
  2019-01-08  0:10             ` [PATCH v4] " mortonm
  2019-01-08  0:10             ` [PATCH v2] " Micah Morton
  1 sibling, 2 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-07 23:13 UTC (permalink / raw)
  To: Micah Morton, James Morris, Stephen Smalley
  Cc: Serge E. Hallyn, Casey Schaufler, linux-security-module

On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This patch provides a general mechanism for passing flags to the
> security_capable LSM hook. It replaces the specific 'audit' flag that is
> used to tell security_capable whether it should log an audit message for
> the given capability check. The reason for generalizing this flag
> passing is so we can add an additional flag that signifies whether
> security_capable is being called by a setid syscall (which is needed by
> the proposed SafeSetID LSM).
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> Changes since the last patch: Changed the code to use a bitmask instead
> of a struct to represent the options passed to security_capable.

FWIW, I too prefer this v2 patch. I don't see a reason to keep an
"option-ified" function around if it's been generalized into a
bitfield argument.
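
To illustrate the point, here is a minimal userspace sketch of what a retained
_noaudit helper amounts to once the flag is a bitfield argument. The mock_*
functions and last_opts are hypothetical stand-ins written for this example,
not kernel code; only the two flag values come from the patch.

```c
#include <assert.h>

/* Flag values as defined in the patch. */
#define SECURITY_CAP_DEFAULT 0x0
#define SECURITY_CAP_NOAUDIT 0x01

/* Hypothetical stand-in for security_capable(): it records the options it
 * was called with and grants every request. */
static unsigned int last_opts;

static int mock_security_capable(int cap, unsigned int opts)
{
	(void)cap;
	last_opts = opts;
	return 0;
}

/* What a retained _noaudit helper reduces to: a one-line wrapper. */
static int mock_security_capable_noaudit(int cap)
{
	return mock_security_capable(cap, SECURITY_CAP_NOAUDIT);
}

/* Small usage example; 21 is an arbitrary capability number here. */
static void demo_noaudit_wrapper(void)
{
	assert(mock_security_capable_noaudit(21) == 0);
	assert(last_opts == SECURITY_CAP_NOAUDIT);

	assert(mock_security_capable(21, SECURITY_CAP_DEFAULT) == 0);
	assert(last_opts == SECURITY_CAP_DEFAULT);
}
```

Since the wrapper adds nothing beyond fixing one argument, callers can pass
the flag directly, which is what the v2 patch does.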

>  include/linux/lsm_hooks.h              |  8 +++++---
>  include/linux/security.h               | 28 +++++++++++++-------------
>  kernel/capability.c                    | 22 +++++++++++---------
>  kernel/seccomp.c                       |  4 ++--
>  security/apparmor/capability.c         | 14 ++++++-------
>  security/apparmor/include/capability.h |  2 +-
>  security/apparmor/ipc.c                |  3 ++-
>  security/apparmor/lsm.c                |  4 ++--
>  security/commoncap.c                   | 17 ++++++++--------
>  security/security.c                    | 14 +++++--------
>  security/selinux/hooks.c               | 16 +++++++--------
>  security/smack/smack_access.c          |  2 +-
>  12 files changed, 69 insertions(+), 65 deletions(-)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index aaeb7fa24dc4..ef955a44a782 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1270,7 +1270,7 @@
>   *     @cred contains the credentials to use.
>   *     @ns contains the user namespace we want the capability in
>   *     @cap contains the capability <include/linux/capability.h>.
> - *     @audit contains whether to write an audit message or not
> + *     @opts contains options for the capable check <include/linux/security.h>
>   *     Return 0 if the capability is granted for @tsk.
>   * @syslog:
>   *     Check permission before accessing the kernel message ring or changing
> @@ -1446,8 +1446,10 @@ union security_list_options {
>                         const kernel_cap_t *effective,
>                         const kernel_cap_t *inheritable,
>                         const kernel_cap_t *permitted);
> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> -                       int cap, int audit);
> +       int (*capable)(const struct cred *cred,
> +                       struct user_namespace *ns,
> +                       int cap,
> +                       unsigned int opts);
>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>         int (*quota_on)(struct dentry *dentry);
>         int (*syslog)(int type);
> diff --git a/include/linux/security.h b/include/linux/security.h
> index d170a5b031f3..038e6779948c 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -54,9 +54,12 @@ struct xattr;
>  struct xfrm_sec_ctx;
>  struct mm_struct;
>
> +/* Default (no) options for the capable function */
> +#define SECURITY_CAP_DEFAULT 0x0

bikeshed: maybe we should call this CAP_OPT_* ? (Then this might be
CAP_OPT_NONE?)

>  /* If capable should audit the security request */
> -#define SECURITY_CAP_NOAUDIT 0
> -#define SECURITY_CAP_AUDIT 1
> +#define SECURITY_CAP_NOAUDIT 0x01
> +/* If capable is being called by a setid function */
> +#define SECURITY_CAP_INSETID 0x02

For the 1 and 2 case, can you use BIT(0) and BIT(1) instead? This
makes it clear this is a bitfield here (and does all the type magic
for higher-order bits if we ever get there).
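
A rough sketch of how the two suggestions above could look together. This is
an illustration only: BIT() is redefined locally as a userspace stand-in for
the kernel macro, CAP_OPT_NONE is the name floated above, and
CAP_OPT_NOAUDIT, CAP_OPT_INSETID and the two helpers are hypothetical names
chosen here to follow the same pattern.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/* Hypothetical names following the CAP_OPT_* suggestion. */
#define CAP_OPT_NONE    0x0
#define CAP_OPT_NOAUDIT BIT(0)	/* same value as 0x01 in the patch */
#define CAP_OPT_INSETID BIT(1)	/* same value as 0x02 in the patch */

/* Hypothetical helper: true unless the caller asked for no audit record. */
static bool cap_opt_should_audit(unsigned int opts)
{
	return !(opts & CAP_OPT_NOAUDIT);
}

/* Hypothetical helper: true if the check comes from a setid path. */
static bool cap_opt_in_setid(unsigned int opts)
{
	return (opts & CAP_OPT_INSETID) != 0;
}

/* Small usage example. */
static void demo_cap_opt_flags(void)
{
	unsigned int opts = CAP_OPT_NOAUDIT | CAP_OPT_INSETID;

	assert(cap_opt_should_audit(CAP_OPT_NONE));
	assert(!cap_opt_in_setid(CAP_OPT_NONE));
	assert(!cap_opt_should_audit(opts));
	assert(cap_opt_in_setid(opts));
}
```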

>  /* LSM Agnostic defines for sb_set_mnt_opts */
>  #define SECURITY_LSM_NATIVE_LABELS     1
> @@ -72,7 +75,7 @@ enum lsm_event {
>
>  /* These functions are in security/commoncap.c */
>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> -                      int cap, int audit);
> +                      int cap, unsigned int opts);
>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  extern int cap_ptrace_traceme(struct task_struct *parent);
> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
>                     const kernel_cap_t *effective,
>                     const kernel_cap_t *inheritable,
>                     const kernel_cap_t *permitted);
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                       int cap);
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap);
> +int security_capable(const struct cred *cred,
> +                      struct user_namespace *ns,
> +                      int cap,
> +                      unsigned int opts);
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>  int security_quota_on(struct dentry *dentry);
>  int security_syslog(int type);
> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
>  }
>
>  static inline int security_capable(const struct cred *cred,
> -                                  struct user_namespace *ns, int cap)
> +                                  struct user_namespace *ns,
> +                                  int cap,
> +                                  unsigned int opts)
>  {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -static inline int security_capable_noaudit(const struct cred *cred,
> -                                          struct user_namespace *ns, int cap) {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return cap_capable(cred, ns, cap, opts);
>  }
>
>  static inline int security_quotactl(int cmds, int type, int id,
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 1e1c0236f55b..454576743b1b 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,

One argument for _keeping_ the _noaudit() function as in v3 is that
keeping this one but removing the other seems inconsistent.

>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>  }
>
> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> +static bool ns_capable_common(struct user_namespace *ns,
> +                             int cap,
> +                             unsigned int opts)
>  {
>         int capable;
>
> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>                 BUG();
>         }
>
> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> -                         security_capable_noaudit(current_cred(), ns, cap);
> +       capable = security_capable(current_cred(), ns, cap, opts);
>         if (capable == 0) {
>                 current->flags |= PF_SUPERPRIV;
>                 return true;
> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>   */
>  bool ns_capable(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, true);
> +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
>  }
>  EXPORT_SYMBOL(ns_capable);
>
> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
>   */
>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, false);
> +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>                      int cap)
>  {
> +
>         if (WARN_ON_ONCE(!cap_valid(cap)))
>                 return false;
>
> -       if (security_capable(file->f_cred, ns, cap) == 0)
> +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
>                 return true;
>
>         return false;
> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>  {
>         int ret = 0;  /* An absent tracer adds no restrictions */
>         const struct cred *cred;
> +
>         rcu_read_lock();
>         cred = rcu_dereference(tsk->ptracer_cred);
>         if (cred)
> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> +                                      SECURITY_CAP_NOAUDIT);
>         rcu_read_unlock();
>         return (ret == 0);
>  }
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f2ae2324c232..ddf615eb1bf7 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>          * behavior of privileged children.
>          */
>         if (!task_no_new_privs(current) &&
> -           security_capable_noaudit(current_cred(), current_user_ns(),
> -                                    CAP_SYS_ADMIN) != 0)
> +           security_capable(current_cred(), current_user_ns(),
> +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
>                 return ERR_PTR(-EACCES);
>
>         /* Allocate a new seccomp_filter */
> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> index 253ef6e9d445..0f6dca54b66e 100644
> --- a/security/apparmor/capability.c
> +++ b/security/apparmor/capability.c
> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
>   * profile_capable - test if profile allows use of capability @cap
>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
>   * @cap: capability to test if allowed
> - * @audit: whether an audit record should be generated
> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>   * @sa: audit data (MAY BE NULL indicating no auditing)
>   *
>   * Returns: 0 if allowed else -EPERM
>   */
> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> -                          struct common_audit_data *sa)
> +static int profile_capable(struct aa_profile *profile, int cap,
> +                          unsigned int opts, struct common_audit_data *sa)
>  {
>         int error;
>
> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>         else
>                 error = -EPERM;
>
> -       if (audit == SECURITY_CAP_NOAUDIT) {
> +       if (opts & SECURITY_CAP_NOAUDIT) {
>                 if (!COMPLAIN_MODE(profile))
>                         return error;
>                 /* audit the cap request in complain mode but note that it
> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>   * aa_capable - test permission to use capability
>   * @label: label being tested for capability (NOT NULL)
>   * @cap: capability to be tested
> - * @audit: whether an audit record should be generated
> + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
>   *
>   * Look up capability in profile capability set.
>   *
>   * Returns: 0 on success, or else an error code.
>   */
> -int aa_capable(struct aa_label *label, int cap, int audit)
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
>  {
>         struct aa_profile *profile;
>         int error = 0;
> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
>
>         sa.u.cap = cap;
>         error = fn_for_each_confined(label, profile,
> -                       profile_capable(profile, cap, audit, &sa));
> +                       profile_capable(profile, cap, opts, &sa));
>
>         return error;
>  }
> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> index e0304e2aeb7f..1b3663b6ab12 100644
> --- a/security/apparmor/include/capability.h
> +++ b/security/apparmor/include/capability.h
> @@ -40,7 +40,7 @@ struct aa_caps {
>
>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
>
> -int aa_capable(struct aa_label *label, int cap, int audit);
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
>
>  static inline void aa_free_cap_rules(struct aa_caps *caps)
>  {
> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> index 527ea1557120..4a1da2313162 100644
> --- a/security/apparmor/ipc.c
> +++ b/security/apparmor/ipc.c
> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
>         aad(sa)->label = &tracer->label;
>         aad(sa)->peer = tracee;
>         aad(sa)->request = 0;
> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> +                                   SECURITY_CAP_DEFAULT);
>
>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
>  }
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 42446a216f3b..0bd817084fc1 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>  }
>
>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> -                           int cap, int audit)
> +                           int cap, unsigned int opts)
>  {
>         struct aa_label *label;
>         int error = 0;
>
>         label = aa_get_newest_cred_label(cred);
>         if (!unconfined(label))
> -               error = aa_capable(label, cap, audit);
> +               error = aa_capable(label, cap, opts);
>         aa_put_label(label);
>
>         return error;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 232db019f051..3d8609192e17 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>   * kernel's capable() and has_capability() returns 1 for this case.
>   */
>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> -               int cap, int audit)
> +               int cap, unsigned int opts)
>  {
>         struct user_namespace *ns = targ_ns;
>
> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>   */
>  static inline int cap_inh_is_capped(void)
>  {
> -
>         /* they are so limited unless the current task has the CAP_SETPCAP
>          * capability
>          */
>         if (cap_capable(current_cred(), current_cred()->user_ns,
> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
>                 return 0;
>         return 1;
>  }
> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>                     || (cap_capable(current_cred(),
> -                                   current_cred()->user_ns, CAP_SETPCAP,
> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> +                                   current_cred()->user_ns,
> +                                   CAP_SETPCAP,
> +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
>                         /*
>                          * [1] no changing of bits that are locked
>                          * [2] no unlocking of locks
> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>  {
>         int cap_sys_admin = 0;
>
> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> -                       SECURITY_CAP_NOAUDIT) == 0)
> +       if (cap_capable(current_cred(), &init_user_ns,
> +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
>                 cap_sys_admin = 1;
> +
>         return cap_sys_admin;
>  }
>
> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
>
>         if (addr < dac_mmap_min_addr) {
>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> -                                 SECURITY_CAP_AUDIT);
> +                                 SECURITY_CAP_DEFAULT);
>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>                 if (ret == 0)
>                         current->flags |= PF_SUPERPRIV;
> diff --git a/security/security.c b/security/security.c
> index d670136dda2c..d2334697797a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>                                 effective, inheritable, permitted);
>  }
>
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                    int cap)
> +int security_capable(const struct cred *cred,
> +                    struct user_namespace *ns,
> +                    int cap,
> +                    unsigned int opts)
>  {
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap)
> -{
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>  }
>
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index a67459eb62d5..a4b2e49213de 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
>
>  /* Check whether a task is allowed to use a capability. */
>  static int cred_has_capability(const struct cred *cred,
> -                              int cap, int audit, bool initns)
> +                              int cap, unsigned int opts, bool initns)
>  {
>         struct common_audit_data ad;
>         struct av_decision avd;
> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
>
>         rc = avc_has_perm_noaudit(&selinux_state,
>                                   sid, sid, sclass, av, 0, &avd);
> -       if (audit == SECURITY_CAP_AUDIT) {
> +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
>                 int rc2 = avc_audit(&selinux_state,
>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
>                 if (rc2)
> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>   */
>
>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> -                          int cap, int audit)
> +                          int cap, unsigned int opts)
>  {
> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
>  }
>
>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
>  static bool has_cap_mac_admin(bool audit)
>  {
>         const struct cred *cred = current_cred();
> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
>
> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
>                 return false;
> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
>                 return false;
>         return true;
>  }
> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
>         case KDSKBENT:
>         case KDSKBSENT:
>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> -                                           SECURITY_CAP_AUDIT, true);
> +                                           SECURITY_CAP_DEFAULT, true);
>                 break;
>
>         /* default case assumes that the command will go
> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index 9a4c0ad46518..fac2a21aa7d4 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>         struct smack_known_list_elem *sklep;
>         int rc;
>
> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
>         if (rc)
>                 return false;
>
> --
> 2.20.0.405.gbc1bbc6f85-goog
>

Otherwise, this looks fine to me.

Reviewed-by: Kees Cook <keescook@chromium.org>

James, Stephen, thoughts?

-- 
Kees Cook

* [PATCH v4] LSM: generalize flag passing to security_capable
  2019-01-07 23:13           ` [PATCH v2] " Kees Cook
@ 2019-01-08  0:10             ` mortonm
  2019-01-08  0:20               ` Kees Cook
  2019-01-10 22:31               ` James Morris
  2019-01-08  0:10             ` [PATCH v2] " Micah Morton
  1 sibling, 2 replies; 88+ messages in thread
From: mortonm @ 2019-01-08  0:10 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM).

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since the last patch: renamed SECURITY_CAP_* to CAP_OPT_* and
switched the bit-field definitions to the BIT() macro. Like v2, this v4
patch removes the security_capable_noaudit() function, since we seem to
be leaning toward that option.
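
For readers skimming the diff, the following is a minimal userspace
sketch of the flag-passing idea, not code from this patch. The
CAP_OPT_* values mirror the definitions added to
include/linux/security.h below; BIT() is a stand-in for the kernel
macro, and demo_capable(), demo_in_setid() and audit_records are
hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT() macro (assumption). */
#define BIT(n)           (1U << (n))

/* Same values as the definitions this patch adds to security.h. */
#define CAP_OPT_NONE     0x0
#define CAP_OPT_NOAUDIT  BIT(1)
#define CAP_OPT_INSETID  BIT(2)

/* Hypothetical counter that makes the audit decision observable. */
static unsigned int audit_records;

/*
 * Hypothetical hook body: returns 0 when the capability is granted and
 * -1 otherwise, and audits unless the caller set CAP_OPT_NOAUDIT.
 */
static int demo_capable(bool has_cap, unsigned int opts)
{
	int rc = has_cap ? 0 : -1;

	/* Auditing is the default; other bits in opts do not affect it. */
	if (!(opts & CAP_OPT_NOAUDIT))
		audit_records++;

	return rc;
}

/* Hypothetical helper: true when the check comes from a setid path. */
static bool demo_in_setid(unsigned int opts)
{
	return (opts & CAP_OPT_INSETID) != 0;
}
```

Because each option is tested with a mask rather than compared for
equality, a caller can combine flags (for example
CAP_OPT_NOAUDIT | CAP_OPT_INSETID) without changing how any single
flag is interpreted.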

 include/linux/lsm_hooks.h              |  8 +++++---
 include/linux/security.h               | 28 +++++++++++++-------------
 kernel/capability.c                    | 22 +++++++++++---------
 kernel/seccomp.c                       |  4 ++--
 security/apparmor/capability.c         | 14 ++++++-------
 security/apparmor/include/capability.h |  2 +-
 security/apparmor/ipc.c                |  3 ++-
 security/apparmor/lsm.c                |  4 ++--
 security/apparmor/resource.c           |  2 +-
 security/commoncap.c                   | 17 ++++++++--------
 security/security.c                    | 14 +++++--------
 security/selinux/hooks.c               | 18 ++++++++---------
 security/smack/smack_access.c          |  2 +-
 13 files changed, 71 insertions(+), 67 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index aaeb7fa24dc4..ef955a44a782 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1270,7 +1270,7 @@
  *	@cred contains the credentials to use.
  *	@ns contains the user namespace we want the capability in
  *	@cap contains the capability <include/linux/capability.h>.
- *	@audit contains whether to write an audit message or not
+ *	@opts contains options for the capable check <include/linux/security.h>
  *	Return 0 if the capability is granted for @tsk.
  * @syslog:
  *	Check permission before accessing the kernel message ring or changing
@@ -1446,8 +1446,10 @@ union security_list_options {
 			const kernel_cap_t *effective,
 			const kernel_cap_t *inheritable,
 			const kernel_cap_t *permitted);
-	int (*capable)(const struct cred *cred, struct user_namespace *ns,
-			int cap, int audit);
+	int (*capable)(const struct cred *cred,
+			struct user_namespace *ns,
+			int cap,
+			unsigned int opts);
 	int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
 	int (*quota_on)(struct dentry *dentry);
 	int (*syslog)(int type);
diff --git a/include/linux/security.h b/include/linux/security.h
index d170a5b031f3..0fe246bfd380 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -54,9 +54,12 @@ struct xattr;
 struct xfrm_sec_ctx;
 struct mm_struct;
 
+/* Default (no) options for the capable function */
+#define CAP_OPT_NONE 0x0
 /* If capable should audit the security request */
-#define SECURITY_CAP_NOAUDIT 0
-#define SECURITY_CAP_AUDIT 1
+#define CAP_OPT_NOAUDIT BIT(1)
+/* If capable is being called by a setid function */
+#define CAP_OPT_INSETID BIT(2)
 
 /* LSM Agnostic defines for sb_set_mnt_opts */
 #define SECURITY_LSM_NATIVE_LABELS	1
@@ -72,7 +75,7 @@ enum lsm_event {
 
 /* These functions are in security/commoncap.c */
 extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
-		       int cap, int audit);
+		       int cap, unsigned int opts);
 extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
 extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
@@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
 		    const kernel_cap_t *effective,
 		    const kernel_cap_t *inheritable,
 		    const kernel_cap_t *permitted);
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-			int cap);
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap);
+int security_capable(const struct cred *cred,
+		       struct user_namespace *ns,
+		       int cap,
+		       unsigned int opts);
 int security_quotactl(int cmds, int type, int id, struct super_block *sb);
 int security_quota_on(struct dentry *dentry);
 int security_syslog(int type);
@@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
 }
 
 static inline int security_capable(const struct cred *cred,
-				   struct user_namespace *ns, int cap)
+				   struct user_namespace *ns,
+				   int cap,
+				   unsigned int opts)
 {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-static inline int security_capable_noaudit(const struct cred *cred,
-					   struct user_namespace *ns, int cap) {
-	return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return cap_capable(cred, ns, cap, opts);
 }
 
 static inline int security_quotactl(int cmds, int type, int id,
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..7718d7dcadc7 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
 	int ret;
 
 	rcu_read_lock();
-	ret = security_capable(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NONE);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
 	int ret;
 
 	rcu_read_lock();
-	ret = security_capable_noaudit(__task_cred(t), ns, cap);
+	ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NOAUDIT);
 	rcu_read_unlock();
 
 	return (ret == 0);
@@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
 	return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
-static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+static bool ns_capable_common(struct user_namespace *ns,
+			      int cap,
+			      unsigned int opts)
 {
 	int capable;
 
@@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 		BUG();
 	}
 
-	capable = audit ? security_capable(current_cred(), ns, cap) :
-			  security_capable_noaudit(current_cred(), ns, cap);
+	capable = security_capable(current_cred(), ns, cap, opts);
 	if (capable == 0) {
 		current->flags |= PF_SUPERPRIV;
 		return true;
@@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, true);
+	return ns_capable_common(ns, cap, CAP_OPT_NONE);
 }
 EXPORT_SYMBOL(ns_capable);
 
@@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
  */
 bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
-	return ns_capable_common(ns, cap, false);
+	return ns_capable_common(ns, cap, CAP_OPT_NOAUDIT);
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
@@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
 bool file_ns_capable(const struct file *file, struct user_namespace *ns,
 		     int cap)
 {
+
 	if (WARN_ON_ONCE(!cap_valid(cap)))
 		return false;
 
-	if (security_capable(file->f_cred, ns, cap) == 0)
+	if (security_capable(file->f_cred, ns, cap, CAP_OPT_NONE) == 0)
 		return true;
 
 	return false;
@@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 {
 	int ret = 0;  /* An absent tracer adds no restrictions */
 	const struct cred *cred;
+
 	rcu_read_lock();
 	cred = rcu_dereference(tsk->ptracer_cred);
 	if (cred)
-		ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
+		ret = security_capable(cred, ns, CAP_SYS_PTRACE,
+				       CAP_OPT_NOAUDIT);
 	rcu_read_unlock();
 	return (ret == 0);
 }
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index f2ae2324c232..2289c0befc08 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	 * behavior of privileged children.
 	 */
 	if (!task_no_new_privs(current) &&
-	    security_capable_noaudit(current_cred(), current_user_ns(),
-				     CAP_SYS_ADMIN) != 0)
+	    security_capable(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) != 0)
 		return ERR_PTR(-EACCES);
 
 	/* Allocate a new seccomp_filter */
diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 253ef6e9d445..752f73980e30 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
  * profile_capable - test if profile allows use of capability @cap
  * @profile: profile being enforced    (NOT NULL, NOT unconfined)
  * @cap: capability to test if allowed
- * @audit: whether an audit record should be generated
+ * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
  * @sa: audit data (MAY BE NULL indicating no auditing)
  *
  * Returns: 0 if allowed else -EPERM
  */
-static int profile_capable(struct aa_profile *profile, int cap, int audit,
-			   struct common_audit_data *sa)
+static int profile_capable(struct aa_profile *profile, int cap,
+			   unsigned int opts, struct common_audit_data *sa)
 {
 	int error;
 
@@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
 	else
 		error = -EPERM;
 
-	if (audit == SECURITY_CAP_NOAUDIT) {
+	if (opts & CAP_OPT_NOAUDIT) {
 		if (!COMPLAIN_MODE(profile))
 			return error;
 		/* audit the cap request in complain mode but note that it
@@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
  * aa_capable - test permission to use capability
  * @label: label being tested for capability (NOT NULL)
  * @cap: capability to be tested
- * @audit: whether an audit record should be generated
+ * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
  *
  * Look up capability in profile capability set.
  *
  * Returns: 0 on success, or else an error code.
  */
-int aa_capable(struct aa_label *label, int cap, int audit)
+int aa_capable(struct aa_label *label, int cap, unsigned int opts)
 {
 	struct aa_profile *profile;
 	int error = 0;
@@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
 
 	sa.u.cap = cap;
 	error = fn_for_each_confined(label, profile,
-			profile_capable(profile, cap, audit, &sa));
+			profile_capable(profile, cap, opts, &sa));
 
 	return error;
 }
diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
index e0304e2aeb7f..1b3663b6ab12 100644
--- a/security/apparmor/include/capability.h
+++ b/security/apparmor/include/capability.h
@@ -40,7 +40,7 @@ struct aa_caps {
 
 extern struct aa_sfs_entry aa_sfs_entry_caps[];
 
-int aa_capable(struct aa_label *label, int cap, int audit);
+int aa_capable(struct aa_label *label, int cap, unsigned int opts);
 
 static inline void aa_free_cap_rules(struct aa_caps *caps)
 {
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index 527ea1557120..aacd1e95cb59 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
 	aad(sa)->label = &tracer->label;
 	aad(sa)->peer = tracee;
 	aad(sa)->request = 0;
-	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
+	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
+				    CAP_OPT_NONE);
 
 	return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
 }
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 42446a216f3b..0bd817084fc1 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
 }
 
 static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
-			    int cap, int audit)
+			    int cap, unsigned int opts)
 {
 	struct aa_label *label;
 	int error = 0;
 
 	label = aa_get_newest_cred_label(cred);
 	if (!unconfined(label))
-		error = aa_capable(label, cap, audit);
+		error = aa_capable(label, cap, opts);
 	aa_put_label(label);
 
 	return error;
diff --git a/security/apparmor/resource.c b/security/apparmor/resource.c
index 95fd26d09757..552ed09cb47e 100644
--- a/security/apparmor/resource.c
+++ b/security/apparmor/resource.c
@@ -124,7 +124,7 @@ int aa_task_setrlimit(struct aa_label *label, struct task_struct *task,
 	 */
 
 	if (label != peer &&
-	    aa_capable(label, CAP_SYS_RESOURCE, SECURITY_CAP_NOAUDIT) != 0)
+	    aa_capable(label, CAP_SYS_RESOURCE, CAP_OPT_NOAUDIT) != 0)
 		error = fn_for_each(label, profile,
 				audit_resource(profile, resource,
 					       new_rlim->rlim_max, peer,
diff --git a/security/commoncap.c b/security/commoncap.c
index 232db019f051..13f03622f694 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
  * kernel's capable() and has_capability() returns 1 for this case.
  */
 int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
-		int cap, int audit)
+		int cap, unsigned int opts)
 {
 	struct user_namespace *ns = targ_ns;
 
@@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
  */
 static inline int cap_inh_is_capped(void)
 {
-
 	/* they are so limited unless the current task has the CAP_SETPCAP
 	 * capability
 	 */
 	if (cap_capable(current_cred(), current_cred()->user_ns,
-			CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
+			CAP_SETPCAP, CAP_OPT_NONE) == 0)
 		return 0;
 	return 1;
 }
@@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 		    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))	/*[2]*/
 		    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))	/*[3]*/
 		    || (cap_capable(current_cred(),
-				    current_cred()->user_ns, CAP_SETPCAP,
-				    SECURITY_CAP_AUDIT) != 0)		/*[4]*/
+				    current_cred()->user_ns,
+				    CAP_SETPCAP,
+				    CAP_OPT_NONE) != 0)			/*[4]*/
 			/*
 			 * [1] no changing of bits that are locked
 			 * [2] no unlocking of locks
@@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
 {
 	int cap_sys_admin = 0;
 
-	if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
-			SECURITY_CAP_NOAUDIT) == 0)
+	if (cap_capable(current_cred(), &init_user_ns,
+				CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) == 0)
 		cap_sys_admin = 1;
+
 	return cap_sys_admin;
 }
 
@@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
 
 	if (addr < dac_mmap_min_addr) {
 		ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
-				  SECURITY_CAP_AUDIT);
+				  CAP_OPT_NONE);
 		/* set PF_SUPERPRIV if it turns out we allow the low mmap */
 		if (ret == 0)
 			current->flags |= PF_SUPERPRIV;
diff --git a/security/security.c b/security/security.c
index d670136dda2c..d2334697797a 100644
--- a/security/security.c
+++ b/security/security.c
@@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
 				effective, inheritable, permitted);
 }
 
-int security_capable(const struct cred *cred, struct user_namespace *ns,
-		     int cap)
+int security_capable(const struct cred *cred,
+		     struct user_namespace *ns,
+		     int cap,
+		     unsigned int opts)
 {
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
-}
-
-int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
-			     int cap)
-{
-	return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
+	return call_int_hook(capable, 0, cred, ns, cap, opts);
 }
 
 int security_quotactl(int cmds, int type, int id, struct super_block *sb)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a67459eb62d5..abcee2874bad 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
 
 /* Check whether a task is allowed to use a capability. */
 static int cred_has_capability(const struct cred *cred,
-			       int cap, int audit, bool initns)
+			       int cap, unsigned int opts, bool initns)
 {
 	struct common_audit_data ad;
 	struct av_decision avd;
@@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
 
 	rc = avc_has_perm_noaudit(&selinux_state,
 				  sid, sid, sclass, av, 0, &avd);
-	if (audit == SECURITY_CAP_AUDIT) {
+	if (!(opts & CAP_OPT_NOAUDIT)) {
 		int rc2 = avc_audit(&selinux_state,
 				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
@@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
  */
 
 static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
-			   int cap, int audit)
+			   int cap, unsigned int opts)
 {
-	return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
+	return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
 }
 
 static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
@@ -2392,7 +2392,7 @@ static int selinux_vm_enough_memory(struct mm_struct *mm, long pages)
 	int rc, cap_sys_admin = 0;
 
 	rc = cred_has_capability(current_cred(), CAP_SYS_ADMIN,
-				 SECURITY_CAP_NOAUDIT, true);
+				 CAP_OPT_NOAUDIT, true);
 	if (rc == 0)
 		cap_sys_admin = 1;
 
@@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
 static bool has_cap_mac_admin(bool audit)
 {
 	const struct cred *cred = current_cred();
-	int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
+	unsigned int opts = audit ? CAP_OPT_NONE : CAP_OPT_NOAUDIT;
 
-	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
+	if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
 		return false;
-	if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
+	if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
 		return false;
 	return true;
 }
@@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
 	case KDSKBENT:
 	case KDSKBSENT:
 		error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
-					    SECURITY_CAP_AUDIT, true);
+					    CAP_OPT_NONE, true);
 		break;
 
 	/* default case assumes that the command will go
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index 9a4c0ad46518..ae6c994d11d0 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
 	struct smack_known_list_elem *sklep;
 	int rc;
 
-	rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
+	rc = cap_capable(cred, &init_user_ns, cap, CAP_OPT_NONE);
 	if (rc)
 		return false;
 
-- 
2.20.1.97.g81188d93c3-goog


* Re: [PATCH v2] LSM: generalize flag passing to security_capable
  2019-01-07 23:13           ` [PATCH v2] " Kees Cook
  2019-01-08  0:10             ` [PATCH v4] " mortonm
@ 2019-01-08  0:10             ` Micah Morton
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-08  0:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Stephen Smalley, Serge E. Hallyn, Casey Schaufler,
	linux-security-module

On Mon, Jan 7, 2019 at 3:13 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Dec 18, 2018 at 2:37 PM <mortonm@chromium.org> wrote:
> >
> > From: Micah Morton <mortonm@chromium.org>
> >
> > This patch provides a general mechanism for passing flags to the
> > security_capable LSM hook. It replaces the specific 'audit' flag that is
> > used to tell security_capable whether it should log an audit message for
> > the given capability check. The reason for generalizing this flag
> > passing is so we can add an additional flag that signifies whether
> > security_capable is being called by a setid syscall (which is needed by
> > the proposed SafeSetID LSM).
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> > Changes since the last patch: Changed the code to use a bitmask instead
> > of a struct to represent the options passed to security_capable.
>
> FWIW, I too prefer this v2 patch. I don't see a reason to keep an
> "option-ified" function around if it's been generalized into a
> bitfield argument.
>
> >  include/linux/lsm_hooks.h              |  8 +++++---
> >  include/linux/security.h               | 28 +++++++++++++-------------
> >  kernel/capability.c                    | 22 +++++++++++---------
> >  kernel/seccomp.c                       |  4 ++--
> >  security/apparmor/capability.c         | 14 ++++++-------
> >  security/apparmor/include/capability.h |  2 +-
> >  security/apparmor/ipc.c                |  3 ++-
> >  security/apparmor/lsm.c                |  4 ++--
> >  security/commoncap.c                   | 17 ++++++++--------
> >  security/security.c                    | 14 +++++--------
> >  security/selinux/hooks.c               | 16 +++++++--------
> >  security/smack/smack_access.c          |  2 +-
> >  12 files changed, 69 insertions(+), 65 deletions(-)
> >
> > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > index aaeb7fa24dc4..ef955a44a782 100644
> > --- a/include/linux/lsm_hooks.h
> > +++ b/include/linux/lsm_hooks.h
> > @@ -1270,7 +1270,7 @@
> >   *     @cred contains the credentials to use.
> >   *     @ns contains the user namespace we want the capability in
> >   *     @cap contains the capability <include/linux/capability.h>.
> > - *     @audit contains whether to write an audit message or not
> > + *     @opts contains options for the capable check <include/linux/security.h>
> >   *     Return 0 if the capability is granted for @tsk.
> >   * @syslog:
> >   *     Check permission before accessing the kernel message ring or changing
> > @@ -1446,8 +1446,10 @@ union security_list_options {
> >                         const kernel_cap_t *effective,
> >                         const kernel_cap_t *inheritable,
> >                         const kernel_cap_t *permitted);
> > -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> > -                       int cap, int audit);
> > +       int (*capable)(const struct cred *cred,
> > +                       struct user_namespace *ns,
> > +                       int cap,
> > +                       unsigned int opts);
> >         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
> >         int (*quota_on)(struct dentry *dentry);
> >         int (*syslog)(int type);
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index d170a5b031f3..038e6779948c 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -54,9 +54,12 @@ struct xattr;
> >  struct xfrm_sec_ctx;
> >  struct mm_struct;
> >
> > +/* Default (no) options for the capable function */
> > +#define SECURITY_CAP_DEFAULT 0x0
>
> bikeshed: maybe we should call this CAP_OPT_* ? (Then this might be
> CAP_OPT_NONE?)

I agree, I like those names better.

>
> >  /* If capable should audit the security request */
> > -#define SECURITY_CAP_NOAUDIT 0
> > -#define SECURITY_CAP_AUDIT 1
> > +#define SECURITY_CAP_NOAUDIT 0x01
> > +/* If capable is being called by a setid function */
> > +#define SECURITY_CAP_INSETID 0x02
>
> For the 1 and 2 case, can you use BIT(0) and BIT(1) instead? This
> makes it clear this is a bitfield here (and does all the type magic
> for higher-order bits if we ever get there).

Done.
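
As a small illustration of the suggestion (a sketch only, with
hypothetical DEMO_OPT_* names and a userspace stand-in for the kernel's
BIT() macro): BIT(0) and BIT(1) expand to the same 0x01 and 0x02 values
as the literals in v2, but make it obvious that each option occupies one
bit. Note that the v4 patch as posted defines the two flags as BIT(1)
and BIT(2).

```c
/* Userspace stand-in for the kernel's BIT() macro (assumption). */
#define BIT(n) (1U << (n))

/* Hypothetical flag names, used only in this sketch. */
#define DEMO_OPT_A BIT(0)	/* 0x01 */
#define DEMO_OPT_B BIT(1)	/* 0x02 */

/* Returns nonzero when flag is set in opts, whatever else is set. */
static int demo_flag_set(unsigned int opts, unsigned int flag)
{
	return (opts & flag) != 0;
}
```

Since the flags do not overlap, they can be OR-ed together and tested
independently.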

>
> >  /* LSM Agnostic defines for sb_set_mnt_opts */
> >  #define SECURITY_LSM_NATIVE_LABELS     1
> > @@ -72,7 +75,7 @@ enum lsm_event {
> >
> >  /* These functions are in security/commoncap.c */
> >  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> > -                      int cap, int audit);
> > +                      int cap, unsigned int opts);
> >  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
> >  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
> >  extern int cap_ptrace_traceme(struct task_struct *parent);
> > @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
> >                     const kernel_cap_t *effective,
> >                     const kernel_cap_t *inheritable,
> >                     const kernel_cap_t *permitted);
> > -int security_capable(const struct cred *cred, struct user_namespace *ns,
> > -                       int cap);
> > -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> > -                            int cap);
> > +int security_capable(const struct cred *cred,
> > +                      struct user_namespace *ns,
> > +                      int cap,
> > +                      unsigned int opts);
> >  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
> >  int security_quota_on(struct dentry *dentry);
> >  int security_syslog(int type);
> > @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
> >  }
> >
> >  static inline int security_capable(const struct cred *cred,
> > -                                  struct user_namespace *ns, int cap)
> > +                                  struct user_namespace *ns,
> > +                                  int cap,
> > +                                  unsigned int opts)
> >  {
> > -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> > -}
> > -
> > -static inline int security_capable_noaudit(const struct cred *cred,
> > -                                          struct user_namespace *ns, int cap) {
> > -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> > +       return cap_capable(cred, ns, cap, opts);
> >  }
> >
> >  static inline int security_quotactl(int cmds, int type, int id,
> > diff --git a/kernel/capability.c b/kernel/capability.c
> > index 1e1c0236f55b..454576743b1b 100644
> > --- a/kernel/capability.c
> > +++ b/kernel/capability.c
> > @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
> >         int ret;
> >
> >         rcu_read_lock();
> > -       ret = security_capable(__task_cred(t), ns, cap);
> > +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_DEFAULT);
> >         rcu_read_unlock();
> >
> >         return (ret == 0);
> > @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>
> One argument for _keeping_ the _noaudit() function as in v3 is that
> keeping this one but removing the other seems inconsistent.

Hmm, yeah. Removing the function still seems like the lesser evil to me,
but I see what you mean.
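
For reference, the two options can be sketched in plain userspace C. This
is only an illustration of the trade-off, not kernel code: every toy_*
name, the flag values, and the audit counter are made up for the example.
Keeping a _noaudit() helper costs one thin inline wrapper; removing it
means each caller passes the NOAUDIT flag itself.

```c
#include <errno.h>

/* Assumed flag values for this sketch only. */
#define TOY_CAP_DEFAULT 0x0u
#define TOY_CAP_NOAUDIT 0x1u

/* Toy credential: one bit per capability number. */
struct toy_cred {
	unsigned int caps;
};

/* Counts how many checks would have produced an audit record. */
static int toy_audit_records;

/* Single entry point: the caller chooses audit behaviour through opts. */
static int toy_security_capable(const struct toy_cred *cred, int cap,
				unsigned int opts)
{
	int rc = (cred->caps & (1u << cap)) ? 0 : -EPERM;

	if (!(opts & TOY_CAP_NOAUDIT))
		toy_audit_records++;
	return rc;
}

/* Option A: keep a _noaudit() helper as a thin wrapper. */
static inline int toy_security_capable_noaudit(const struct toy_cred *cred,
					       int cap)
{
	return toy_security_capable(cred, cap, TOY_CAP_NOAUDIT);
}
```

Option B is simply to drop the wrapper and write
toy_security_capable(cred, cap, TOY_CAP_NOAUDIT) at each call site, which
is what the hunks below do with the real function.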


>
> >         int ret;
> >
> >         rcu_read_lock();
> > -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> > +       ret = security_capable(__task_cred(t), ns, cap, SECURITY_CAP_NOAUDIT);
> >         rcu_read_unlock();
> >
> >         return (ret == 0);
> > @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
> >         return has_ns_capability_noaudit(t, &init_user_ns, cap);
> >  }
> >
> > -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> > +static bool ns_capable_common(struct user_namespace *ns,
> > +                             int cap,
> > +                             unsigned int opts)
> >  {
> >         int capable;
> >
> > @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >                 BUG();
> >         }
> >
> > -       capable = audit ? security_capable(current_cred(), ns, cap) :
> > -                         security_capable_noaudit(current_cred(), ns, cap);
> > +       capable = security_capable(current_cred(), ns, cap, opts);
> >         if (capable == 0) {
> >                 current->flags |= PF_SUPERPRIV;
> >                 return true;
> > @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >   */
> >  bool ns_capable(struct user_namespace *ns, int cap)
> >  {
> > -       return ns_capable_common(ns, cap, true);
> > +       return ns_capable_common(ns, cap, SECURITY_CAP_DEFAULT);
> >  }
> >  EXPORT_SYMBOL(ns_capable);
> >
> > @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
> >   */
> >  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> >  {
> > -       return ns_capable_common(ns, cap, false);
> > +       return ns_capable_common(ns, cap, SECURITY_CAP_NOAUDIT);
> >  }
> >  EXPORT_SYMBOL(ns_capable_noaudit);
> >
> > @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
> >  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> >                      int cap)
> >  {
> > +
> >         if (WARN_ON_ONCE(!cap_valid(cap)))
> >                 return false;
> >
> > -       if (security_capable(file->f_cred, ns, cap) == 0)
> > +       if (security_capable(file->f_cred, ns, cap, SECURITY_CAP_DEFAULT) == 0)
> >                 return true;
> >
> >         return false;
> > @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
> >  {
> >         int ret = 0;  /* An absent tracer adds no restrictions */
> >         const struct cred *cred;
> > +
> >         rcu_read_lock();
> >         cred = rcu_dereference(tsk->ptracer_cred);
> >         if (cred)
> > -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> > +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> > +                                      SECURITY_CAP_NOAUDIT);
> >         rcu_read_unlock();
> >         return (ret == 0);
> >  }
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index f2ae2324c232..ddf615eb1bf7 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >          * behavior of privileged children.
> >          */
> >         if (!task_no_new_privs(current) &&
> > -           security_capable_noaudit(current_cred(), current_user_ns(),
> > -                                    CAP_SYS_ADMIN) != 0)
> > +           security_capable(current_cred(), current_user_ns(),
> > +                                    CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) != 0)
> >                 return ERR_PTR(-EACCES);
> >
> >         /* Allocate a new seccomp_filter */
> > diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> > index 253ef6e9d445..0f6dca54b66e 100644
> > --- a/security/apparmor/capability.c
> > +++ b/security/apparmor/capability.c
> > @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
> >   * profile_capable - test if profile allows use of capability @cap
> >   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
> >   * @cap: capability to test if allowed
> > - * @audit: whether an audit record should be generated
> > + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >   * @sa: audit data (MAY BE NULL indicating no auditing)
> >   *
> >   * Returns: 0 if allowed else -EPERM
> >   */
> > -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> > -                          struct common_audit_data *sa)
> > +static int profile_capable(struct aa_profile *profile, int cap,
> > +                          unsigned int opts, struct common_audit_data *sa)
> >  {
> >         int error;
> >
> > @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >         else
> >                 error = -EPERM;
> >
> > -       if (audit == SECURITY_CAP_NOAUDIT) {
> > +       if (opts & SECURITY_CAP_NOAUDIT) {
> >                 if (!COMPLAIN_MODE(profile))
> >                         return error;
> >                 /* audit the cap request in complain mode but note that it
> > @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >   * aa_capable - test permission to use capability
> >   * @label: label being tested for capability (NOT NULL)
> >   * @cap: capability to be tested
> > - * @audit: whether an audit record should be generated
> > + * @opts: SECURITY_CAP_NOAUDIT bit determines whether audit record is generated
> >   *
> >   * Look up capability in profile capability set.
> >   *
> >   * Returns: 0 on success, or else an error code.
> >   */
> > -int aa_capable(struct aa_label *label, int cap, int audit)
> > +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
> >  {
> >         struct aa_profile *profile;
> >         int error = 0;
> > @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
> >
> >         sa.u.cap = cap;
> >         error = fn_for_each_confined(label, profile,
> > -                       profile_capable(profile, cap, audit, &sa));
> > +                       profile_capable(profile, cap, opts, &sa));
> >
> >         return error;
> >  }
> > diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> > index e0304e2aeb7f..1b3663b6ab12 100644
> > --- a/security/apparmor/include/capability.h
> > +++ b/security/apparmor/include/capability.h
> > @@ -40,7 +40,7 @@ struct aa_caps {
> >
> >  extern struct aa_sfs_entry aa_sfs_entry_caps[];
> >
> > -int aa_capable(struct aa_label *label, int cap, int audit);
> > +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
> >
> >  static inline void aa_free_cap_rules(struct aa_caps *caps)
> >  {
> > diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> > index 527ea1557120..4a1da2313162 100644
> > --- a/security/apparmor/ipc.c
> > +++ b/security/apparmor/ipc.c
> > @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
> >         aad(sa)->label = &tracer->label;
> >         aad(sa)->peer = tracee;
> >         aad(sa)->request = 0;
> > -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> > +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> > +                                   SECURITY_CAP_DEFAULT);
> >
> >         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
> >  }
> > diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> > index 42446a216f3b..0bd817084fc1 100644
> > --- a/security/apparmor/lsm.c
> > +++ b/security/apparmor/lsm.c
> > @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
> >  }
> >
> >  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> > -                           int cap, int audit)
> > +                           int cap, unsigned int opts)
> >  {
> >         struct aa_label *label;
> >         int error = 0;
> >
> >         label = aa_get_newest_cred_label(cred);
> >         if (!unconfined(label))
> > -               error = aa_capable(label, cap, audit);
> > +               error = aa_capable(label, cap, opts);
> >         aa_put_label(label);
> >
> >         return error;
> > diff --git a/security/commoncap.c b/security/commoncap.c
> > index 232db019f051..3d8609192e17 100644
> > --- a/security/commoncap.c
> > +++ b/security/commoncap.c
> > @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
> >   * kernel's capable() and has_capability() returns 1 for this case.
> >   */
> >  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> > -               int cap, int audit)
> > +               int cap, unsigned int opts)
> >  {
> >         struct user_namespace *ns = targ_ns;
> >
> > @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
> >   */
> >  static inline int cap_inh_is_capped(void)
> >  {
> > -
> >         /* they are so limited unless the current task has the CAP_SETPCAP
> >          * capability
> >          */
> >         if (cap_capable(current_cred(), current_cred()->user_ns,
> > -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> > +                       CAP_SETPCAP, SECURITY_CAP_DEFAULT) == 0)
> >                 return 0;
> >         return 1;
> >  }
> > @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
> >                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
> >                     || (cap_capable(current_cred(),
> > -                                   current_cred()->user_ns, CAP_SETPCAP,
> > -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> > +                                   current_cred()->user_ns,
> > +                                   CAP_SETPCAP,
> > +                                   SECURITY_CAP_DEFAULT) != 0)         /*[4]*/
> >                         /*
> >                          * [1] no changing of bits that are locked
> >                          * [2] no unlocking of locks
> > @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >  {
> >         int cap_sys_admin = 0;
> >
> > -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> > -                       SECURITY_CAP_NOAUDIT) == 0)
> > +       if (cap_capable(current_cred(), &init_user_ns,
> > +                               CAP_SYS_ADMIN, SECURITY_CAP_NOAUDIT) == 0)
> >                 cap_sys_admin = 1;
> > +
> >         return cap_sys_admin;
> >  }
> >
> > @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
> >
> >         if (addr < dac_mmap_min_addr) {
> >                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> > -                                 SECURITY_CAP_AUDIT);
> > +                                 SECURITY_CAP_DEFAULT);
> >                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
> >                 if (ret == 0)
> >                         current->flags |= PF_SUPERPRIV;
> > diff --git a/security/security.c b/security/security.c
> > index d670136dda2c..d2334697797a 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
> >                                 effective, inheritable, permitted);
> >  }
> >
> > -int security_capable(const struct cred *cred, struct user_namespace *ns,
> > -                    int cap)
> > +int security_capable(const struct cred *cred,
> > +                    struct user_namespace *ns,
> > +                    int cap,
> > +                    unsigned int opts)
> >  {
> > -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> > -}
> > -
> > -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> > -                            int cap)
> > -{
> > -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> > +       return call_int_hook(capable, 0, cred, ns, cap, opts);
> >  }
> >
> >  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index a67459eb62d5..a4b2e49213de 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
> >
> >  /* Check whether a task is allowed to use a capability. */
> >  static int cred_has_capability(const struct cred *cred,
> > -                              int cap, int audit, bool initns)
> > +                              int cap, unsigned int opts, bool initns)
> >  {
> >         struct common_audit_data ad;
> >         struct av_decision avd;
> > @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
> >
> >         rc = avc_has_perm_noaudit(&selinux_state,
> >                                   sid, sid, sclass, av, 0, &avd);
> > -       if (audit == SECURITY_CAP_AUDIT) {
> > +       if (!(opts & SECURITY_CAP_NOAUDIT)) {
> >                 int rc2 = avc_audit(&selinux_state,
> >                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
> >                 if (rc2)
> > @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
> >   */
> >
> >  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> > -                          int cap, int audit)
> > +                          int cap, unsigned int opts)
> >  {
> > -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> > +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
> >  }
> >
> >  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> > @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
> >  static bool has_cap_mac_admin(bool audit)
> >  {
> >         const struct cred *cred = current_cred();
> > -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> > +       unsigned int opts = audit ? SECURITY_CAP_DEFAULT : SECURITY_CAP_NOAUDIT;
> >
> > -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> > +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
> >                 return false;
> > -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> > +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
> >                 return false;
> >         return true;
> >  }
> > @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
> >         case KDSKBENT:
> >         case KDSKBSENT:
> >                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> > -                                           SECURITY_CAP_AUDIT, true);
> > +                                           SECURITY_CAP_DEFAULT, true);
> >                 break;
> >
> >         /* default case assumes that the command will go
> > diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> > index 9a4c0ad46518..fac2a21aa7d4 100644
> > --- a/security/smack/smack_access.c
> > +++ b/security/smack/smack_access.c
> > @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
> >         struct smack_known_list_elem *sklep;
> >         int rc;
> >
> > -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> > +       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_DEFAULT);
> >         if (rc)
> >                 return false;
> >
> > --
> > 2.20.0.405.gbc1bbc6f85-goog
> >
>
> Otherwise, this looks fine to me.
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
>
> James, Stephen, thoughts?
>
> --
> Kees Cook


* Re: [PATCH v4] LSM: generalize flag passing to security_capable
  2019-01-08  0:10             ` [PATCH v4] " mortonm
@ 2019-01-08  0:20               ` Kees Cook
  2019-01-09 18:39                 ` Micah Morton
  2019-01-10 22:31               ` James Morris
  1 sibling, 1 reply; 88+ messages in thread
From: Kees Cook @ 2019-01-08  0:20 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Mon, Jan 7, 2019 at 4:11 PM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This patch provides a general mechanism for passing flags to the
> security_capable LSM hook. It replaces the specific 'audit' flag that
> tells security_capable whether to log an audit message for the given
> capability check. Generalizing the flag passing lets us add a further
> flag that signifies whether security_capable is being called by a setid
> syscall (which the proposed SafeSetID LSM needs).
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees
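
As a small illustration of why a bit mask generalizes better than the old
AUDIT/NOAUDIT value, here is a userspace sketch. The CAP_OPT_* values are
copied from the quoted patch; BIT() is a stand-in for the kernel macro,
and the two helper functions are hypothetical names used only for this
example. With the old scheme a hook compared the argument for equality,
which leaves no room for a second flag. With bits, each hook tests only
the flag it cares about.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr) (1UL << (nr))

/* Values as defined in the quoted patch. */
#define CAP_OPT_NONE    0x0
#define CAP_OPT_NOAUDIT BIT(1)
#define CAP_OPT_INSETID BIT(2)

/* Hypothetical helper: should this check emit an audit record? */
static bool opts_wants_audit(unsigned int opts)
{
	return !(opts & CAP_OPT_NOAUDIT);
}

/* Hypothetical helper: was the check issued from a setid path? */
static bool opts_from_setid(unsigned int opts)
{
	return (opts & CAP_OPT_INSETID) != 0;
}
```

A caller can then combine flags, for example
CAP_OPT_NOAUDIT | CAP_OPT_INSETID, and each helper still gives the right
answer for its own bit.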

> ---
> Changes since the last patch: Changed the names of SECURITY_CAP_* to
> CAP_OPT_* and started using the BIT() macro in the definition of the
> bit fields. This v4 patch, like the v2 one, removes the
> security_capable_noaudit function (since it seems like we're leaning
> toward that option).
>
>  include/linux/lsm_hooks.h              |  8 +++++---
>  include/linux/security.h               | 28 +++++++++++++-------------
>  kernel/capability.c                    | 22 +++++++++++---------
>  kernel/seccomp.c                       |  4 ++--
>  security/apparmor/capability.c         | 14 ++++++-------
>  security/apparmor/include/capability.h |  2 +-
>  security/apparmor/ipc.c                |  3 ++-
>  security/apparmor/lsm.c                |  4 ++--
>  security/apparmor/resource.c           |  2 +-
>  security/commoncap.c                   | 17 ++++++++--------
>  security/security.c                    | 14 +++++--------
>  security/selinux/hooks.c               | 18 ++++++++---------
>  security/smack/smack_access.c          |  2 +-
>  13 files changed, 71 insertions(+), 67 deletions(-)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index aaeb7fa24dc4..ef955a44a782 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1270,7 +1270,7 @@
>   *     @cred contains the credentials to use.
>   *     @ns contains the user namespace we want the capability in
>   *     @cap contains the capability <include/linux/capability.h>.
> - *     @audit contains whether to write an audit message or not
> + *     @opts contains options for the capable check <include/linux/security.h>
>   *     Return 0 if the capability is granted for @tsk.
>   * @syslog:
>   *     Check permission before accessing the kernel message ring or changing
> @@ -1446,8 +1446,10 @@ union security_list_options {
>                         const kernel_cap_t *effective,
>                         const kernel_cap_t *inheritable,
>                         const kernel_cap_t *permitted);
> -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> -                       int cap, int audit);
> +       int (*capable)(const struct cred *cred,
> +                       struct user_namespace *ns,
> +                       int cap,
> +                       unsigned int opts);
>         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
>         int (*quota_on)(struct dentry *dentry);
>         int (*syslog)(int type);
> diff --git a/include/linux/security.h b/include/linux/security.h
> index d170a5b031f3..0fe246bfd380 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -54,9 +54,12 @@ struct xattr;
>  struct xfrm_sec_ctx;
>  struct mm_struct;
>
> +/* Default (no) options for the capable function */
> +#define CAP_OPT_NONE 0x0
>  /* If capable should audit the security request */
> -#define SECURITY_CAP_NOAUDIT 0
> -#define SECURITY_CAP_AUDIT 1
> +#define CAP_OPT_NOAUDIT BIT(1)
> +/* If capable is being called by a setid function */
> +#define CAP_OPT_INSETID BIT(2)
>
>  /* LSM Agnostic defines for sb_set_mnt_opts */
>  #define SECURITY_LSM_NATIVE_LABELS     1
> @@ -72,7 +75,7 @@ enum lsm_event {
>
>  /* These functions are in security/commoncap.c */
>  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> -                      int cap, int audit);
> +                      int cap, unsigned int opts);
>  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
>  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  extern int cap_ptrace_traceme(struct task_struct *parent);
> @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
>                     const kernel_cap_t *effective,
>                     const kernel_cap_t *inheritable,
>                     const kernel_cap_t *permitted);
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                       int cap);
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap);
> +int security_capable(const struct cred *cred,
> +                      struct user_namespace *ns,
> +                      int cap,
> +                      unsigned int opts);
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
>  int security_quota_on(struct dentry *dentry);
>  int security_syslog(int type);
> @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
>  }
>
>  static inline int security_capable(const struct cred *cred,
> -                                  struct user_namespace *ns, int cap)
> +                                  struct user_namespace *ns,
> +                                  int cap,
> +                                  unsigned int opts)
>  {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -static inline int security_capable_noaudit(const struct cred *cred,
> -                                          struct user_namespace *ns, int cap) {
> -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return cap_capable(cred, ns, cap, opts);
>  }
>
>  static inline int security_quotactl(int cmds, int type, int id,
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 1e1c0236f55b..7718d7dcadc7 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NONE);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
>         int ret;
>
>         rcu_read_lock();
> -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> +       ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NOAUDIT);
>         rcu_read_unlock();
>
>         return (ret == 0);
> @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>         return has_ns_capability_noaudit(t, &init_user_ns, cap);
>  }
>
> -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> +static bool ns_capable_common(struct user_namespace *ns,
> +                             int cap,
> +                             unsigned int opts)
>  {
>         int capable;
>
> @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>                 BUG();
>         }
>
> -       capable = audit ? security_capable(current_cred(), ns, cap) :
> -                         security_capable_noaudit(current_cred(), ns, cap);
> +       capable = security_capable(current_cred(), ns, cap, opts);
>         if (capable == 0) {
>                 current->flags |= PF_SUPERPRIV;
>                 return true;
> @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
>   */
>  bool ns_capable(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, true);
> +       return ns_capable_common(ns, cap, CAP_OPT_NONE);
>  }
>  EXPORT_SYMBOL(ns_capable);
>
> @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
>   */
>  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
> -       return ns_capable_common(ns, cap, false);
> +       return ns_capable_common(ns, cap, CAP_OPT_NOAUDIT);
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
>  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
>                      int cap)
>  {
> +
>         if (WARN_ON_ONCE(!cap_valid(cap)))
>                 return false;
>
> -       if (security_capable(file->f_cred, ns, cap) == 0)
> +       if (security_capable(file->f_cred, ns, cap, CAP_OPT_NONE) == 0)
>                 return true;
>
>         return false;
> @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
>  {
>         int ret = 0;  /* An absent tracer adds no restrictions */
>         const struct cred *cred;
> +
>         rcu_read_lock();
>         cred = rcu_dereference(tsk->ptracer_cred);
>         if (cred)
> -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> +                                      CAP_OPT_NOAUDIT);
>         rcu_read_unlock();
>         return (ret == 0);
>  }
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index f2ae2324c232..2289c0befc08 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>          * behavior of privileged children.
>          */
>         if (!task_no_new_privs(current) &&
> -           security_capable_noaudit(current_cred(), current_user_ns(),
> -                                    CAP_SYS_ADMIN) != 0)
> +           security_capable(current_cred(), current_user_ns(),
> +                                    CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) != 0)
>                 return ERR_PTR(-EACCES);
>
>         /* Allocate a new seccomp_filter */
> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> index 253ef6e9d445..752f73980e30 100644
> --- a/security/apparmor/capability.c
> +++ b/security/apparmor/capability.c
> @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
>   * profile_capable - test if profile allows use of capability @cap
>   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
>   * @cap: capability to test if allowed
> - * @audit: whether an audit record should be generated
> + * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
>   * @sa: audit data (MAY BE NULL indicating no auditing)
>   *
>   * Returns: 0 if allowed else -EPERM
>   */
> -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> -                          struct common_audit_data *sa)
> +static int profile_capable(struct aa_profile *profile, int cap,
> +                          unsigned int opts, struct common_audit_data *sa)
>  {
>         int error;
>
> @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>         else
>                 error = -EPERM;
>
> -       if (audit == SECURITY_CAP_NOAUDIT) {
> +       if (opts & CAP_OPT_NOAUDIT) {
>                 if (!COMPLAIN_MODE(profile))
>                         return error;
>                 /* audit the cap request in complain mode but note that it
> @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
>   * aa_capable - test permission to use capability
>   * @label: label being tested for capability (NOT NULL)
>   * @cap: capability to be tested
> - * @audit: whether an audit record should be generated
> + * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
>   *
>   * Look up capability in profile capability set.
>   *
>   * Returns: 0 on success, or else an error code.
>   */
> -int aa_capable(struct aa_label *label, int cap, int audit)
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
>  {
>         struct aa_profile *profile;
>         int error = 0;
> @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
>
>         sa.u.cap = cap;
>         error = fn_for_each_confined(label, profile,
> -                       profile_capable(profile, cap, audit, &sa));
> +                       profile_capable(profile, cap, opts, &sa));
>
>         return error;
>  }
> diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> index e0304e2aeb7f..1b3663b6ab12 100644
> --- a/security/apparmor/include/capability.h
> +++ b/security/apparmor/include/capability.h
> @@ -40,7 +40,7 @@ struct aa_caps {
>
>  extern struct aa_sfs_entry aa_sfs_entry_caps[];
>
> -int aa_capable(struct aa_label *label, int cap, int audit);
> +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
>
>  static inline void aa_free_cap_rules(struct aa_caps *caps)
>  {
> diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> index 527ea1557120..aacd1e95cb59 100644
> --- a/security/apparmor/ipc.c
> +++ b/security/apparmor/ipc.c
> @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
>         aad(sa)->label = &tracer->label;
>         aad(sa)->peer = tracee;
>         aad(sa)->request = 0;
> -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> +                                   CAP_OPT_NONE);
>
>         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
>  }
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 42446a216f3b..0bd817084fc1 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
>  }
>
>  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> -                           int cap, int audit)
> +                           int cap, unsigned int opts)
>  {
>         struct aa_label *label;
>         int error = 0;
>
>         label = aa_get_newest_cred_label(cred);
>         if (!unconfined(label))
> -               error = aa_capable(label, cap, audit);
> +               error = aa_capable(label, cap, opts);
>         aa_put_label(label);
>
>         return error;
> diff --git a/security/apparmor/resource.c b/security/apparmor/resource.c
> index 95fd26d09757..552ed09cb47e 100644
> --- a/security/apparmor/resource.c
> +++ b/security/apparmor/resource.c
> @@ -124,7 +124,7 @@ int aa_task_setrlimit(struct aa_label *label, struct task_struct *task,
>          */
>
>         if (label != peer &&
> -           aa_capable(label, CAP_SYS_RESOURCE, SECURITY_CAP_NOAUDIT) != 0)
> +           aa_capable(label, CAP_SYS_RESOURCE, CAP_OPT_NOAUDIT) != 0)
>                 error = fn_for_each(label, profile,
>                                 audit_resource(profile, resource,
>                                                new_rlim->rlim_max, peer,
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 232db019f051..13f03622f694 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
>   * kernel's capable() and has_capability() returns 1 for this case.
>   */
>  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> -               int cap, int audit)
> +               int cap, unsigned int opts)
>  {
>         struct user_namespace *ns = targ_ns;
>
> @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
>   */
>  static inline int cap_inh_is_capped(void)
>  {
> -
>         /* they are so limited unless the current task has the CAP_SETPCAP
>          * capability
>          */
>         if (cap_capable(current_cred(), current_cred()->user_ns,
> -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> +                       CAP_SETPCAP, CAP_OPT_NONE) == 0)
>                 return 0;
>         return 1;
>  }
> @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
>                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
>                     || (cap_capable(current_cred(),
> -                                   current_cred()->user_ns, CAP_SETPCAP,
> -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> +                                   current_cred()->user_ns,
> +                                   CAP_SETPCAP,
> +                                   CAP_OPT_NONE) != 0)                 /*[4]*/
>                         /*
>                          * [1] no changing of bits that are locked
>                          * [2] no unlocking of locks
> @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
>  {
>         int cap_sys_admin = 0;
>
> -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> -                       SECURITY_CAP_NOAUDIT) == 0)
> +       if (cap_capable(current_cred(), &init_user_ns,
> +                               CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) == 0)
>                 cap_sys_admin = 1;
> +
>         return cap_sys_admin;
>  }
>
> @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
>
>         if (addr < dac_mmap_min_addr) {
>                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> -                                 SECURITY_CAP_AUDIT);
> +                                 CAP_OPT_NONE);
>                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
>                 if (ret == 0)
>                         current->flags |= PF_SUPERPRIV;
> diff --git a/security/security.c b/security/security.c
> index d670136dda2c..d2334697797a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
>                                 effective, inheritable, permitted);
>  }
>
> -int security_capable(const struct cred *cred, struct user_namespace *ns,
> -                    int cap)
> +int security_capable(const struct cred *cred,
> +                    struct user_namespace *ns,
> +                    int cap,
> +                    unsigned int opts)
>  {
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> -}
> -
> -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> -                            int cap)
> -{
> -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> +       return call_int_hook(capable, 0, cred, ns, cap, opts);
>  }
>
>  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index a67459eb62d5..abcee2874bad 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
>
>  /* Check whether a task is allowed to use a capability. */
>  static int cred_has_capability(const struct cred *cred,
> -                              int cap, int audit, bool initns)
> +                              int cap, unsigned int opts, bool initns)
>  {
>         struct common_audit_data ad;
>         struct av_decision avd;
> @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
>
>         rc = avc_has_perm_noaudit(&selinux_state,
>                                   sid, sid, sclass, av, 0, &avd);
> -       if (audit == SECURITY_CAP_AUDIT) {
> +       if (!(opts & CAP_OPT_NOAUDIT)) {
>                 int rc2 = avc_audit(&selinux_state,
>                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
>                 if (rc2)
> @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
>   */
>
>  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> -                          int cap, int audit)
> +                          int cap, unsigned int opts)
>  {
> -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
>  }
>
>  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> @@ -2392,7 +2392,7 @@ static int selinux_vm_enough_memory(struct mm_struct *mm, long pages)
>         int rc, cap_sys_admin = 0;
>
>         rc = cred_has_capability(current_cred(), CAP_SYS_ADMIN,
> -                                SECURITY_CAP_NOAUDIT, true);
> +                                CAP_OPT_NOAUDIT, true);
>         if (rc == 0)
>                 cap_sys_admin = 1;
>
> @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
>  static bool has_cap_mac_admin(bool audit)
>  {
>         const struct cred *cred = current_cred();
> -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> +       unsigned int opts = audit ? CAP_OPT_NONE : CAP_OPT_NOAUDIT;
>
> -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
>                 return false;
> -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
>                 return false;
>         return true;
>  }
> @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
>         case KDSKBENT:
>         case KDSKBSENT:
>                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> -                                           SECURITY_CAP_AUDIT, true);
> +                                           CAP_OPT_NONE, true);
>                 break;
>
>         /* default case assumes that the command will go
> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index 9a4c0ad46518..ae6c994d11d0 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
>         struct smack_known_list_elem *sklep;
>         int rc;
>
> -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> +       rc = cap_capable(cred, &init_user_ns, cap, CAP_OPT_NONE);
>         if (rc)
>                 return false;
>
> --
> 2.20.1.97.g81188d93c3-goog
>


-- 
Kees Cook
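
The call-site conversion in the quoted patch can be sketched outside the kernel as follows. This is only an illustration under stated assumptions, not the kernel implementation: every `demo_*` name is hypothetical, `BIT()` is redefined locally so the snippet builds in userspace, and a counter stands in for the audit log. It shows the two old entry points (`security_capable` and `security_capable_noaudit`) collapsing into one function that takes `opts`, and how an old-style `audit` boolean maps onto the new bits, as `has_cap_mac_admin()` does in the diff.

```c
#include <stdbool.h>

/* Redefined locally so the sketch builds outside the kernel tree. */
#define BIT(n)          (1U << (n))

/* Same values as the definitions the patch adds to include/linux/security.h. */
#define CAP_OPT_NONE    0x0
#define CAP_OPT_NOAUDIT BIT(1)

/* Hypothetical counter standing in for the audit log. */
static int demo_audit_records;

/*
 * Hypothetical stand-in for the single security_capable() entry point.
 * Returns 0 when the capability is granted and -1 otherwise, and records
 * an audit event unless the caller passed CAP_OPT_NOAUDIT.
 */
static inline int demo_security_capable(bool has_cap, unsigned int opts)
{
	if (!(opts & CAP_OPT_NOAUDIT))
		demo_audit_records++;
	return has_cap ? 0 : -1;
}

/* Maps an old-style "audit" boolean onto the new option bits. */
static inline unsigned int demo_audit_to_opts(bool audit)
{
	return audit ? CAP_OPT_NONE : CAP_OPT_NOAUDIT;
}
```

Note that auditing is the default here: a caller opts out by setting a bit rather than opting in, which is why the former `SECURITY_CAP_AUDIT` call sites become `CAP_OPT_NONE`.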


* Re: [PATCH v4] LSM: generalize flag passing to security_capable
  2019-01-08  0:20               ` Kees Cook
@ 2019-01-09 18:39                 ` Micah Morton
  0 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-09 18:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

Are there any further comments on this since Kees' review? If not, it
seems like this should be ready to merge.

On Mon, Jan 7, 2019 at 4:20 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Jan 7, 2019 at 4:11 PM <mortonm@chromium.org> wrote:
> >
> > From: Micah Morton <mortonm@chromium.org>
> >
> > This patch provides a general mechanism for passing flags to the
> > security_capable LSM hook. It replaces the specific 'audit' flag that is
> > used to tell security_capable whether it should log an audit message for
> > the given capability check. The reason for generalizing this flag
> > passing is so we can add an additional flag that signifies whether
> > security_capable is being called by a setid syscall (which is needed by
> > the proposed SafeSetID LSM).
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
>
> -Kees
>
> > ---
> > Changes since the last patch: Changed the names of SECURITY_CAP_* to
> > CAP_OPT_* and started using the BIT() macro in the definition of the
> > bit fields. This v4 patch, like the v2 one, removes the
> > security_capable_noaudit function (since it seems like we're leaning
> > toward that option).
> >
> >  include/linux/lsm_hooks.h              |  8 +++++---
> >  include/linux/security.h               | 28 +++++++++++++-------------
> >  kernel/capability.c                    | 22 +++++++++++---------
> >  kernel/seccomp.c                       |  4 ++--
> >  security/apparmor/capability.c         | 14 ++++++-------
> >  security/apparmor/include/capability.h |  2 +-
> >  security/apparmor/ipc.c                |  3 ++-
> >  security/apparmor/lsm.c                |  4 ++--
> >  security/apparmor/resource.c           |  2 +-
> >  security/commoncap.c                   | 17 ++++++++--------
> >  security/security.c                    | 14 +++++--------
> >  security/selinux/hooks.c               | 18 ++++++++---------
> >  security/smack/smack_access.c          |  2 +-
> >  13 files changed, 71 insertions(+), 67 deletions(-)
> >
> > diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> > index aaeb7fa24dc4..ef955a44a782 100644
> > --- a/include/linux/lsm_hooks.h
> > +++ b/include/linux/lsm_hooks.h
> > @@ -1270,7 +1270,7 @@
> >   *     @cred contains the credentials to use.
> >   *     @ns contains the user namespace we want the capability in
> >   *     @cap contains the capability <include/linux/capability.h>.
> > - *     @audit contains whether to write an audit message or not
> > + *     @opts contains options for the capable check <include/linux/security.h>
> >   *     Return 0 if the capability is granted for @tsk.
> >   * @syslog:
> >   *     Check permission before accessing the kernel message ring or changing
> > @@ -1446,8 +1446,10 @@ union security_list_options {
> >                         const kernel_cap_t *effective,
> >                         const kernel_cap_t *inheritable,
> >                         const kernel_cap_t *permitted);
> > -       int (*capable)(const struct cred *cred, struct user_namespace *ns,
> > -                       int cap, int audit);
> > +       int (*capable)(const struct cred *cred,
> > +                       struct user_namespace *ns,
> > +                       int cap,
> > +                       unsigned int opts);
> >         int (*quotactl)(int cmds, int type, int id, struct super_block *sb);
> >         int (*quota_on)(struct dentry *dentry);
> >         int (*syslog)(int type);
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index d170a5b031f3..0fe246bfd380 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -54,9 +54,12 @@ struct xattr;
> >  struct xfrm_sec_ctx;
> >  struct mm_struct;
> >
> > +/* Default (no) options for the capable function */
> > +#define CAP_OPT_NONE 0x0
> >  /* If capable should audit the security request */
> > -#define SECURITY_CAP_NOAUDIT 0
> > -#define SECURITY_CAP_AUDIT 1
> > +#define CAP_OPT_NOAUDIT BIT(1)
> > +/* If capable is being called by a setid function */
> > +#define CAP_OPT_INSETID BIT(2)
> >
> >  /* LSM Agnostic defines for sb_set_mnt_opts */
> >  #define SECURITY_LSM_NATIVE_LABELS     1
> > @@ -72,7 +75,7 @@ enum lsm_event {
> >
> >  /* These functions are in security/commoncap.c */
> >  extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
> > -                      int cap, int audit);
> > +                      int cap, unsigned int opts);
> >  extern int cap_settime(const struct timespec64 *ts, const struct timezone *tz);
> >  extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
> >  extern int cap_ptrace_traceme(struct task_struct *parent);
> > @@ -233,10 +236,10 @@ int security_capset(struct cred *new, const struct cred *old,
> >                     const kernel_cap_t *effective,
> >                     const kernel_cap_t *inheritable,
> >                     const kernel_cap_t *permitted);
> > -int security_capable(const struct cred *cred, struct user_namespace *ns,
> > -                       int cap);
> > -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> > -                            int cap);
> > +int security_capable(const struct cred *cred,
> > +                      struct user_namespace *ns,
> > +                      int cap,
> > +                      unsigned int opts);
> >  int security_quotactl(int cmds, int type, int id, struct super_block *sb);
> >  int security_quota_on(struct dentry *dentry);
> >  int security_syslog(int type);
> > @@ -492,14 +495,11 @@ static inline int security_capset(struct cred *new,
> >  }
> >
> >  static inline int security_capable(const struct cred *cred,
> > -                                  struct user_namespace *ns, int cap)
> > +                                  struct user_namespace *ns,
> > +                                  int cap,
> > +                                  unsigned int opts)
> >  {
> > -       return cap_capable(cred, ns, cap, SECURITY_CAP_AUDIT);
> > -}
> > -
> > -static inline int security_capable_noaudit(const struct cred *cred,
> > -                                          struct user_namespace *ns, int cap) {
> > -       return cap_capable(cred, ns, cap, SECURITY_CAP_NOAUDIT);
> > +       return cap_capable(cred, ns, cap, opts);
> >  }
> >
> >  static inline int security_quotactl(int cmds, int type, int id,
> > diff --git a/kernel/capability.c b/kernel/capability.c
> > index 1e1c0236f55b..7718d7dcadc7 100644
> > --- a/kernel/capability.c
> > +++ b/kernel/capability.c
> > @@ -299,7 +299,7 @@ bool has_ns_capability(struct task_struct *t,
> >         int ret;
> >
> >         rcu_read_lock();
> > -       ret = security_capable(__task_cred(t), ns, cap);
> > +       ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NONE);
> >         rcu_read_unlock();
> >
> >         return (ret == 0);
> > @@ -340,7 +340,7 @@ bool has_ns_capability_noaudit(struct task_struct *t,
> >         int ret;
> >
> >         rcu_read_lock();
> > -       ret = security_capable_noaudit(__task_cred(t), ns, cap);
> > +       ret = security_capable(__task_cred(t), ns, cap, CAP_OPT_NOAUDIT);
> >         rcu_read_unlock();
> >
> >         return (ret == 0);
> > @@ -363,7 +363,9 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
> >         return has_ns_capability_noaudit(t, &init_user_ns, cap);
> >  }
> >
> > -static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> > +static bool ns_capable_common(struct user_namespace *ns,
> > +                             int cap,
> > +                             unsigned int opts)
> >  {
> >         int capable;
> >
> > @@ -372,8 +374,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >                 BUG();
> >         }
> >
> > -       capable = audit ? security_capable(current_cred(), ns, cap) :
> > -                         security_capable_noaudit(current_cred(), ns, cap);
> > +       capable = security_capable(current_cred(), ns, cap, opts);
> >         if (capable == 0) {
> >                 current->flags |= PF_SUPERPRIV;
> >                 return true;
> > @@ -394,7 +395,7 @@ static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
> >   */
> >  bool ns_capable(struct user_namespace *ns, int cap)
> >  {
> > -       return ns_capable_common(ns, cap, true);
> > +       return ns_capable_common(ns, cap, CAP_OPT_NONE);
> >  }
> >  EXPORT_SYMBOL(ns_capable);
> >
> > @@ -412,7 +413,7 @@ EXPORT_SYMBOL(ns_capable);
> >   */
> >  bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> >  {
> > -       return ns_capable_common(ns, cap, false);
> > +       return ns_capable_common(ns, cap, CAP_OPT_NOAUDIT);
> >  }
> >  EXPORT_SYMBOL(ns_capable_noaudit);
> >
> > @@ -448,10 +449,11 @@ EXPORT_SYMBOL(capable);
> >  bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> >                      int cap)
> >  {
> > +
> >         if (WARN_ON_ONCE(!cap_valid(cap)))
> >                 return false;
> >
> > -       if (security_capable(file->f_cred, ns, cap) == 0)
> > +       if (security_capable(file->f_cred, ns, cap, CAP_OPT_NONE) == 0)
> >                 return true;
> >
> >         return false;
> > @@ -500,10 +502,12 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
> >  {
> >         int ret = 0;  /* An absent tracer adds no restrictions */
> >         const struct cred *cred;
> > +
> >         rcu_read_lock();
> >         cred = rcu_dereference(tsk->ptracer_cred);
> >         if (cred)
> > -               ret = security_capable_noaudit(cred, ns, CAP_SYS_PTRACE);
> > +               ret = security_capable(cred, ns, CAP_SYS_PTRACE,
> > +                                      CAP_OPT_NOAUDIT);
> >         rcu_read_unlock();
> >         return (ret == 0);
> >  }
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index f2ae2324c232..2289c0befc08 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -383,8 +383,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >          * behavior of privileged children.
> >          */
> >         if (!task_no_new_privs(current) &&
> > -           security_capable_noaudit(current_cred(), current_user_ns(),
> > -                                    CAP_SYS_ADMIN) != 0)
> > +           security_capable(current_cred(), current_user_ns(),
> > +                                    CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) != 0)
> >                 return ERR_PTR(-EACCES);
> >
> >         /* Allocate a new seccomp_filter */
> > diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> > index 253ef6e9d445..752f73980e30 100644
> > --- a/security/apparmor/capability.c
> > +++ b/security/apparmor/capability.c
> > @@ -110,13 +110,13 @@ static int audit_caps(struct common_audit_data *sa, struct aa_profile *profile,
> >   * profile_capable - test if profile allows use of capability @cap
> >   * @profile: profile being enforced    (NOT NULL, NOT unconfined)
> >   * @cap: capability to test if allowed
> > - * @audit: whether an audit record should be generated
> > + * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
> >   * @sa: audit data (MAY BE NULL indicating no auditing)
> >   *
> >   * Returns: 0 if allowed else -EPERM
> >   */
> > -static int profile_capable(struct aa_profile *profile, int cap, int audit,
> > -                          struct common_audit_data *sa)
> > +static int profile_capable(struct aa_profile *profile, int cap,
> > +                          unsigned int opts, struct common_audit_data *sa)
> >  {
> >         int error;
> >
> > @@ -126,7 +126,7 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >         else
> >                 error = -EPERM;
> >
> > -       if (audit == SECURITY_CAP_NOAUDIT) {
> > +       if (opts & CAP_OPT_NOAUDIT) {
> >                 if (!COMPLAIN_MODE(profile))
> >                         return error;
> >                 /* audit the cap request in complain mode but note that it
> > @@ -142,13 +142,13 @@ static int profile_capable(struct aa_profile *profile, int cap, int audit,
> >   * aa_capable - test permission to use capability
> >   * @label: label being tested for capability (NOT NULL)
> >   * @cap: capability to be tested
> > - * @audit: whether an audit record should be generated
> > + * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
> >   *
> >   * Look up capability in profile capability set.
> >   *
> >   * Returns: 0 on success, or else an error code.
> >   */
> > -int aa_capable(struct aa_label *label, int cap, int audit)
> > +int aa_capable(struct aa_label *label, int cap, unsigned int opts)
> >  {
> >         struct aa_profile *profile;
> >         int error = 0;
> > @@ -156,7 +156,7 @@ int aa_capable(struct aa_label *label, int cap, int audit)
> >
> >         sa.u.cap = cap;
> >         error = fn_for_each_confined(label, profile,
> > -                       profile_capable(profile, cap, audit, &sa));
> > +                       profile_capable(profile, cap, opts, &sa));
> >
> >         return error;
> >  }
> > diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
> > index e0304e2aeb7f..1b3663b6ab12 100644
> > --- a/security/apparmor/include/capability.h
> > +++ b/security/apparmor/include/capability.h
> > @@ -40,7 +40,7 @@ struct aa_caps {
> >
> >  extern struct aa_sfs_entry aa_sfs_entry_caps[];
> >
> > -int aa_capable(struct aa_label *label, int cap, int audit);
> > +int aa_capable(struct aa_label *label, int cap, unsigned int opts);
> >
> >  static inline void aa_free_cap_rules(struct aa_caps *caps)
> >  {
> > diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> > index 527ea1557120..aacd1e95cb59 100644
> > --- a/security/apparmor/ipc.c
> > +++ b/security/apparmor/ipc.c
> > @@ -107,7 +107,8 @@ static int profile_tracer_perm(struct aa_profile *tracer,
> >         aad(sa)->label = &tracer->label;
> >         aad(sa)->peer = tracee;
> >         aad(sa)->request = 0;
> > -       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
> > +       aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE,
> > +                                   CAP_OPT_NONE);
> >
> >         return aa_audit(AUDIT_APPARMOR_AUTO, tracer, sa, audit_ptrace_cb);
> >  }
> > diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> > index 42446a216f3b..0bd817084fc1 100644
> > --- a/security/apparmor/lsm.c
> > +++ b/security/apparmor/lsm.c
> > @@ -176,14 +176,14 @@ static int apparmor_capget(struct task_struct *target, kernel_cap_t *effective,
> >  }
> >
> >  static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
> > -                           int cap, int audit)
> > +                           int cap, unsigned int opts)
> >  {
> >         struct aa_label *label;
> >         int error = 0;
> >
> >         label = aa_get_newest_cred_label(cred);
> >         if (!unconfined(label))
> > -               error = aa_capable(label, cap, audit);
> > +               error = aa_capable(label, cap, opts);
> >         aa_put_label(label);
> >
> >         return error;
> > diff --git a/security/apparmor/resource.c b/security/apparmor/resource.c
> > index 95fd26d09757..552ed09cb47e 100644
> > --- a/security/apparmor/resource.c
> > +++ b/security/apparmor/resource.c
> > @@ -124,7 +124,7 @@ int aa_task_setrlimit(struct aa_label *label, struct task_struct *task,
> >          */
> >
> >         if (label != peer &&
> > -           aa_capable(label, CAP_SYS_RESOURCE, SECURITY_CAP_NOAUDIT) != 0)
> > +           aa_capable(label, CAP_SYS_RESOURCE, CAP_OPT_NOAUDIT) != 0)
> >                 error = fn_for_each(label, profile,
> >                                 audit_resource(profile, resource,
> >                                                new_rlim->rlim_max, peer,
> > diff --git a/security/commoncap.c b/security/commoncap.c
> > index 232db019f051..13f03622f694 100644
> > --- a/security/commoncap.c
> > +++ b/security/commoncap.c
> > @@ -68,7 +68,7 @@ static void warn_setuid_and_fcaps_mixed(const char *fname)
> >   * kernel's capable() and has_capability() returns 1 for this case.
> >   */
> >  int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
> > -               int cap, int audit)
> > +               int cap, unsigned int opts)
> >  {
> >         struct user_namespace *ns = targ_ns;
> >
> > @@ -222,12 +222,11 @@ int cap_capget(struct task_struct *target, kernel_cap_t *effective,
> >   */
> >  static inline int cap_inh_is_capped(void)
> >  {
> > -
> >         /* they are so limited unless the current task has the CAP_SETPCAP
> >          * capability
> >          */
> >         if (cap_capable(current_cred(), current_cred()->user_ns,
> > -                       CAP_SETPCAP, SECURITY_CAP_AUDIT) == 0)
> > +                       CAP_SETPCAP, CAP_OPT_NONE) == 0)
> >                 return 0;
> >         return 1;
> >  }
> > @@ -1208,8 +1207,9 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >                     || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))   /*[2]*/
> >                     || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))   /*[3]*/
> >                     || (cap_capable(current_cred(),
> > -                                   current_cred()->user_ns, CAP_SETPCAP,
> > -                                   SECURITY_CAP_AUDIT) != 0)           /*[4]*/
> > +                                   current_cred()->user_ns,
> > +                                   CAP_SETPCAP,
> > +                                   CAP_OPT_NONE) != 0)                 /*[4]*/
> >                         /*
> >                          * [1] no changing of bits that are locked
> >                          * [2] no unlocking of locks
> > @@ -1304,9 +1304,10 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages)
> >  {
> >         int cap_sys_admin = 0;
> >
> > -       if (cap_capable(current_cred(), &init_user_ns, CAP_SYS_ADMIN,
> > -                       SECURITY_CAP_NOAUDIT) == 0)
> > +       if (cap_capable(current_cred(), &init_user_ns,
> > +                               CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) == 0)
> >                 cap_sys_admin = 1;
> > +
> >         return cap_sys_admin;
> >  }
> >
> > @@ -1325,7 +1326,7 @@ int cap_mmap_addr(unsigned long addr)
> >
> >         if (addr < dac_mmap_min_addr) {
> >                 ret = cap_capable(current_cred(), &init_user_ns, CAP_SYS_RAWIO,
> > -                                 SECURITY_CAP_AUDIT);
> > +                                 CAP_OPT_NONE);
> >                 /* set PF_SUPERPRIV if it turns out we allow the low mmap */
> >                 if (ret == 0)
> >                         current->flags |= PF_SUPERPRIV;
> > diff --git a/security/security.c b/security/security.c
> > index d670136dda2c..d2334697797a 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -294,16 +294,12 @@ int security_capset(struct cred *new, const struct cred *old,
> >                                 effective, inheritable, permitted);
> >  }
> >
> > -int security_capable(const struct cred *cred, struct user_namespace *ns,
> > -                    int cap)
> > +int security_capable(const struct cred *cred,
> > +                    struct user_namespace *ns,
> > +                    int cap,
> > +                    unsigned int opts)
> >  {
> > -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_AUDIT);
> > -}
> > -
> > -int security_capable_noaudit(const struct cred *cred, struct user_namespace *ns,
> > -                            int cap)
> > -{
> > -       return call_int_hook(capable, 0, cred, ns, cap, SECURITY_CAP_NOAUDIT);
> > +       return call_int_hook(capable, 0, cred, ns, cap, opts);
> >  }
> >
> >  int security_quotactl(int cmds, int type, int id, struct super_block *sb)
> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index a67459eb62d5..abcee2874bad 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -1769,7 +1769,7 @@ static inline u32 signal_to_av(int sig)
> >
> >  /* Check whether a task is allowed to use a capability. */
> >  static int cred_has_capability(const struct cred *cred,
> > -                              int cap, int audit, bool initns)
> > +                              int cap, unsigned int opts, bool initns)
> >  {
> >         struct common_audit_data ad;
> >         struct av_decision avd;
> > @@ -1796,7 +1796,7 @@ static int cred_has_capability(const struct cred *cred,
> >
> >         rc = avc_has_perm_noaudit(&selinux_state,
> >                                   sid, sid, sclass, av, 0, &avd);
> > -       if (audit == SECURITY_CAP_AUDIT) {
> > +       if (!(opts & CAP_OPT_NOAUDIT)) {
> >                 int rc2 = avc_audit(&selinux_state,
> >                                     sid, sid, sclass, av, &avd, rc, &ad, 0);
> >                 if (rc2)
> > @@ -2316,9 +2316,9 @@ static int selinux_capset(struct cred *new, const struct cred *old,
> >   */
> >
> >  static int selinux_capable(const struct cred *cred, struct user_namespace *ns,
> > -                          int cap, int audit)
> > +                          int cap, unsigned int opts)
> >  {
> > -       return cred_has_capability(cred, cap, audit, ns == &init_user_ns);
> > +       return cred_has_capability(cred, cap, opts, ns == &init_user_ns);
> >  }
> >
> >  static int selinux_quotactl(int cmds, int type, int id, struct super_block *sb)
> > @@ -2392,7 +2392,7 @@ static int selinux_vm_enough_memory(struct mm_struct *mm, long pages)
> >         int rc, cap_sys_admin = 0;
> >
> >         rc = cred_has_capability(current_cred(), CAP_SYS_ADMIN,
> > -                                SECURITY_CAP_NOAUDIT, true);
> > +                                CAP_OPT_NOAUDIT, true);
> >         if (rc == 0)
> >                 cap_sys_admin = 1;
> >
> > @@ -3245,11 +3245,11 @@ static int selinux_inode_getattr(const struct path *path)
> >  static bool has_cap_mac_admin(bool audit)
> >  {
> >         const struct cred *cred = current_cred();
> > -       int cap_audit = audit ? SECURITY_CAP_AUDIT : SECURITY_CAP_NOAUDIT;
> > +       unsigned int opts = audit ? CAP_OPT_NONE : CAP_OPT_NOAUDIT;
> >
> > -       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, cap_audit))
> > +       if (cap_capable(cred, &init_user_ns, CAP_MAC_ADMIN, opts))
> >                 return false;
> > -       if (cred_has_capability(cred, CAP_MAC_ADMIN, cap_audit, true))
> > +       if (cred_has_capability(cred, CAP_MAC_ADMIN, opts, true))
> >                 return false;
> >         return true;
> >  }
> > @@ -3649,7 +3649,7 @@ static int selinux_file_ioctl(struct file *file, unsigned int cmd,
> >         case KDSKBENT:
> >         case KDSKBSENT:
> >                 error = cred_has_capability(cred, CAP_SYS_TTY_CONFIG,
> > -                                           SECURITY_CAP_AUDIT, true);
> > +                                           CAP_OPT_NONE, true);
> >                 break;
> >
> >         /* default case assumes that the command will go
> > diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> > index 9a4c0ad46518..ae6c994d11d0 100644
> > --- a/security/smack/smack_access.c
> > +++ b/security/smack/smack_access.c
> > @@ -640,7 +640,7 @@ bool smack_privileged_cred(int cap, const struct cred *cred)
> >         struct smack_known_list_elem *sklep;
> >         int rc;
> >
> > -       rc = cap_capable(cred, &init_user_ns, cap, SECURITY_CAP_AUDIT);
> > +       rc = cap_capable(cred, &init_user_ns, cap, CAP_OPT_NONE);
> >         if (rc)
> >                 return false;
> >
> > --
> > 2.20.1.97.g81188d93c3-goog
> >
>
>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v4] LSM: generalize flag passing to security_capable
  2019-01-08  0:10             ` [PATCH v4] " mortonm
  2019-01-08  0:20               ` Kees Cook
@ 2019-01-10 22:31               ` James Morris
  2019-01-10 23:03                 ` Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: James Morris @ 2019-01-10 22:31 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, keescook, casey, sds, linux-security-module

On Mon, 7 Jan 2019, mortonm@chromium.org wrote:

> From: Micah Morton <mortonm@chromium.org>
> 
> This patch provides a general mechanism for passing flags to the
> security_capable LSM hook. It replaces the specific 'audit' flag that is
> used to tell security_capable whether it should log an audit message for
> the given capability check. The reason for generalizing this flag
> passing is so we can add an additional flag that signifies whether
> security_capable is being called by a setid syscall (which is needed by
> the proposed SafeSetID LSM).
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
and next-testing

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v4] LSM: generalize flag passing to security_capable
  2019-01-10 22:31               ` James Morris
@ 2019-01-10 23:03                 ` Micah Morton
  0 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-10 23:03 UTC (permalink / raw)
  To: James Morris
  Cc: Serge E. Hallyn, Kees Cook, Casey Schaufler, Stephen Smalley,
	linux-security-module

Sounds good, thanks!

On Thu, Jan 10, 2019 at 2:31 PM James Morris <jmorris@namei.org> wrote:
>
> On Mon, 7 Jan 2019, mortonm@chromium.org wrote:
>
> > From: Micah Morton <mortonm@chromium.org>
> >
> > This patch provides a general mechanism for passing flags to the
> > security_capable LSM hook. It replaces the specific 'audit' flag that is
> > used to tell security_capable whether it should log an audit message for
> > the given capability check. The reason for generalizing this flag
> > passing is so we can add an additional flag that signifies whether
> > security_capable is being called by a setid syscall (which is needed by
> > the proposed SafeSetID LSM).
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
>
> Applied to
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
> and next-testing
>
> --
> James Morris
> <jmorris@namei.org>
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2018-12-06  0:08                               ` Kees Cook
  2018-12-06 17:51                                 ` Micah Morton
@ 2019-01-11 17:13                                 ` mortonm
  2019-01-15  0:38                                   ` Kees Cook
  2019-01-15  4:07                                   ` James Morris
  1 sibling, 2 replies; 88+ messages in thread
From: mortonm @ 2019-01-11 17:13 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="y", Size: 30105 bytes --]

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since the last patch set: Rebase after commit
a35ce66b801631823fc78c8a78d104f9c0976867 got applied to next-general.
As a result of that commit, we can remove the changes in arch/ and the
setuid_syscall function in lsm.c, since this code no longer needs to do
arch-specific operations to see if security_capable is being called from
a setid syscall. Instead, we add the ns_capable_insetid function and
call it from the setid syscalls in kernel/sys.c (rather than calling the
original ns_capable function), which allows us to signal to the
security_capable hooks whether the hook is being called from within a
setid syscall.
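
For reviewers, here is a rough userspace sketch of the decision that the new
flag makes possible in the LSM's capable hook. It is a model only, not the
patch code: DEMO_CAP_SETUID, DEMO_OPT_INSETID, restricted_uids,
uid_is_restricted() and demo_capable() are hypothetical names, and the
constant values are illustrative assumptions rather than kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; the values are assumptions, not kernel headers. */
#define DEMO_CAP_SETUID		7u
#define DEMO_OPT_INSETID	(1u << 2)

/* Hypothetical policy: UIDs that have at least one whitelist entry. */
static const unsigned int restricted_uids[] = { 123, 456 };

static bool uid_is_restricted(unsigned int uid)
{
	size_t i;

	for (i = 0; i < sizeof(restricted_uids) / sizeof(restricted_uids[0]); i++)
		if (restricted_uids[i] == uid)
			return true;
	return false;
}

/*
 * Return 0 to allow and -1 to deny. A restricted UID may only use
 * CAP_SETUID when the check is flagged as coming from a set*uid() syscall;
 * every other use of the capability (e.g. writing userns uid mappings) is
 * refused. Unrestricted UIDs and other capabilities are not affected.
 */
static int demo_capable(unsigned int uid, unsigned int cap, unsigned int opts)
{
	if (cap == DEMO_CAP_SETUID && uid_is_restricted(uid) &&
	    !(opts & DEMO_OPT_INSETID))
		return -1;
	return 0;
}

/* Usage: the same capability is allowed or denied depending on call site. */
static void demo_capable_examples(void)
{
	assert(demo_capable(123, DEMO_CAP_SETUID, DEMO_OPT_INSETID) == 0);
	assert(demo_capable(123, DEMO_CAP_SETUID, 0) == -1);
	assert(demo_capable(1000, DEMO_CAP_SETUID, 0) == 0);
}
```

In this model the hook only decides whether CAP_SETUID may be used at all;
which target UID is acceptable is a separate decision, made in the
task_fix_setuid hook.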

 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 include/linux/capability.h                  |   5 +
 kernel/capability.c                         |  19 ++
 kernel/sys.c                                |  10 +-
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  12 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 272 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 +++
 security/safesetid/securityfs.c             | 189 ++++++++++++++
 12 files changed, 650 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities: running a program with elevated privileges through file
+capabilities opens up possible security holes, because any user with access to
+the file can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index 9842e21afd4a..a6ba95fbaa9f 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -46,3 +46,4 @@ subdirectories.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/include/linux/capability.h b/include/linux/capability.h
index f640dcbc880c..f4771974dcfc 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -209,6 +209,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
 extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
+extern bool ns_capable_insetid(struct user_namespace *ns, int cap);
 #else
 static inline bool has_capability(struct task_struct *t, int cap)
 {
@@ -240,6 +241,10 @@ static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
 	return true;
 }
+static inline bool ns_capable_insetid(struct user_namespace *ns, int cap)
+{
+	return true;
+}
 #endif /* CONFIG_MULTIUSER */
 extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
diff --git a/kernel/capability.c b/kernel/capability.c
index 7718d7dcadc7..184f544fb7b1 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -417,6 +417,25 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
+/**
+ * ns_capable_insetid - Determine if the current task has a superior capability
+ * in effect, while signalling that this check is being done from within a
+ * setid syscall.
+ * @ns:  The usernamespace we want the capability in
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool ns_capable_insetid(struct user_namespace *ns, int cap)
+{
+	return ns_capable_common(ns, cap, CAP_OPT_INSETID);
+}
+EXPORT_SYMBOL(ns_capable_insetid);
+
 /**
  * capable - Determine if the current task has a superior capability in effect
  * @cap: The capability to be tested for
diff --git a/kernel/sys.c b/kernel/sys.c
index a48cbf1414b8..d31bf3a2afc2 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -516,7 +516,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		new->uid = kruid;
 		if (!uid_eq(old->uid, kruid) &&
 		    !uid_eq(old->euid, kruid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_insetid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -525,7 +525,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		if (!uid_eq(old->uid, keuid) &&
 		    !uid_eq(old->euid, keuid) &&
 		    !uid_eq(old->suid, keuid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_insetid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -584,7 +584,7 @@ long __sys_setuid(uid_t uid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (ns_capable(old->user_ns, CAP_SETUID)) {
+	if (ns_capable_insetid(old->user_ns, CAP_SETUID)) {
 		new->suid = new->uid = kuid;
 		if (!uid_eq(kuid, old->uid)) {
 			retval = set_user(new);
@@ -646,7 +646,7 @@ long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (!ns_capable(old->user_ns, CAP_SETUID)) {
+	if (!ns_capable_insetid(old->user_ns, CAP_SETUID)) {
 		if (ruid != (uid_t) -1        && !uid_eq(kruid, old->uid) &&
 		    !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid))
 			goto error;
@@ -814,7 +814,7 @@ long __sys_setfsuid(uid_t uid)
 
 	if (uid_eq(kuid, old->uid)  || uid_eq(kuid, old->euid)  ||
 	    uid_eq(kuid, old->suid) || uid_eq(kuid, old->fsuid) ||
-	    ns_capable(old->user_ns, CAP_SETUID)) {
+	    ns_capable_insetid(old->user_ns, CAP_SETUID)) {
 		if (!uid_eq(kuid, old->fsuid)) {
 			new->fsuid = kuid;
 			if (security_task_fix_setuid(new, old, LSM_SETID_FS) == 0)
diff --git a/security/Kconfig b/security/Kconfig
index 78dc12b7eeb3..9efc7a5e3280 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
 source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
+source "security/safesetid/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..c598b904938f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..bf89a47ffcc8
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,12 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..a1721ed85544
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,272 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      unsigned int opts)
+{
+	if (cap == CAP_SETUID &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		if (!(opts & CAP_OPT_INSETID)) {
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
+				__kuid_val(cred->uid));
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static void setuid_policy_violation(kuid_t parent, kuid_t child)
+{
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	do_exit(SIGKILL);
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	setuid_policy_violation(parent, child);
+	return -1;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+
+DEFINE_LSM(safesetid_security_init) = {
+	.init = safesetid_security_init,
+};
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..ff5fcf2c1b37
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NULL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
+		flush_safesetid_whitelist_entries();
+		return len;
+	}
+
+	/*
+	 * If we get to here, must be the case that file_entry->type equals
+	 * SAFESETID_WHITELIST_ADD
+	 */
+	ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+							 &child);
+	if (ret)
+		return ret;
+
+	ret = add_safesetid_whitelist_entry(parent, child);
+	if (ret)
+		return ret;
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.20.1.97.g81188d93c3-goog
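
For anyone wanting to experiment with the "<UID>:<UID>" policy format
outside the kernel, here is a minimal userspace sketch of the same
parsing idea. It is not the patch's code: parse_uid() and
parse_policy_line() are hypothetical names, strtoul() stands in for
kstrtol(), and plain unsigned int stands in for kuid_t. As in the patch,
'*parent' may already have been updated when an error is returned.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper: parse one UID starting at 's'.
 * On success, store the value in '*out' and a pointer to the first
 * unparsed character in '*end'.
 */
static int parse_uid(const char *s, char **end, unsigned int *out)
{
	unsigned long val;

	/* strtoul() would accept leading whitespace or a sign; we do not. */
	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, end, 0);
	/* UINT_MAX corresponds to (uid_t)-1, which is never a valid UID. */
	if (errno || val >= UINT_MAX)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}

/* Parse "<UID>:<UID>", tolerating a single trailing newline. */
static int parse_policy_line(const char *line, unsigned int *parent,
			     unsigned int *child)
{
	char *end;
	int ret;

	assert(line && parent && child);

	ret = parse_uid(line, &end, parent);
	if (ret)
		return ret;
	if (*end != ':')
		return -EINVAL;

	ret = parse_uid(end + 1, &end, child);
	if (ret)
		return ret;
	if (*end == '\n')
		end++;
	return *end == '\0' ? 0 : -EINVAL;
}
```

Calling parse_policy_line("1000:2000", &p, &c) is expected to return 0
with p == 1000 and c == 2000, while "1000:", ":5" and "1000" are all
rejected with -EINVAL.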



* Re: [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2019-01-11 17:13                                 ` [PATCH v2] " mortonm
@ 2019-01-15  0:38                                   ` Kees Cook
  2019-01-15 18:04                                     ` [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c mortonm
                                                       ` (2 more replies)
  2019-01-15  4:07                                   ` James Morris
  1 sibling, 3 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-15  0:38 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Fri, Jan 11, 2019 at 9:13 AM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> Changes since the last patch set: Rebase after commit
> a35ce66b801631823fc78c8a78d104f9c0976867 got applied to next-general.
> As a result of that commit, we can remove the changes in arch/ and the
> setuid_syscall function in lsm.c, since this code no longer needs to do
> arch-specific operations to see if security_capable is being called from
> a setid syscall. Instead, we add the ns_capable_insetid function and
> call it from the setid syscalls in kernel/sys.c (rather than calling the
> original ns_capable function), which allows us to signal to the
> security_capable hooks whether the hook is being called from within a
> setid syscall.

I would split this patch into two halves: the "no op" change that
"marks" all the setid call sites in the first patch, then the LSM
itself in the second patch.

> +bool ns_capable_insetid(struct user_namespace *ns, int cap)
> +{
> +       return ns_capable_common(ns, cap, CAP_OPT_INSETID);
> +}
> +EXPORT_SYMBOL(ns_capable_insetid);

Since we have the noaudit helper still, using this one seems fine to
me. I might bikeshed the name to "ns_capable_setid()". If others don't
want a new helper, then it should be fine to just change the callsites
to the direct ns_capable_common(ns, cap, CAP_OPT_INSETID).

> +static int safesetid_security_capable(const struct cred *cred,
> +                                     struct user_namespace *ns,
> +                                     int cap,
> +                                     unsigned int opts)
> +{
> +       if (cap == CAP_SETUID &&
> +           check_setuid_policy_hashtable_key(cred->uid)) {
> +               if (!(opts & CAP_OPT_INSETID)) {
> +                       /*
> +                        * Deny if we're not in a set*uid() syscall to avoid
> +                        * giving powers gated by CAP_SETUID that are related
> +                        * to functionality other than calling set*uid() (e.g.
> +                        * allowing user to set up userns uid mappings).
> +                        */
> +                       pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
> +                               __kuid_val(cred->uid));
> +                       return -1;
> +                }
> +       }
> +       return 0;
> +}

Much cleaner than the per-arch syscall tests. :)
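
To make the decision in that hook easy to poke at, here is a small
userspace model of it. This is only a sketch: the MOCK_* constants and
mock_safesetid_capable() are invented stand-ins, and the hashtable
lookup is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel constants; treat the values as illustrative. */
#define MOCK_CAP_SETUID		7
#define MOCK_CAP_OPT_INSETID	(1U << 2)

/*
 * Model of the capable hook: 'uid_has_policy' plays the role of
 * check_setuid_policy_hashtable_key(cred->uid).
 */
static int mock_safesetid_capable(bool uid_has_policy, int cap,
				  unsigned int opts)
{
	assert(cap >= 0);

	/* Other capabilities and unrestricted UIDs are not our business. */
	if (cap != MOCK_CAP_SETUID || !uid_has_policy)
		return 0;

	/* Restricted UID using CAP_SETUID outside a set*uid() syscall. */
	if (!(opts & MOCK_CAP_OPT_INSETID))
		return -1;

	/* Inside set*uid(): allow here, task_fix_setuid vets the target. */
	return 0;
}
```

The point of the model is that a restricted UID only keeps CAP_SETUID
for the set*uid() path; every other use of the capability is denied.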

> +static void setuid_policy_violation(kuid_t parent, kuid_t child)
> +{
> +       pr_warn("UID transition (%d -> %d) blocked",
> +               __kuid_val(parent),
> +               __kuid_val(child));
> +        /*
> +         * Kill this process to avoid potential security vulnerabilities
> +         * that could arise from a missing whitelist entry preventing a
> +         * privileged process from dropping to a lesser-privileged one.
> +         */
> +        do_exit(SIGKILL);

I think I asked earlier if this should be an unblockable signal raise
instead of a do_exit(). I don't remember if that got answered?

> +}
> +
> +static int check_uid_transition(kuid_t parent, kuid_t child)
> +{
> +       if (check_setuid_policy_hashtable_key_value(parent, child))
> +               return 0;
> +       setuid_policy_violation(parent, child);
> +       return -1;
> +}

Any reason not to just collapse setuid_policy_violation() into this function?
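
Roughly what I have in mind, as a userspace sketch rather than kernel
code: the whitelist contents and transition_whitelisted() are made up
for illustration, a linear array stands in for the RCU hashtable, and
the "kill the task" step is left as a comment since that part is still
under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct transition {
	unsigned int parent;
	unsigned int child;
};

/* Example policy, invented for this sketch. */
static const struct transition whitelist[] = {
	{ 1000, 2000 },
	{ 1000, 2001 },
};

static bool transition_whitelisted(unsigned int parent, unsigned int child)
{
	size_t i;

	for (i = 0; i < sizeof(whitelist) / sizeof(whitelist[0]); i++) {
		if (whitelist[i].parent == parent &&
		    whitelist[i].child == child)
			return true;
	}
	return false;
}

/* Lookup, warning and error return in one place. */
static int check_uid_transition(unsigned int parent, unsigned int child)
{
	if (transition_whitelisted(parent, child))
		return 0;

	fprintf(stderr, "UID transition (%u -> %u) blocked\n", parent, child);
	/* The patch under review additionally terminates the task here. */
	return -1;
}
```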

> +
> +/*
> + * Check whether there is either an exception for user under old cred struct to
> + * set*uid to user under new cred struct, or the UID transition is allowed (by
> + * Linux set*uid rules) even without CAP_SETUID.
> + */
> +static int safesetid_task_fix_setuid(struct cred *new,
> +                                    const struct cred *old,
> +                                    int flags)
> +{
> +
> +       /* Do nothing if there are no setuid restrictions for this UID. */
> +       if (!check_setuid_policy_hashtable_key(old->uid))
> +               return 0;
> +
> +       switch (flags) {
> +       case LSM_SETID_RE:
> +               /*
> +                * Users for which setuid restrictions exist can only set the
> +                * real UID to the real UID or the effective UID, unless an
> +                * explicit whitelist policy allows the transition.
> +                */
> +               if (!uid_eq(old->uid, new->uid) &&
> +                       !uid_eq(old->euid, new->uid)) {
> +                       return check_uid_transition(old->uid, new->uid);
> +               }
> +               /*
> +                * Users for which setuid restrictions exist can only set the
> +                * effective UID to the real UID, the effective UID, or the
> +                * saved set-UID, unless an explicit whitelist policy allows
> +                * the transition.
> +                */
> +               if (!uid_eq(old->uid, new->euid) &&
> +                       !uid_eq(old->euid, new->euid) &&
> +                       !uid_eq(old->suid, new->euid)) {
> +                       return check_uid_transition(old->euid, new->euid);
> +               }
> +               break;
> +       case LSM_SETID_ID:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * real UID or saved set-UID unless an explicit whitelist
> +                * policy allows the transition.
> +                */
> +               if (!uid_eq(old->uid, new->uid))
> +                       return check_uid_transition(old->uid, new->uid);
> +               if (!uid_eq(old->suid, new->suid))
> +                       return check_uid_transition(old->suid, new->suid);
> +               break;
> +       case LSM_SETID_RES:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * real UID, effective UID, or saved set-UID to anything but
> +                * one of: the current real UID, the current effective UID or
> +                * the current saved set-user-ID unless an explicit whitelist
> +                * policy allows the transition.
> +                */
> +               if (!uid_eq(new->uid, old->uid) &&
> +                       !uid_eq(new->uid, old->euid) &&
> +                       !uid_eq(new->uid, old->suid)) {
> +                       return check_uid_transition(old->uid, new->uid);
> +               }
> +               if (!uid_eq(new->euid, old->uid) &&
> +                       !uid_eq(new->euid, old->euid) &&
> +                       !uid_eq(new->euid, old->suid)) {
> +                       return check_uid_transition(old->euid, new->euid);
> +               }
> +               if (!uid_eq(new->suid, old->uid) &&
> +                       !uid_eq(new->suid, old->euid) &&
> +                       !uid_eq(new->suid, old->suid)) {
> +                       return check_uid_transition(old->suid, new->suid);
> +               }
> +               break;
> +       case LSM_SETID_FS:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * filesystem UID to anything but one of: the current real UID,
> +                * the current effective UID or the current saved set-UID
> +                * unless an explicit whitelist policy allows the transition.
> +                */
> +               if (!uid_eq(new->fsuid, old->uid)  &&
> +                       !uid_eq(new->fsuid, old->euid)  &&
> +                       !uid_eq(new->fsuid, old->suid) &&
> +                       !uid_eq(new->fsuid, old->fsuid)) {
> +                       return check_uid_transition(old->fsuid, new->fsuid);
> +               }
> +               break;
> +       }
> +       return 0;
> +}
> +
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> +{
> +       struct entry *new;
> +
> +       /* Return if entry already exists */
> +       if (check_setuid_policy_hashtable_key_value(parent, child))
> +               return 0;
> +
> +       new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> +       if (!new)
> +               return -ENOMEM;
> +       new->parent_kuid = __kuid_val(parent);
> +       new->child_kuid = __kuid_val(child);
> +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +       hash_add_rcu(safesetid_whitelist_hashtable,
> +                    &new->next,
> +                    __kuid_val(parent));
> +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +       return 0;
> +}
> +
> +void flush_safesetid_whitelist_entries(void)
> +{
> +       struct entry *entry;
> +       struct hlist_node *hlist_node;
> +       unsigned int bkt_loop_cursor;
> +       HLIST_HEAD(free_list);
> +
> +       /*
> +        * Could probably use hash_for_each_rcu here instead, but this should
> +        * be fine as well.
> +        */
> +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +       hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> +                          hlist_node, entry, next) {
> +               hash_del_rcu(&entry->next);
> +               hlist_add_head(&entry->dlist, &free_list);
> +       }
> +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +       synchronize_rcu();
> +       hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> +               hlist_del(&entry->dlist);
> +               kfree(entry);
> +       }
> +}
> +
> +static struct security_hook_list safesetid_security_hooks[] = {
> +       LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> +       LSM_HOOK_INIT(capable, safesetid_security_capable)
> +};
> +
> +static int __init safesetid_security_init(void)
> +{
> +       security_add_hooks(safesetid_security_hooks,
> +                          ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> +
> +       return 0;
> +}
> +
> +DEFINE_LSM(safesetid_security_init) = {
> +       .init = safesetid_security_init,
> +};
> diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> new file mode 100644
> index 000000000000..bf78af9bf314
> --- /dev/null
> +++ b/security/safesetid/lsm.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#ifndef _SAFESETID_H
> +#define _SAFESETID_H
> +
> +#include <linux/types.h>
> +
> +/* Function type. */
> +enum safesetid_whitelist_file_write_type {
> +       SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> +       SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> +};
> +
> +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> +
> +void flush_safesetid_whitelist_entries(void);
> +
> +#endif /* _SAFESETID_H */
> diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> new file mode 100644
> index 000000000000..ff5fcf2c1b37
> --- /dev/null
> +++ b/security/safesetid/securityfs.c
> @@ -0,0 +1,189 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include <linux/security.h>
> +#include <linux/cred.h>
> +
> +#include "lsm.h"
> +
> +static struct dentry *safesetid_policy_dir;
> +
> +struct safesetid_file_entry {
> +       const char *name;
> +       enum safesetid_whitelist_file_write_type type;
> +       struct dentry *dentry;
> +};
> +
> +static struct safesetid_file_entry safesetid_files[] = {
> +       {.name = "add_whitelist_policy",
> +        .type = SAFESETID_WHITELIST_ADD},
> +       {.name = "flush_whitelist_policies",
> +        .type = SAFESETID_WHITELIST_FLUSH},
> +};
> +
> +/*
> + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> + * variables pointed to by 'parent' and 'child' will get updated but this
> + * function will return an error.
> + */
> +static int parse_safesetid_whitelist_policy(const char __user *buf,
> +                                           size_t len,
> +                                           kuid_t *parent,
> +                                           kuid_t *child)
> +{
> +       char *kern_buf;
> +       char *parent_buf;
> +       char *child_buf;
> +       const char separator[] = ":";
> +       int ret;
> +       size_t first_substring_length;
> +       long parsed_parent;
> +       long parsed_child;
> +
> +       /* Duplicate string from user memory and NULL-terminate */
> +       kern_buf = memdup_user_nul(buf, len);
> +       if (IS_ERR(kern_buf))
> +               return PTR_ERR(kern_buf);
> +
> +       /*
> +        * Format of |buf| string should be <UID>:<UID>.
> +        * Find location of ":" in kern_buf (copied from |buf|).
> +        */
> +       first_substring_length = strcspn(kern_buf, separator);
> +       if (first_substring_length == 0 || first_substring_length == len) {
> +               ret = -EINVAL;
> +               goto free_kern;
> +       }
> +
> +       parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> +       if (!parent_buf) {
> +               ret = -ENOMEM;
> +               goto free_kern;
> +       }
> +
> +       ret = kstrtol(parent_buf, 0, &parsed_parent);
> +       if (ret)
> +               goto free_both;
> +
> +       child_buf = kern_buf + first_substring_length + 1;
> +       ret = kstrtol(child_buf, 0, &parsed_child);
> +       if (ret)
> +               goto free_both;
> +
> +       *parent = make_kuid(current_user_ns(), parsed_parent);
> +       if (!uid_valid(*parent)) {
> +               ret = -EINVAL;
> +               goto free_both;
> +       }
> +
> +       *child = make_kuid(current_user_ns(), parsed_child);
> +       if (!uid_valid(*child)) {
> +               ret = -EINVAL;
> +               goto free_both;
> +       }
> +
> +free_both:
> +       kfree(parent_buf);
> +free_kern:
> +       kfree(kern_buf);
> +       return ret;
> +}
> +
> +static ssize_t safesetid_file_write(struct file *file,
> +                                   const char __user *buf,
> +                                   size_t len,
> +                                   loff_t *ppos)
> +{
> +       struct safesetid_file_entry *file_entry =
> +               file->f_inode->i_private;
> +       kuid_t parent;
> +       kuid_t child;
> +       int ret;
> +
> +       if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))

Maybe CAP_MAC_ADMIN instead of (the overloaded) CAP_SYS_ADMIN?

> +               return -EPERM;
> +
> +       if (*ppos != 0)
> +               return -EINVAL;
> +
> +       if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
> +               flush_safesetid_whitelist_entries();
> +               return len;
> +       }
> +
> +       /*
> +        * If we get to here, must be the case that file_entry->type equals
> +        * SAFESETID_WHITELIST_ADD

It seems a bit silly with only two options here, but it'll change for
gids, yes? How about just building a switch() around file_entry->type
instead and avoid needing to refactor this later?
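
Something along these lines, sketched in userspace with invented names
(the mock_* functions and MOCK_* enumerators are not from the patch, and
the add path is reduced to a counter):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum mock_write_type {
	MOCK_WHITELIST_ADD,
	MOCK_WHITELIST_FLUSH,
};

static unsigned int mock_entries;	/* number of whitelist entries */

static int mock_add_entry(void)
{
	mock_entries++;
	return 0;
}

static void mock_flush_entries(void)
{
	mock_entries = 0;
}

static long mock_file_write(enum mock_write_type type, size_t len)
{
	int ret;

	switch (type) {
	case MOCK_WHITELIST_FLUSH:
		mock_flush_entries();
		break;
	case MOCK_WHITELIST_ADD:
		ret = mock_add_entry();
		if (ret)
			return ret;
		break;
	default:
		/* Unknown file type: reject instead of treating it as "add". */
		return -EINVAL;
	}

	/* Report the whole buffer as consumed so the caller stops writing. */
	return (long)len;
}
```

With this shape, adding gid file types later is one more case label,
and an unexpected type fails loudly instead of silently taking the add
path.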

> +        */
> +       ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> +                                                        &child);
> +       if (ret)
> +               return ret;
> +
> +       ret = add_safesetid_whitelist_entry(parent, child);
> +       if (ret)
> +               return ret;
> +
> +       /* Return len on success so caller won't keep trying to write */
> +       return len;
> +}
> +
> +static const struct file_operations safesetid_file_fops = {
> +       .write = safesetid_file_write,
> +};
> +
> +static void safesetid_shutdown_securityfs(void)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +               struct safesetid_file_entry *entry =
> +                       &safesetid_files[i];
> +               securityfs_remove(entry->dentry);
> +               entry->dentry = NULL;
> +       }
> +
> +       securityfs_remove(safesetid_policy_dir);
> +       safesetid_policy_dir = NULL;
> +}
> +
> +static int __init safesetid_init_securityfs(void)
> +{
> +       int i;
> +       int ret;
> +
> +       safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> +       if (IS_ERR(safesetid_policy_dir)) {
> +               ret = PTR_ERR(safesetid_policy_dir);
> +               goto error;
> +       }
> +
> +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +               struct safesetid_file_entry *entry =
> +                       &safesetid_files[i];
> +               entry->dentry = securityfs_create_file(
> +                       entry->name, 0200, safesetid_policy_dir,
> +                       entry, &safesetid_file_fops);
> +               if (IS_ERR(entry->dentry)) {
> +                       ret = PTR_ERR(entry->dentry);
> +                       goto error;
> +               }
> +       }
> +
> +       return 0;
> +
> +error:
> +       safesetid_shutdown_securityfs();
> +       return ret;
> +}
> +fs_initcall(safesetid_init_securityfs);
> --
> 2.20.1.97.g81188d93c3-goog
>

But overall, it looks good to me. :)

-- 
Kees Cook


* Re: [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2019-01-11 17:13                                 ` [PATCH v2] " mortonm
  2019-01-15  0:38                                   ` Kees Cook
@ 2019-01-15  4:07                                   ` James Morris
  2019-01-15 19:42                                     ` Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: James Morris @ 2019-01-15  4:07 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, keescook, casey, sds, linux-security-module

On Fri, 11 Jan 2019, mortonm@chromium.org wrote:

> From: Micah Morton <mortonm@chromium.org>
> 
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
> 

I can't recall if this has been mentioned, but is this code already 
shipping in any distros or products, and are any distros planning on 
enabling this feature?



- James
-- 
James Morris
<jmorris@namei.org>



* [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c
  2019-01-15  0:38                                   ` Kees Cook
@ 2019-01-15 18:04                                     ` mortonm
  2019-01-15 19:34                                       ` Kees Cook
  2019-01-15 18:04                                     ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls mortonm
  2019-01-15 19:49                                     ` [PATCH v2] " Micah Morton
  2 siblings, 1 reply; 88+ messages in thread
From: mortonm @ 2019-01-15 18:04 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This change ensures that the set*uid family of syscalls in kernel/sys.c
(setreuid, setuid, setresuid, setfsuid) all call ns_capable_common with
the CAP_OPT_INSETID flag, so capability checks in the security_capable
hook can know whether they are being called from within a set*uid
syscall. This change is a no-op by itself, but is needed for the
proposed SafeSetID LSM.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
These changes used to be part of the main SafeSetID LSM patch set.
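
To illustrate why this is a no-op on its own, here is a small userspace
model of the plumbing. It is only a sketch: every mock_* name is
invented, the namespace argument is dropped, and the hook simply records
the options it was handed and allows the request.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_CAP_OPT_NONE	0x0U
#define MOCK_CAP_OPT_INSETID	(1U << 2)

static unsigned int mock_last_opts;

/* Stand-in for the security_capable hook: records opts, always allows. */
static int mock_security_capable(int cap, unsigned int opts)
{
	(void)cap;
	mock_last_opts = opts;
	return 0;
}

static bool mock_ns_capable_common(int cap, unsigned int opts)
{
	return mock_security_capable(cap, opts) == 0;
}

/* Existing entry point: no extra information for the hook. */
static bool mock_ns_capable(int cap)
{
	return mock_ns_capable_common(cap, MOCK_CAP_OPT_NONE);
}

/* New entry point: same check, but flagged as coming from set*uid(). */
static bool mock_ns_capable_setid(int cap)
{
	return mock_ns_capable_common(cap, MOCK_CAP_OPT_INSETID);
}
```

Both entry points return the same answer as long as no hook looks at the
flag; only an LSM that inspects the options can tell the two call paths
apart.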

 include/linux/capability.h |  5 +++++
 kernel/capability.c        | 19 +++++++++++++++++++
 kernel/sys.c               | 10 +++++-----
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index f640dcbc880c..c3f9a4d558a0 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -209,6 +209,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
 extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
+extern bool ns_capable_setid(struct user_namespace *ns, int cap);
 #else
 static inline bool has_capability(struct task_struct *t, int cap)
 {
@@ -240,6 +241,10 @@ static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
 	return true;
 }
+static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
+{
+	return true;
+}
 #endif /* CONFIG_MULTIUSER */
 extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
diff --git a/kernel/capability.c b/kernel/capability.c
index 7718d7dcadc7..e0734ace5bc2 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -417,6 +417,25 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
+/**
+ * ns_capable_setid - Determine if the current task has a superior capability
+ * in effect, while signalling that this check is being done from within a
+ * setid syscall.
+ * @ns:  The usernamespace we want the capability in
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool ns_capable_setid(struct user_namespace *ns, int cap)
+{
+	return ns_capable_common(ns, cap, CAP_OPT_INSETID);
+}
+EXPORT_SYMBOL(ns_capable_setid);
+
 /**
  * capable - Determine if the current task has a superior capability in effect
  * @cap: The capability to be tested for
diff --git a/kernel/sys.c b/kernel/sys.c
index a48cbf1414b8..a98061c1a124 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -516,7 +516,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		new->uid = kruid;
 		if (!uid_eq(old->uid, kruid) &&
 		    !uid_eq(old->euid, kruid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_setid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -525,7 +525,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		if (!uid_eq(old->uid, keuid) &&
 		    !uid_eq(old->euid, keuid) &&
 		    !uid_eq(old->suid, keuid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_setid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -584,7 +584,7 @@ long __sys_setuid(uid_t uid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (ns_capable(old->user_ns, CAP_SETUID)) {
+	if (ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		new->suid = new->uid = kuid;
 		if (!uid_eq(kuid, old->uid)) {
 			retval = set_user(new);
@@ -646,7 +646,7 @@ long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (!ns_capable(old->user_ns, CAP_SETUID)) {
+	if (!ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		if (ruid != (uid_t) -1        && !uid_eq(kruid, old->uid) &&
 		    !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid))
 			goto error;
@@ -814,7 +814,7 @@ long __sys_setfsuid(uid_t uid)
 
 	if (uid_eq(kuid, old->uid)  || uid_eq(kuid, old->euid)  ||
 	    uid_eq(kuid, old->suid) || uid_eq(kuid, old->fsuid) ||
-	    ns_capable(old->user_ns, CAP_SETUID)) {
+	    ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		if (!uid_eq(kuid, old->fsuid)) {
 			new->fsuid = kuid;
 			if (security_task_fix_setuid(new, old, LSM_SETID_FS) == 0)
-- 
2.20.1.97.g81188d93c3-goog



* [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15  0:38                                   ` Kees Cook
  2019-01-15 18:04                                     ` [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c mortonm
@ 2019-01-15 18:04                                     ` mortonm
  2019-01-15 19:44                                       ` Kees Cook
  2019-01-15 19:49                                     ` [PATCH v2] " Micah Morton
  2 siblings, 1 reply; 88+ messages in thread
From: mortonm @ 2019-01-15 18:04 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since the last patch set: Pulled out the "no-op" changes that
mark setid call sites in kernel/sys.c into a separate patch, and made
other small mods proposed by Kees Cook. NOTE: this patch is still using
do_exit(SIGKILL) to kill the process in check_uid_transition in lsm.c.
This may need to change, pending further discussion.
 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  12 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 266 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 +++
 security/safesetid/securityfs.c             | 185 ++++++++++++++
 9 files changed, 611 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In the absence of file capabilities, processes spawned on a Linux system that
+need to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, because running a program with elevated privileges through file
+capabilities opens up possible security holes: any user with access to the
+file can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
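[Annotation, not part of the patch] As a rough illustration of the
"Directions for use" text above, the following userspace C sketch writes one
'<UID>:<UID>' policy string. The helper name safesetid_write_policy() is
hypothetical, and the file path is supplied by the caller because the
securityfs mount point is system-specific (/sys/kernel/security is a common
location, but that is an assumption, not something this patch specifies).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write one "<parent>:<child>" policy string to the file at 'path'.
 * Hypothetical helper for illustration; returns 0 on success or a
 * negative errno value on failure.
 */
int safesetid_write_policy(const char *path, unsigned int parent,
			   unsigned int child)
{
	char buf[32];
	ssize_t written;
	int len, fd, err;

	assert(path != NULL);

	len = snprintf(buf, sizeof(buf), "%u:%u", parent, child);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* Send the whole policy in one write() call. */
	written = write(fd, buf, (size_t)len);
	err = errno;
	close(fd);

	if (written < 0)
		return -err;
	return written == len ? 0 : -EIO;
}
```

A suitably privileged caller might pass the path of the add_whitelist_policy
file under its securityfs mount together with the UIDs 123 and 456. The
string is sent in a single write() because the handler in securityfs.c parses
each write as one complete policy.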
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index 9842e21afd4a..a6ba95fbaa9f 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -46,3 +46,4 @@ subdirectories.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/security/Kconfig b/security/Kconfig
index 78dc12b7eeb3..9efc7a5e3280 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
 source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
+source "security/safesetid/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..c598b904938f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..bf89a47ffcc8
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,12 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..aa7bd3323751
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      unsigned int opts)
+{
+	if (cap == CAP_SETUID &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		if (!(opts & CAP_OPT_INSETID)) {
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
+				__kuid_val(cred->uid));
+			return -EPERM;
+		}
+	}
+	return 0;
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	do_exit(SIGKILL);
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+		    !uid_eq(old->euid, new->uid) &&
+		    check_uid_transition(old->uid, new->uid))
+			return -EACCES;
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+		    !uid_eq(new->uid, old->euid) &&
+		    !uid_eq(new->uid, old->suid) &&
+		    check_uid_transition(old->uid, new->uid))
+			return -EACCES;
+		if (!uid_eq(new->euid, old->uid) &&
+		    !uid_eq(new->euid, old->euid) &&
+		    !uid_eq(new->euid, old->suid) &&
+		    check_uid_transition(old->euid, new->euid))
+			return -EACCES;
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+
+DEFINE_LSM(safesetid_security_init) = {
+	.init = safesetid_security_init,
+};
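[Annotation, not part of the patch] The two hooks registered in lsm.c can be
summarized with a small userspace model. This is only a sketch: it replaces
the kernel hash table with a plain array, uses stand-in constants
(MODEL_CAP_SETUID, MODEL_OPT_INSETID) instead of the kernel definitions, and
every name in it is hypothetical. It models the intended rule for setresuid(),
namely that each of the three new IDs must either match one of the caller's
current IDs or be covered by a whitelist entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values for this model only. */
#define MODEL_CAP_SETUID	7
#define MODEL_OPT_INSETID	0x4

struct uid_policy { unsigned int parent; unsigned int child; };
struct uid_triple { unsigned int uid; unsigned int euid; unsigned int suid; };

/* True if any policy entry names 'parent', i.e. the UID is restricted. */
bool policy_has_parent(const struct uid_policy *p, size_t n,
		       unsigned int parent)
{
	for (size_t i = 0; i < n; i++)
		if (p[i].parent == parent)
			return true;
	return false;
}

/* True if the exact (parent, child) pair is whitelisted. */
bool policy_allows(const struct uid_policy *p, size_t n,
		   unsigned int parent, unsigned int child)
{
	for (size_t i = 0; i < n; i++)
		if (p[i].parent == parent && p[i].child == child)
			return true;
	return false;
}

/*
 * Models the capable hook: a restricted UID may use CAP_SETUID only
 * from inside a set*uid() syscall. Returns true when the check is denied.
 */
bool capable_denied(const struct uid_policy *p, size_t n,
		    unsigned int uid, int cap, unsigned int opts)
{
	return cap == MODEL_CAP_SETUID &&
	       policy_has_parent(p, n, uid) &&
	       !(opts & MODEL_OPT_INSETID);
}

static bool is_current_id(const struct uid_triple *old, unsigned int id)
{
	return id == old->uid || id == old->euid || id == old->suid;
}

/* Models the LSM_SETID_RES case: all three new IDs are checked. */
bool setresuid_permitted(const struct uid_policy *p, size_t n,
			 const struct uid_triple *old,
			 const struct uid_triple *want)
{
	assert(old && want);

	/* No policy for this UID: the LSM imposes no restriction. */
	if (!policy_has_parent(p, n, old->uid))
		return true;

	if (!is_current_id(old, want->uid) &&
	    !policy_allows(p, n, old->uid, want->uid))
		return false;
	if (!is_current_id(old, want->euid) &&
	    !policy_allows(p, n, old->euid, want->euid))
		return false;
	if (!is_current_id(old, want->suid) &&
	    !policy_allows(p, n, old->suid, want->suid))
		return false;
	return true;
}
```

With a single policy entry 100:200, a task running as (100, 100, 100) may
move to (200, 200, 200) in this model but not to (200, 0, 0), since the
effective and saved IDs are checked independently of the real ID.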
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..c3ce7b63b4af
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,185 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NULL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	switch (file_entry->type) {
+	case SAFESETID_WHITELIST_FLUSH:
+		flush_safesetid_whitelist_entries();
+		break;
+	case SAFESETID_WHITELIST_ADD:
+		ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+						       &child);
+		if (ret)
+			return ret;
+		ret = add_safesetid_whitelist_entry(parent, child);
+		if (ret)
+			return ret;
+	}
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
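[Annotation, not part of the patch] For readers who want to experiment with
the '<UID>:<UID>' policy format outside the kernel, here is a userspace
approximation of what parse_safesetid_whitelist_policy() accepts. It is a
sketch under stated assumptions: it uses strtoul() rather than the kernel's
kstrtol(), so edge cases may differ, and the names parse_one_uid() and
parse_uid_pair() are hypothetical. It rejects 4294967295 because (uid_t)-1 is
the invalid UID.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse one UID (base auto-detected) that must end exactly at 'stop'. */
static int parse_one_uid(const char *s, const char *stop, unsigned int *out)
{
	char *end;
	unsigned long val;

	/* Reject empty fields, signs, and leading whitespace. */
	if (!isdigit((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno || end != stop || val >= UINT_MAX)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}

/*
 * Parse "<parent>:<child>", tolerating one trailing newline.
 * As in the kernel function, '*parent' may already have been updated
 * when an error is returned for the child field.
 */
int parse_uid_pair(const char *s, unsigned int *parent, unsigned int *child)
{
	const char *sep, *stop;
	int ret;

	assert(s && parent && child);

	sep = strchr(s, ':');
	if (!sep)
		return -EINVAL;

	/* The child field ends at end of string or at a final newline. */
	stop = sep + 1 + strcspn(sep + 1, "\n");
	if (stop[0] == '\n' && stop[1] != '\0')
		return -EINVAL;

	ret = parse_one_uid(s, sep, parent);
	if (ret)
		return ret;
	return parse_one_uid(sep + 1, stop, child);
}
```

For example, "123:456" and "123:456\n" both parse to the pair (123, 456) in
this sketch, while "123", ":456", "123:" and "-1:5" are rejected with -EINVAL.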
-- 
2.20.1.97.g81188d93c3-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c
  2019-01-15 18:04                                     ` [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c mortonm
@ 2019-01-15 19:34                                       ` Kees Cook
  0 siblings, 0 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-15 19:34 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 15, 2019 at 10:04 AM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This change ensures that the set*uid family of syscalls in kernel/sys.c
> (setreuid, setuid, setresuid, setfsuid) all call ns_capable_common with
> the CAP_OPT_INSETID flag, so capability checks in the security_capable
> hook can know whether they are being called from within a set*uid
> syscall. This change is a no-op by itself, but is needed for the
> proposed SafeSetID LSM.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
> These changes used to be part of the main SafeSetID LSM patch set.
>
>  include/linux/capability.h |  5 +++++
>  kernel/capability.c        | 19 +++++++++++++++++++
>  kernel/sys.c               | 10 +++++-----
>  3 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index f640dcbc880c..c3f9a4d558a0 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -209,6 +209,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
>  extern bool capable(int cap);
>  extern bool ns_capable(struct user_namespace *ns, int cap);
>  extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
> +extern bool ns_capable_setid(struct user_namespace *ns, int cap);
>  #else
>  static inline bool has_capability(struct task_struct *t, int cap)
>  {
> @@ -240,6 +241,10 @@ static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
>         return true;
>  }
> +static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
> +{
> +       return true;
> +}
>  #endif /* CONFIG_MULTIUSER */
>  extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
>  extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 7718d7dcadc7..e0734ace5bc2 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -417,6 +417,25 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> +/**
> + * ns_capable_setid - Determine if the current task has a superior capability
> + * in effect, while signalling that this check is being done from within a
> + * setid syscall.
> + * @ns:  The usernamespace we want the capability in
> + * @cap: The capability to be tested for
> + *
> + * Return true if the current task has the given superior capability currently
> + * available for use, false if not.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool ns_capable_setid(struct user_namespace *ns, int cap)
> +{
> +       return ns_capable_common(ns, cap, CAP_OPT_INSETID);
> +}
> +EXPORT_SYMBOL(ns_capable_setid);
> +
>  /**
>   * capable - Determine if the current task has a superior capability in effect
>   * @cap: The capability to be tested for
> diff --git a/kernel/sys.c b/kernel/sys.c
> index a48cbf1414b8..a98061c1a124 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -516,7 +516,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
>                 new->uid = kruid;
>                 if (!uid_eq(old->uid, kruid) &&
>                     !uid_eq(old->euid, kruid) &&
> -                   !ns_capable(old->user_ns, CAP_SETUID))
> +                   !ns_capable_setid(old->user_ns, CAP_SETUID))
>                         goto error;
>         }
>
> @@ -525,7 +525,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
>                 if (!uid_eq(old->uid, keuid) &&
>                     !uid_eq(old->euid, keuid) &&
>                     !uid_eq(old->suid, keuid) &&
> -                   !ns_capable(old->user_ns, CAP_SETUID))
> +                   !ns_capable_setid(old->user_ns, CAP_SETUID))
>                         goto error;
>         }
>
> @@ -584,7 +584,7 @@ long __sys_setuid(uid_t uid)
>         old = current_cred();
>
>         retval = -EPERM;
> -       if (ns_capable(old->user_ns, CAP_SETUID)) {
> +       if (ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 new->suid = new->uid = kuid;
>                 if (!uid_eq(kuid, old->uid)) {
>                         retval = set_user(new);
> @@ -646,7 +646,7 @@ long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
>         old = current_cred();
>
>         retval = -EPERM;
> -       if (!ns_capable(old->user_ns, CAP_SETUID)) {
> +       if (!ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 if (ruid != (uid_t) -1        && !uid_eq(kruid, old->uid) &&
>                     !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid))
>                         goto error;
> @@ -814,7 +814,7 @@ long __sys_setfsuid(uid_t uid)
>
>         if (uid_eq(kuid, old->uid)  || uid_eq(kuid, old->euid)  ||
>             uid_eq(kuid, old->suid) || uid_eq(kuid, old->fsuid) ||
> -           ns_capable(old->user_ns, CAP_SETUID)) {
> +           ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 if (!uid_eq(kuid, old->fsuid)) {
>                         new->fsuid = kuid;
>                         if (security_task_fix_setuid(new, old, LSM_SETID_FS) == 0)
> --
> 2.20.1.97.g81188d93c3-goog
>


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2019-01-15  4:07                                   ` James Morris
@ 2019-01-15 19:42                                     ` Micah Morton
  0 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-15 19:42 UTC (permalink / raw)
  To: James Morris
  Cc: Serge E. Hallyn, Kees Cook, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Mon, Jan 14, 2019 at 8:07 PM James Morris <jmorris@namei.org> wrote:
>
> On Fri, 11 Jan 2019, mortonm@chromium.org wrote:
>
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
>
> I can't recall if this has been mentioned, but is this code already
> shipping in any distros or products, and are any distros planning on
> enabling this feature?

It is shipping on ChromeOS (the hooking is done in our own LSM that we
maintain, but everything else is the same, and we have integration
tests for it). We use it to lock down a handful of system daemons that
need to switch to certain, predetermined UIDs on the system (but not
root). There look to be a few use cases for this LSM in Android as
well, which is a possibility in the future.

>
>
>
> - James
> --
> James Morris
> <jmorris@namei.org>
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 18:04                                     ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls mortonm
@ 2019-01-15 19:44                                       ` Kees Cook
  2019-01-15 21:50                                         ` [PATCH v4 " mortonm
  2019-01-15 21:58                                         ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
  0 siblings, 2 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-15 19:44 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 15, 2019 at 10:04 AM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> Changes since the last patch set: Pulled out the "no-op" changes that
> mark setid call sites in kernel/sys.c into a separate patch, and made
> other small mods proposed by Kees Cook. NOTE: this patch is still using
> do_exit(SIGKILL) to kill the process in check_uid_transition in lsm.c.
> This may need to change, pending further discussion.
>  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
>  Documentation/admin-guide/LSM/index.rst     |   1 +
>  security/Kconfig                            |   1 +
>  security/Makefile                           |   2 +
>  security/safesetid/Kconfig                  |  12 +
>  security/safesetid/Makefile                 |   7 +
>  security/safesetid/lsm.c                    | 266 ++++++++++++++++++++
>  security/safesetid/lsm.h                    |  30 +++
>  security/safesetid/securityfs.c             | 185 ++++++++++++++
>  9 files changed, 611 insertions(+)
>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>  create mode 100644 security/safesetid/Kconfig
>  create mode 100644 security/safesetid/Makefile
>  create mode 100644 security/safesetid/lsm.c
>  create mode 100644 security/safesetid/lsm.h
>  create mode 100644 security/safesetid/securityfs.c
>
> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> new file mode 100644
> index 000000000000..ffb64be67f7a
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> @@ -0,0 +1,107 @@
> +=========
> +SafeSetID
> +=========
> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> +UID/GID transitions from a given UID/GID to only those approved by a
> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> +allowing a user to set up user namespace UID mappings.
> +
> +
> +Background
> +==========
> +In absence of file capabilities, processes spawned on a Linux system that need
> +to switch to a different user must be spawned with CAP_SETUID privileges.
> +CAP_SETUID is granted to programs running as root or those running as a non-root
> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> +often preferable to use Linux runtime capabilities rather than file
> +capabilities, since using file capabilities to run a program with elevated
> +privileges opens up possible security holes since any user with access to the
> +file can exec() that program to gain the elevated privileges.
> +
> +While it is possible to implement a tree of processes by giving full
> +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> +tree of processes under non-root user(s) in the first place. Specifically,
> +since CAP_SETUID allows changing to any user on the system, including the root
> +user, it is an overpowered capability for what is needed in this scenario,
> +especially since programs often only call setuid() to drop privileges to a
> +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> +generally feasible way in Linux to restrict the potential UIDs that a user can
> +switch to through setuid() beyond allowing a switch to any user on the system.
> +This SafeSetID LSM seeks to provide a solution for restricting setid
> +capabilities in such a way.
> +
> +The main use case for this LSM is to allow a non-root program to transition to
> +other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
> +program would still need CAP_SETUID to do any kind of transition, but the
> +additional restrictions imposed by this LSM would mean it is a "safer" version
> +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> +namespace). The higher level goal is to allow for uid-based sandboxing of system
> +services without having to give out CAP_SETUID all over the place just so that
> +non-root programs can drop to even-lesser-privileged uids. This is especially
> +relevant when one non-root daemon on the system should be allowed to spawn other
> +processes as different uids, but it's undesirable to give the daemon a
> +basically-root-equivalent CAP_SETUID.
> +
> +
> +Other Approaches Considered
> +===========================
> +
> +Solve this problem in userspace
> +-------------------------------
> +For candidate applications that would like to have restricted setid capabilities
> +as implemented in this LSM, an alternative option would be to simply take away
> +setid capabilities from the application completely and refactor the process
> +spawning semantics in the application (e.g. by using a privileged helper program
> +to do process spawning and UID/GID transitions). Unfortunately, there are a
> +number of semantics around process spawning that would be affected by this, such
> +as fork() calls where the program doesn’t immediately call exec() after the
> +fork(), parent processes specifying custom environment variables or command line
> +args for spawned child processes, or inheritance of file handles across a
> +fork()/exec(). Because of this, a solution that uses a privileged helper in
> +userspace would likely be less appealing to incorporate into existing projects
> +that rely on certain process-spawning semantics in Linux.
> +
> +Use user namespaces
> +-------------------
> +Another possible approach would be to run a given process tree in its own user
> +namespace and give programs in the tree setid capabilities. In this way,
> +programs in the tree could change to any desired UID/GID in the context of their
> +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> +initial system user namespace, effectively preventing privilege escalation.
> +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> +without pairing them with other namespace types, which is not always an option.
> +Linux checks for capabilities based off of the user namespace that “owns” some
> +entity. For example, Linux has the notion that network namespaces are owned by
> +the user namespace in which they were created. A consequence of this is that
> +capability checks for access to a given network namespace are done by checking
> +whether a task has the given capability in the context of the user namespace
> +that owns the network namespace -- not necessarily the user namespace under
> +which the given task runs. Therefore spawning a process in a new user namespace
> +effectively prevents it from accessing the network namespace owned by the
> +initial namespace. This is a deal-breaker for any application that expects to
> +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> +configurations. Using user namespaces in isolation causes problems regarding
> +other system interactions, including use of pid namespaces and device creation.
> +
> +Use an existing LSM
> +-------------------
> +None of the other in-tree LSMs have the capability to gate setid transitions, or
> +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> +"Since setuid only affects the current process, and since the SELinux controls
> +are not based on the Linux identity attributes, SELinux does not need to control
> +this operation."
> +
> +
> +Directions for use
> +==================
> +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> +applicable restriction policy is in place. Policies are configured through
> +securityfs by writing to the safesetid/add_whitelist_policy and
> +safesetid/flush_whitelist_policies files at the location where securityfs is
> +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> +numbers, such as '123:456'. To flush the policies, any write to the file is
> +sufficient. Again, configuring a policy for a UID will prevent that UID from
> +obtaining auxiliary setid privileges, such as allowing a user to set up user
> +namespace UID mappings.
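
For anyone trying the interface described above, a minimal userspace
sketch of adding one policy entry might look like the following. It
assumes securityfs is mounted at the conventional /sys/kernel/security,
and the helper names format_policy() and add_policy() are made up for
illustration (they are not part of the patch). The whole "<UID>:<UID>"
string goes out in a single write(), since the write handler in this
patch rejects a nonzero file offset.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed mount point; adjust if securityfs is mounted elsewhere. */
#define SAFESETID_ADD_PATH \
	"/sys/kernel/security/safesetid/add_whitelist_policy"

/* Hypothetical helper: build "<parent>:<child>"; returns length or -1. */
int format_policy(char *buf, size_t size, unsigned int parent,
		  unsigned int child)
{
	int n = snprintf(buf, size, "%u:%u", parent, child);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: send one policy entry to the given file. */
int add_policy(const char *path, unsigned int parent, unsigned int child)
{
	char line[32];
	int n = format_policy(line, sizeof(line), parent, child);
	int fd;
	ssize_t written;

	if (n < 0)
		return -1;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	/* One write() call, starting at offset 0. */
	written = write(fd, line, (size_t)n);
	close(fd);
	return written == n ? 0 : -1;
}
```

With that, add_policy(SAFESETID_ADD_PATH, 123, 456) would request the
'123:456' entry from the documentation. Per the patch, the caller needs
CAP_MAC_ADMIN for the write to be accepted.
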
> diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> index 9842e21afd4a..a6ba95fbaa9f 100644
> --- a/Documentation/admin-guide/LSM/index.rst
> +++ b/Documentation/admin-guide/LSM/index.rst
> @@ -46,3 +46,4 @@ subdirectories.
>     Smack
>     tomoyo
>     Yama
> +   SafeSetID
> diff --git a/security/Kconfig b/security/Kconfig
> index 78dc12b7eeb3..9efc7a5e3280 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
>  source "security/apparmor/Kconfig"
>  source "security/loadpin/Kconfig"
>  source "security/yama/Kconfig"
> +source "security/safesetid/Kconfig"
>
>  source "security/integrity/Kconfig"
>
> diff --git a/security/Makefile b/security/Makefile
> index 4d2d3782ddef..c598b904938f 100644
> --- a/security/Makefile
> +++ b/security/Makefile
> @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
>  subdir-$(CONFIG_SECURITY_APPARMOR)     += apparmor
>  subdir-$(CONFIG_SECURITY_YAMA)         += yama
>  subdir-$(CONFIG_SECURITY_LOADPIN)      += loadpin
> +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
>
>  # always enable default capabilities
>  obj-y                                  += commoncap.o
> @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)         += tomoyo/
>  obj-$(CONFIG_SECURITY_APPARMOR)                += apparmor/
>  obj-$(CONFIG_SECURITY_YAMA)            += yama/
>  obj-$(CONFIG_SECURITY_LOADPIN)         += loadpin/
> +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
>  obj-$(CONFIG_CGROUP_DEVICE)            += device_cgroup.o

Given the refactoring of the LSM enabling logic, you'll need to do
some minor merging with the linux-next tree to get this to apply to
security-next. That would make James's life easier, I think, though
maybe James can speak to that, since I'm not sure how the trees are
split right now.

>
>  # Object integrity file lists
> diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> new file mode 100644
> index 000000000000..bf89a47ffcc8
> --- /dev/null
> +++ b/security/safesetid/Kconfig
> @@ -0,0 +1,12 @@
> +config SECURITY_SAFESETID
> +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> +        default n
> +        help
> +          SafeSetID is an LSM module that gates the setid family of syscalls to
> +          restrict UID/GID transitions from a given UID/GID to only those
> +          approved by a system-wide whitelist. These restrictions also prohibit
> +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> +          UID mappings.
> +
> +          If you are unsure how to answer this question, answer N.
> diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> new file mode 100644
> index 000000000000..6b0660321164
> --- /dev/null
> +++ b/security/safesetid/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the safesetid LSM.
> +#
> +
> +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> +safesetid-y := lsm.o securityfs.o
> diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> new file mode 100644
> index 000000000000..aa7bd3323751
> --- /dev/null
> +++ b/security/safesetid/lsm.c
> @@ -0,0 +1,266 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +
> +#define pr_fmt(fmt) "SafeSetID: " fmt
> +
> +#include <asm/syscall.h>
> +#include <linux/hashtable.h>
> +#include <linux/lsm_hooks.h>
> +#include <linux/module.h>
> +#include <linux/ptrace.h>
> +#include <linux/sched/task_stack.h>
> +#include <linux/security.h>
> +
> +#define NUM_BITS 8 /* 256 buckets in hash table */
> +
> +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> +
> +/*
> + * Hash table entry to store safesetid policy signifying that 'parent' user
> + * can setid to 'child' user.
> + */
> +struct entry {
> +       struct hlist_node next;
> +       struct hlist_node dlist; /* for deletion cleanup */
> +       uint64_t parent_kuid;
> +       uint64_t child_kuid;
> +};
> +
> +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> +
> +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> +{
> +       struct entry *entry;
> +
> +       rcu_read_lock();
> +       hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +                                  entry, next, __kuid_val(parent)) {
> +               if (entry->parent_kuid == __kuid_val(parent)) {
> +                       rcu_read_unlock();
> +                       return true;
> +               }
> +       }
> +       rcu_read_unlock();
> +
> +       return false;
> +}
> +
> +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> +                                                   kuid_t child)
> +{
> +       struct entry *entry;
> +
> +       rcu_read_lock();
> +       hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +                                  entry, next, __kuid_val(parent)) {
> +               if (entry->parent_kuid == __kuid_val(parent) &&
> +                   entry->child_kuid == __kuid_val(child)) {
> +                       rcu_read_unlock();
> +                       return true;
> +               }
> +       }
> +       rcu_read_unlock();
> +
> +       return false;
> +}
> +
> +static int safesetid_security_capable(const struct cred *cred,
> +                                     struct user_namespace *ns,
> +                                     int cap,
> +                                     unsigned int opts)
> +{
> +       if (cap == CAP_SETUID &&
> +           check_setuid_policy_hashtable_key(cred->uid)) {
> +               if (!(opts & CAP_OPT_INSETID)) {
> +                       /*
> +                        * Deny if we're not in a set*uid() syscall to avoid
> +                        * giving powers gated by CAP_SETUID that are related
> +                        * to functionality other than calling set*uid() (e.g.
> +                        * allowing user to set up userns uid mappings).
> +                        */
> +                       pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
> +                               __kuid_val(cred->uid));
> +                       return -1;
> +                }
> +       }
> +       return 0;
> +}
> +
> +static int check_uid_transition(kuid_t parent, kuid_t child)
> +{
> +       if (check_setuid_policy_hashtable_key_value(parent, child))
> +               return 0;
> +       pr_warn("UID transition (%d -> %d) blocked",
> +               __kuid_val(parent),
> +               __kuid_val(child));
> +        /*
> +         * Kill this process to avoid potential security vulnerabilities
> +         * that could arise from a missing whitelist entry preventing a
> +         * privileged process from dropping to a lesser-privileged one.
> +         */
> +        do_exit(SIGKILL);
> +}

This needs double-checking, but I think you want this, to avoid
missing various process clean-up steps (like performing a core dump if
desired, etc):

force_sig(SIGKILL, current);
return -EACCES;

But please double-check that a rejected setuid() syscall never
completes and the process does die with SIGKILL.
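
One way to do that double-check from userspace is a small probe along
these lines. This is only a sketch: probe_setuid(), the enum, and the
child exit codes are names I made up here. The parent forks, the child
attempts setuid(), and the parent inspects the wait status to see
whether the child was killed by SIGKILL or got to see setuid() return.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for the probe. */
enum probe_result {
	PROBE_KILLED,		/* child died from SIGKILL */
	PROBE_SETUID_OK,	/* setuid() returned 0 in the child */
	PROBE_SETUID_FAILED,	/* setuid() returned an error in the child */
	PROBE_ERROR		/* fork/wait problem or unexpected status */
};

/* Exit codes the child uses to report what setuid() did. */
#define CHILD_SETUID_OK		0
#define CHILD_SETUID_FAILED	1

enum probe_result classify_wait_status(int status)
{
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
		return PROBE_KILLED;
	if (WIFEXITED(status) && WEXITSTATUS(status) == CHILD_SETUID_OK)
		return PROBE_SETUID_OK;
	if (WIFEXITED(status) && WEXITSTATUS(status) == CHILD_SETUID_FAILED)
		return PROBE_SETUID_FAILED;
	return PROBE_ERROR;
}

/* Fork a child, attempt setuid(target) there, and report the outcome. */
enum probe_result probe_setuid(uid_t target)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return PROBE_ERROR;
	if (pid == 0)
		_exit(setuid(target) == 0 ? CHILD_SETUID_OK
					  : CHILD_SETUID_FAILED);
	if (waitpid(pid, &status, 0) != pid)
		return PROBE_ERROR;
	return classify_wait_status(status);
}
```

Run as a restricted UID that holds CAP_SETUID, with a target that is
not whitelisted, the expectation would be PROBE_KILLED. Seeing
PROBE_SETUID_FAILED instead would mean the syscall returned to
userspace before the process died.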

> +
> +/*
> + * Check whether there is either an exception for user under old cred struct to
> + * set*uid to user under new cred struct, or the UID transition is allowed (by
> + * Linux set*uid rules) even without CAP_SETUID.
> + */
> +static int safesetid_task_fix_setuid(struct cred *new,
> +                                    const struct cred *old,
> +                                    int flags)
> +{
> +
> +       /* Do nothing if there are no setuid restrictions for this UID. */
> +       if (!check_setuid_policy_hashtable_key(old->uid))
> +               return 0;
> +
> +       switch (flags) {
> +       case LSM_SETID_RE:
> +               /*
> +                * Users for which setuid restrictions exist can only set the
> +                * real UID to the real UID or the effective UID, unless an
> +                * explicit whitelist policy allows the transition.
> +                */
> +               if (!uid_eq(old->uid, new->uid) &&
> +                       !uid_eq(old->euid, new->uid)) {
> +                       return check_uid_transition(old->uid, new->uid);
> +               }
> +               /*
> +                * Users for which setuid restrictions exist can only set the
> +                * effective UID to the real UID, the effective UID, or the
> +                * saved set-UID, unless an explicit whitelist policy allows
> +                * the transition.
> +                */
> +               if (!uid_eq(old->uid, new->euid) &&
> +                       !uid_eq(old->euid, new->euid) &&
> +                       !uid_eq(old->suid, new->euid)) {
> +                       return check_uid_transition(old->euid, new->euid);
> +               }
> +               break;
> +       case LSM_SETID_ID:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * real UID or saved set-UID unless an explicit whitelist
> +                * policy allows the transition.
> +                */
> +               if (!uid_eq(old->uid, new->uid))
> +                       return check_uid_transition(old->uid, new->uid);
> +               if (!uid_eq(old->suid, new->suid))
> +                       return check_uid_transition(old->suid, new->suid);
> +               break;
> +       case LSM_SETID_RES:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * real UID, effective UID, or saved set-UID to anything but
> +                * one of: the current real UID, the current effective UID or
> +                * the current saved set-user-ID unless an explicit whitelist
> +                * policy allows the transition.
> +                */
> +               if (!uid_eq(new->uid, old->uid) &&
> +                       !uid_eq(new->uid, old->euid) &&
> +                       !uid_eq(new->uid, old->suid)) {
> +                       return check_uid_transition(old->uid, new->uid);
> +               }
> +               if (!uid_eq(new->euid, old->uid) &&
> +                       !uid_eq(new->euid, old->euid) &&
> +                       !uid_eq(new->euid, old->suid)) {
> +                       return check_uid_transition(old->euid, new->euid);
> +               }
> +               if (!uid_eq(new->suid, old->uid) &&
> +                       !uid_eq(new->suid, old->euid) &&
> +                       !uid_eq(new->suid, old->suid)) {
> +                       return check_uid_transition(old->suid, new->suid);
> +               }
> +               break;
> +       case LSM_SETID_FS:
> +               /*
> +                * Users for which setuid restrictions exist cannot change the
> +                * filesystem UID to anything but one of: the current real UID,
> +                * the current effective UID or the current saved set-UID
> +                * unless an explicit whitelist policy allows the transition.
> +                */
> +               if (!uid_eq(new->fsuid, old->uid)  &&
> +                       !uid_eq(new->fsuid, old->euid)  &&
> +                       !uid_eq(new->fsuid, old->suid) &&
> +                       !uid_eq(new->fsuid, old->fsuid)) {
> +                       return check_uid_transition(old->fsuid, new->fsuid);
> +               }
> +               break;

As a robustness measure can you add a default case here that will
"fail closed"? Something like:

default:
    WARN_ONCE(1, "Unknown setid state %d\n", flags);
    force_sig(SIGKILL, current);
    return -EINVAL;
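
To illustrate the shape I mean, here is a userspace model (not kernel
code) of a fail-closed switch. Everything prefixed model_ is invented
for this sketch, the flag values are stand-ins for LSM_SETID_*, and the
per-case checks are deliberately simplified to a single comparison. The
point is only that an unrecognized flag value ends in an error rather
than falling out of the switch to "return 0".

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for the kernel's LSM_SETID_* flags; values are illustrative. */
enum model_setid {
	MODEL_SETID_ID = 1,
	MODEL_SETID_RE,
	MODEL_SETID_RES,
	MODEL_SETID_FS
};

struct model_cred {
	unsigned int uid, euid, suid, fsuid;
};

/* Hypothetical policy callback: true if parent may switch to child. */
typedef bool (*model_policy_fn)(unsigned int parent, unsigned int child);

/* Example policy for the sketch: only 123 -> 456 is allowed. */
bool model_example_policy(unsigned int parent, unsigned int child)
{
	return parent == 123 && child == 456;
}

static int model_check(model_policy_fn allowed, unsigned int parent,
		       unsigned int child)
{
	return allowed(parent, child) ? 0 : -EACCES;
}

int model_fix_setuid(const struct model_cred *next,
		     const struct model_cred *old, int flags,
		     model_policy_fn allowed)
{
	switch (flags) {
	case MODEL_SETID_ID:
	case MODEL_SETID_RE:
	case MODEL_SETID_RES:
		/* Simplified: only the real UID is examined here. */
		if (next->uid != old->uid)
			return model_check(allowed, old->uid, next->uid);
		return 0;
	case MODEL_SETID_FS:
		if (next->fsuid != old->fsuid)
			return model_check(allowed, old->fsuid, next->fsuid);
		return 0;
	default:
		/* Unknown flag value: refuse instead of allowing. */
		return -EINVAL;
	}
}
```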

> +       }
> +       return 0;
> +}
> +
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> +{
> +       struct entry *new;
> +
> +       /* Return if entry already exists */
> +       if (check_setuid_policy_hashtable_key_value(parent, child))
> +               return 0;
> +
> +       new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> +       if (!new)
> +               return -ENOMEM;
> +       new->parent_kuid = __kuid_val(parent);
> +       new->child_kuid = __kuid_val(child);
> +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +       hash_add_rcu(safesetid_whitelist_hashtable,
> +                    &new->next,
> +                    __kuid_val(parent));
> +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +       return 0;
> +}
> +
> +void flush_safesetid_whitelist_entries(void)
> +{
> +       struct entry *entry;
> +       struct hlist_node *hlist_node;
> +       unsigned int bkt_loop_cursor;
> +       HLIST_HEAD(free_list);
> +
> +       /*
> +        * Could probably use hash_for_each_rcu here instead, but this should
> +        * be fine as well.
> +        */
> +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +       hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> +                          hlist_node, entry, next) {
> +               hash_del_rcu(&entry->next);
> +               hlist_add_head(&entry->dlist, &free_list);
> +       }
> +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +       synchronize_rcu();
> +       hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> +               hlist_del(&entry->dlist);
> +               kfree(entry);
> +       }
> +}
> +
> +static struct security_hook_list safesetid_security_hooks[] = {
> +       LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> +       LSM_HOOK_INIT(capable, safesetid_security_capable)
> +};
> +
> +static int __init safesetid_security_init(void)
> +{
> +       security_add_hooks(safesetid_security_hooks,
> +                          ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> +
> +       return 0;
> +}
> +
> +DEFINE_LSM(safesetid_security_init) = {
> +       .init = safesetid_security_init,
> +};
> diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> new file mode 100644
> index 000000000000..bf78af9bf314
> --- /dev/null
> +++ b/security/safesetid/lsm.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#ifndef _SAFESETID_H
> +#define _SAFESETID_H
> +
> +#include <linux/types.h>
> +
> +/* Function type. */
> +enum safesetid_whitelist_file_write_type {
> +       SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> +       SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> +};
> +
> +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> +
> +void flush_safesetid_whitelist_entries(void);
> +
> +#endif /* _SAFESETID_H */
> diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> new file mode 100644
> index 000000000000..c3ce7b63b4af
> --- /dev/null
> +++ b/security/safesetid/securityfs.c
> @@ -0,0 +1,185 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include <linux/security.h>
> +#include <linux/cred.h>
> +
> +#include "lsm.h"
> +
> +static struct dentry *safesetid_policy_dir;
> +
> +struct safesetid_file_entry {
> +       const char *name;
> +       enum safesetid_whitelist_file_write_type type;
> +       struct dentry *dentry;
> +};
> +
> +static struct safesetid_file_entry safesetid_files[] = {
> +       {.name = "add_whitelist_policy",
> +        .type = SAFESETID_WHITELIST_ADD},
> +       {.name = "flush_whitelist_policies",
> +        .type = SAFESETID_WHITELIST_FLUSH},
> +};
> +
> +/*
> + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> + * variables pointed to by 'parent' and 'child' will get updated but this
> + * function will return an error.
> + */
> +static int parse_safesetid_whitelist_policy(const char __user *buf,
> +                                           size_t len,
> +                                           kuid_t *parent,
> +                                           kuid_t *child)
> +{
> +       char *kern_buf;
> +       char *parent_buf;
> +       char *child_buf;
> +       const char separator[] = ":";
> +       int ret;
> +       size_t first_substring_length;
> +       long parsed_parent;
> +       long parsed_child;
> +
> +       /* Duplicate string from user memory and NULL-terminate */
> +       kern_buf = memdup_user_nul(buf, len);
> +       if (IS_ERR(kern_buf))
> +               return PTR_ERR(kern_buf);
> +
> +       /*
> +        * Format of |buf| string should be <UID>:<UID>.
> +        * Find location of ":" in kern_buf (copied from |buf|).
> +        */
> +       first_substring_length = strcspn(kern_buf, separator);
> +       if (first_substring_length == 0 || first_substring_length == len) {
> +               ret = -EINVAL;
> +               goto free_kern;
> +       }
> +
> +       parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> +       if (!parent_buf) {
> +               ret = -ENOMEM;
> +               goto free_kern;
> +       }
> +
> +       ret = kstrtol(parent_buf, 0, &parsed_parent);
> +       if (ret)
> +               goto free_both;
> +
> +       child_buf = kern_buf + first_substring_length + 1;
> +       ret = kstrtol(child_buf, 0, &parsed_child);
> +       if (ret)
> +               goto free_both;
> +
> +       *parent = make_kuid(current_user_ns(), parsed_parent);
> +       if (!uid_valid(*parent)) {
> +               ret = -EINVAL;
> +               goto free_both;
> +       }
> +
> +       *child = make_kuid(current_user_ns(), parsed_child);
> +       if (!uid_valid(*child)) {
> +               ret = -EINVAL;
> +               goto free_both;
> +       }
> +
> +free_both:
> +       kfree(parent_buf);
> +free_kern:
> +       kfree(kern_buf);
> +       return ret;
> +}
> +
> +static ssize_t safesetid_file_write(struct file *file,
> +                                   const char __user *buf,
> +                                   size_t len,
> +                                   loff_t *ppos)
> +{
> +       struct safesetid_file_entry *file_entry =
> +               file->f_inode->i_private;
> +       kuid_t parent;
> +       kuid_t child;
> +       int ret;
> +
> +       if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> +               return -EPERM;
> +
> +       if (*ppos != 0)
> +               return -EINVAL;
> +
> +        switch (file_entry->type) {
> +        case SAFESETID_WHITELIST_FLUSH:
> +                flush_safesetid_whitelist_entries();

Missing break? As written, a flush falls through into the add case.

> +        case SAFESETID_WHITELIST_ADD:
> +                ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> +                                                                 &child);
> +                if (ret)
> +                        return ret;
> +
> +                ret = add_safesetid_whitelist_entry(parent, child);
> +                if (ret)
> +                        return ret;

And add a default here too, something like:

default:
    WARN_ONCE(1, "Unknown securityfs file %d!?\n", file_entry->type);
    break;
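
Putting both comments on this switch together, the dispatch could be
modelled in userspace roughly as below. The model_ names and the
counters are invented for the sketch; they stand in for the real flush
and add calls. Here the default case returns -EINVAL rather than just
breaking, which is one possible choice and not necessarily what the
final patch should do.

```c
#include <errno.h>

/* Stand-ins for enum safesetid_whitelist_file_write_type. */
enum model_file_type {
	MODEL_FILE_ADD,
	MODEL_FILE_FLUSH
};

/* Counters that record which path ran, in place of the real work. */
struct model_state {
	int adds;
	int flushes;
};

int model_file_write(struct model_state *st, int type)
{
	switch (type) {
	case MODEL_FILE_FLUSH:
		st->flushes++;
		break;	/* without this, a flush would also run the add path */
	case MODEL_FILE_ADD:
		st->adds++;
		break;
	default:
		/* Unknown file type: report an error to the writer. */
		return -EINVAL;
	}
	return 0;
}
```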

> +        }
> +
> +        /* Return len on success so caller won't keep trying to write */
> +        return len;
> +}
> +
> +static const struct file_operations safesetid_file_fops = {
> +       .write = safesetid_file_write,
> +};
> +
> +static void safesetid_shutdown_securityfs(void)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +               struct safesetid_file_entry *entry =
> +                       &safesetid_files[i];
> +               securityfs_remove(entry->dentry);
> +               entry->dentry = NULL;
> +       }
> +
> +       securityfs_remove(safesetid_policy_dir);
> +       safesetid_policy_dir = NULL;
> +}
> +
> +static int __init safesetid_init_securityfs(void)
> +{
> +       int i;
> +       int ret;
> +
> +       safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> +       if (!safesetid_policy_dir) {
> +               ret = PTR_ERR(safesetid_policy_dir);
> +               goto error;
> +       }
> +
> +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +               struct safesetid_file_entry *entry =
> +                       &safesetid_files[i];
> +               entry->dentry = securityfs_create_file(
> +                       entry->name, 0200, safesetid_policy_dir,
> +                       entry, &safesetid_file_fops);
> +               if (IS_ERR(entry->dentry)) {
> +                       ret = PTR_ERR(entry->dentry);
> +                       goto error;
> +               }
> +       }
> +
> +       return 0;
> +
> +error:
> +       safesetid_shutdown_securityfs();
> +       return ret;
> +}
> +fs_initcall(safesetid_init_securityfs);
> --
> 2.20.1.97.g81188d93c3-goog
>

And if I didn't say it before, thank you for the docs on this too! :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2019-01-15  0:38                                   ` Kees Cook
  2019-01-15 18:04                                     ` [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c mortonm
  2019-01-15 18:04                                     ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls mortonm
@ 2019-01-15 19:49                                     ` Micah Morton
  2019-01-15 19:53                                       ` Kees Cook
  2 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-15 19:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Mon, Jan 14, 2019 at 4:38 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Jan 11, 2019 at 9:13 AM <mortonm@chromium.org> wrote:
> >
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> > Changes since the last patch set: Rebase after commit
> > a35ce66b801631823fc78c8a78d104f9c0976867 got applied to next-general.
> > As a result of that commit, we can remove the changes in arch/ and the
> > setuid_syscall function in lsm.c, since this code no longer needs to do
> > arch-specific operations to see if security_capable is being called from
> > a setid syscall. Instead, we add the ns_capable_insetid function and
> > call it from the setid syscalls in kernel/sys.c (rather than calling the
> > original ns_capable function), which allows us to signal to the
> > security_capable hooks whether the hook is being called from within a
> > setid syscall.
>
> I would split this patch into two halves: the "no op" change that
> "marks" all the setid call sites in the first patch, then the LSM
> itself in the second patch.

Done.

>
> > +bool ns_capable_insetid(struct user_namespace *ns, int cap)
> > +{
> > +       return ns_capable_common(ns, cap, CAP_OPT_INSETID);
> > +}
> > +EXPORT_SYMBOL(ns_capable_insetid);
>
> Since we have the noaudit helper still, using this one seems fine to
> me. I might bikeshed the name to "ns_capable_setid()". If others don't
> want a new helper, then it should be fine to just change the callsites
> to the direct ns_capable_common(ns, cap, CAP_OPT_INSETID).

Done.
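
For readers following along, the effect of passing the "in setid" flag
down to the capable hook can be modelled in userspace like this. It is
a sketch, not the kernel code: model_capable() and the MODEL_ constants
are invented names, and the flag bit is an illustrative stand-in for
CAP_OPT_INSETID. The patch returns -1 on denial; the model uses -EPERM,
which has the same value on Linux.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_CAP_SETUID	7	/* same number as CAP_SETUID */
#define MODEL_OPT_INSETID	0x4	/* illustrative flag bit */

/*
 * Deny CAP_SETUID to a restricted UID unless the check originates from
 * a set*uid() syscall, which the caller signals through opts.
 */
int model_capable(int cap, unsigned int opts, bool uid_restricted)
{
	if (cap == MODEL_CAP_SETUID && uid_restricted &&
	    !(opts & MODEL_OPT_INSETID))
		return -EPERM;
	return 0;
}
```

So a restricted UID asking for CAP_SETUID outside a setid syscall (for
example while writing a userns uid_map) is refused, while the same
request made from a setid call site goes on to the transition check in
task_fix_setuid.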

>
> > +static int safesetid_security_capable(const struct cred *cred,
> > +                                     struct user_namespace *ns,
> > +                                     int cap,
> > +                                     unsigned int opts)
> > +{
> > +       if (cap == CAP_SETUID &&
> > +           check_setuid_policy_hashtable_key(cred->uid)) {
> > +               if (!(opts & CAP_OPT_INSETID)) {
> > +                       /*
> > +                        * Deny if we're not in a set*uid() syscall to avoid
> > +                        * giving powers gated by CAP_SETUID that are related
> > +                        * to functionality other than calling set*uid() (e.g.
> > +                        * allowing user to set up userns uid mappings).
> > +                        */
> > +                       pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
> > +                               __kuid_val(cred->uid));
> > +                       return -1;
> > +                }
> > +       }
> > +       return 0;
> > +}
>
> Much cleaner than the per-arch syscall tests. :)
>
> > +static void setuid_policy_violation(kuid_t parent, kuid_t child)
> > +{
> > +       pr_warn("UID transition (%d -> %d) blocked",
> > +               __kuid_val(parent),
> > +               __kuid_val(child));
> > +        /*
> > +         * Kill this process to avoid potential security vulnerabilities
> > +         * that could arise from a missing whitelist entry preventing a
> > +         * privileged process from dropping to a lesser-privileged one.
> > +         */
> > +        do_exit(SIGKILL);
>
> I think I asked earlier if this should be an unblockable signal raise
> instead of a do_exit(). I don't remember if that got answered?

Could you elaborate on this a bit, or share a pointer to some code/doc
that explains the difference? I don't recall this point being raised
before (might have missed it), and I'm no expert on the different
approaches to killing a process in this way.

>
> > +}
> > +
> > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > +{
> > +       if (check_setuid_policy_hashtable_key_value(parent, child))
> > +               return 0;
> > +       setuid_policy_violation(parent, child);
> > +       return -1;
> > +}
>
> Any reason not to just collapse setuid_policy_violation() into this function?

Done.

>
> > +
> > +/*
> > + * Check whether there is either an exception for user under old cred struct to
> > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > + * Linux set*uid rules) even without CAP_SETUID.
> > + */
> > +static int safesetid_task_fix_setuid(struct cred *new,
> > +                                    const struct cred *old,
> > +                                    int flags)
> > +{
> > +
> > +       /* Do nothing if there are no setuid restrictions for this UID. */
> > +       if (!check_setuid_policy_hashtable_key(old->uid))
> > +               return 0;
> > +
> > +       switch (flags) {
> > +       case LSM_SETID_RE:
> > +               /*
> > +                * Users for which setuid restrictions exist can only set the
> > +                * real UID to the real UID or the effective UID, unless an
> > +                * explicit whitelist policy allows the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->uid) &&
> > +                       !uid_eq(old->euid, new->uid)) {
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               }
> > +               /*
> > +                * Users for which setuid restrictions exist can only set the
> > +                * effective UID to the real UID, the effective UID, or the
> > +                * saved set-UID, unless an explicit whitelist policy allows
> > +                * the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->euid) &&
> > +                       !uid_eq(old->euid, new->euid) &&
> > +                       !uid_eq(old->suid, new->euid)) {
> > +                       return check_uid_transition(old->euid, new->euid);
> > +               }
> > +               break;
> > +       case LSM_SETID_ID:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * real UID or saved set-UID unless an explicit whitelist
> > +                * policy allows the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->uid))
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               if (!uid_eq(old->suid, new->suid))
> > +                       return check_uid_transition(old->suid, new->suid);
> > +               break;
> > +       case LSM_SETID_RES:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * real UID, effective UID, or saved set-UID to anything but
> > +                * one of: the current real UID, the current effective UID or
> > +                * the current saved set-user-ID unless an explicit whitelist
> > +                * policy allows the transition.
> > +                */
> > +               if (!uid_eq(new->uid, old->uid) &&
> > +                       !uid_eq(new->uid, old->euid) &&
> > +                       !uid_eq(new->uid, old->suid)) {
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               }
> > +               if (!uid_eq(new->euid, old->uid) &&
> > +                       !uid_eq(new->euid, old->euid) &&
> > +                       !uid_eq(new->euid, old->suid)) {
> > +                       return check_uid_transition(old->euid, new->euid);
> > +               }
> > +               if (!uid_eq(new->suid, old->uid) &&
> > +                       !uid_eq(new->suid, old->euid) &&
> > +                       !uid_eq(new->suid, old->suid)) {
> > +                       return check_uid_transition(old->suid, new->suid);
> > +               }
> > +               break;
> > +       case LSM_SETID_FS:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * filesystem UID to anything but one of: the current real UID,
> > +                * the current effective UID or the current saved set-UID
> > +                * unless an explicit whitelist policy allows the transition.
> > +                */
> > +               if (!uid_eq(new->fsuid, old->uid)  &&
> > +                       !uid_eq(new->fsuid, old->euid)  &&
> > +                       !uid_eq(new->fsuid, old->suid) &&
> > +                       !uid_eq(new->fsuid, old->fsuid)) {
> > +                       return check_uid_transition(old->fsuid, new->fsuid);
> > +               }
> > +               break;
> > +       }
> > +       return 0;
> > +}
> > +
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > +{
> > +       struct entry *new;
> > +
> > +       /* Return if entry already exists */
> > +       if (check_setuid_policy_hashtable_key_value(parent, child))
> > +               return 0;
> > +
> > +       new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > +       if (!new)
> > +               return -ENOMEM;
> > +       new->parent_kuid = __kuid_val(parent);
> > +       new->child_kuid = __kuid_val(child);
> > +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +       hash_add_rcu(safesetid_whitelist_hashtable,
> > +                    &new->next,
> > +                    __kuid_val(parent));
> > +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +       return 0;
> > +}
> > +
> > +void flush_safesetid_whitelist_entries(void)
> > +{
> > +       struct entry *entry;
> > +       struct hlist_node *hlist_node;
> > +       unsigned int bkt_loop_cursor;
> > +       HLIST_HEAD(free_list);
> > +
> > +       /*
> > +        * Could probably use hash_for_each_rcu here instead, but this should
> > +        * be fine as well.
> > +        */
> > +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +       hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > +                          hlist_node, entry, next) {
> > +               hash_del_rcu(&entry->next);
> > +               hlist_add_head(&entry->dlist, &free_list);
> > +       }
> > +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +       synchronize_rcu();
> > +       hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> > +               hlist_del(&entry->dlist);
> > +               kfree(entry);
> > +       }
> > +}
> > +
> > +static struct security_hook_list safesetid_security_hooks[] = {
> > +       LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > +       LSM_HOOK_INIT(capable, safesetid_security_capable)
> > +};
> > +
> > +static int __init safesetid_security_init(void)
> > +{
> > +       security_add_hooks(safesetid_security_hooks,
> > +                          ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > +
> > +       return 0;
> > +}
> > +
> > +DEFINE_LSM(safesetid_security_init) = {
> > +       .init = safesetid_security_init,
> > +};
> > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > new file mode 100644
> > index 000000000000..bf78af9bf314
> > --- /dev/null
> > +++ b/security/safesetid/lsm.h
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#ifndef _SAFESETID_H
> > +#define _SAFESETID_H
> > +
> > +#include <linux/types.h>
> > +
> > +/* Function type. */
> > +enum safesetid_whitelist_file_write_type {
> > +       SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > +       SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > +};
> > +
> > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > +
> > +void flush_safesetid_whitelist_entries(void);
> > +
> > +#endif /* _SAFESETID_H */
> > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > new file mode 100644
> > index 000000000000..ff5fcf2c1b37
> > --- /dev/null
> > +++ b/security/safesetid/securityfs.c
> > @@ -0,0 +1,189 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#include <linux/security.h>
> > +#include <linux/cred.h>
> > +
> > +#include "lsm.h"
> > +
> > +static struct dentry *safesetid_policy_dir;
> > +
> > +struct safesetid_file_entry {
> > +       const char *name;
> > +       enum safesetid_whitelist_file_write_type type;
> > +       struct dentry *dentry;
> > +};
> > +
> > +static struct safesetid_file_entry safesetid_files[] = {
> > +       {.name = "add_whitelist_policy",
> > +        .type = SAFESETID_WHITELIST_ADD},
> > +       {.name = "flush_whitelist_policies",
> > +        .type = SAFESETID_WHITELIST_FLUSH},
> > +};
> > +
> > +/*
> > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > + * variables pointed to by 'parent' and 'child' will get updated but this
> > + * function will return an error.
> > + */
> > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > +                                           size_t len,
> > +                                           kuid_t *parent,
> > +                                           kuid_t *child)
> > +{
> > +       char *kern_buf;
> > +       char *parent_buf;
> > +       char *child_buf;
> > +       const char separator[] = ":";
> > +       int ret;
> > +       size_t first_substring_length;
> > +       long parsed_parent;
> > +       long parsed_child;
> > +
> > +       /* Duplicate string from user memory and NULL-terminate */
> > +       kern_buf = memdup_user_nul(buf, len);
> > +       if (IS_ERR(kern_buf))
> > +               return PTR_ERR(kern_buf);
> > +
> > +       /*
> > +        * Format of |buf| string should be <UID>:<UID>.
> > +        * Find location of ":" in kern_buf (copied from |buf|).
> > +        */
> > +       first_substring_length = strcspn(kern_buf, separator);
> > +       if (first_substring_length == 0 || first_substring_length == len) {
> > +               ret = -EINVAL;
> > +               goto free_kern;
> > +       }
> > +
> > +       parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > +       if (!parent_buf) {
> > +               ret = -ENOMEM;
> > +               goto free_kern;
> > +       }
> > +
> > +       ret = kstrtol(parent_buf, 0, &parsed_parent);
> > +       if (ret)
> > +               goto free_both;
> > +
> > +       child_buf = kern_buf + first_substring_length + 1;
> > +       ret = kstrtol(child_buf, 0, &parsed_child);
> > +       if (ret)
> > +               goto free_both;
> > +
> > +       *parent = make_kuid(current_user_ns(), parsed_parent);
> > +       if (!uid_valid(*parent)) {
> > +               ret = -EINVAL;
> > +               goto free_both;
> > +       }
> > +
> > +       *child = make_kuid(current_user_ns(), parsed_child);
> > +       if (!uid_valid(*child)) {
> > +               ret = -EINVAL;
> > +               goto free_both;
> > +       }
> > +
> > +free_both:
> > +       kfree(parent_buf);
> > +free_kern:
> > +       kfree(kern_buf);
> > +       return ret;
> > +}
> > +
> > +static ssize_t safesetid_file_write(struct file *file,
> > +                                   const char __user *buf,
> > +                                   size_t len,
> > +                                   loff_t *ppos)
> > +{
> > +       struct safesetid_file_entry *file_entry =
> > +               file->f_inode->i_private;
> > +       kuid_t parent;
> > +       kuid_t child;
> > +       int ret;
> > +
> > +       if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
>
> Maybe CAP_MAC_ADMIN instead of (the overloaded) CAP_SYS_ADMIN?

Makes sense. Done.

>
> > +               return -EPERM;
> > +
> > +       if (*ppos != 0)
> > +               return -EINVAL;
> > +
> > +       if (file_entry->type == SAFESETID_WHITELIST_FLUSH) {
> > +               flush_safesetid_whitelist_entries();
> > +               return len;
> > +       }
> > +
> > +       /*
> > +        * If we get to here, must be the case that file_entry->type equals
> > +        * SAFESETID_WHITELIST_ADD
>
> It seems a bit silly with only two options here, but it'll change for
> gids, yes? How about just building a switch() around file_entry->type
> instead and avoid needing to refactor this later?

Yes, there will likely be more entries when gids are introduced. Done.
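
For reference, the shape I have in mind is roughly the following. This is
only a userspace model of the dispatch, not the code in the next revision:
handle_write() and struct fake_policy are made-up names for illustration,
and the counters stand in for the real add/flush paths.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mirrors enum safesetid_whitelist_file_write_type from lsm.h. */
enum write_type {
	WHITELIST_ADD,
	WHITELIST_FLUSH,
};

/* Hypothetical stand-in for the real policy state; it only counts calls. */
struct fake_policy {
	int adds;
	int flushes;
};

/*
 * Model of the dispatch in safesetid_file_write(): returns len on success
 * so the caller does not retry the write, or -EINVAL for an unknown type.
 */
long handle_write(struct fake_policy *p, enum write_type type, size_t len)
{
	assert(p != NULL);

	switch (type) {
	case WHITELIST_FLUSH:
		p->flushes++;
		break;
	case WHITELIST_ADD:
		p->adds++;
		break;
	default:
		/* Future types (e.g. for GIDs) would get their own case. */
		return -EINVAL;
	}
	return (long)len;
}
```

The point of the default case is that a new file type added for gids fails
loudly instead of silently falling through to the "add" path.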

>
> > +        */
> > +       ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > +                                                        &child);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = add_safesetid_whitelist_entry(parent, child);
> > +       if (ret)
> > +               return ret;
> > +
> > +       /* Return len on success so caller won't keep trying to write */
> > +       return len;
> > +}
> > +
> > +static const struct file_operations safesetid_file_fops = {
> > +       .write = safesetid_file_write,
> > +};
> > +
> > +static void safesetid_shutdown_securityfs(void)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +               struct safesetid_file_entry *entry =
> > +                       &safesetid_files[i];
> > +               securityfs_remove(entry->dentry);
> > +               entry->dentry = NULL;
> > +       }
> > +
> > +       securityfs_remove(safesetid_policy_dir);
> > +       safesetid_policy_dir = NULL;
> > +}
> > +
> > +static int __init safesetid_init_securityfs(void)
> > +{
> > +       int i;
> > +       int ret;
> > +
> > +       safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > +       if (IS_ERR(safesetid_policy_dir)) {
> > +               ret = PTR_ERR(safesetid_policy_dir);
> > +               goto error;
> > +       }
> > +
> > +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +               struct safesetid_file_entry *entry =
> > +                       &safesetid_files[i];
> > +               entry->dentry = securityfs_create_file(
> > +                       entry->name, 0200, safesetid_policy_dir,
> > +                       entry, &safesetid_file_fops);
> > +               if (IS_ERR(entry->dentry)) {
> > +                       ret = PTR_ERR(entry->dentry);
> > +                       goto error;
> > +               }
> > +       }
> > +
> > +       return 0;
> > +
> > +error:
> > +       safesetid_shutdown_securityfs();
> > +       return ret;
> > +}
> > +fs_initcall(safesetid_init_securityfs);
> > --
> > 2.20.1.97.g81188d93c3-goog
> >
>
> But overall, it looks good to me. :)
>
> --
> Kees Cook


* Re: [PATCH v2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 19:49                                     ` [PATCH v2] " Micah Morton
@ 2019-01-15 19:53                                       ` Kees Cook
  0 siblings, 0 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-15 19:53 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 15, 2019 at 11:49 AM Micah Morton <mortonm@chromium.org> wrote:
> On Mon, Jan 14, 2019 at 4:38 PM Kees Cook <keescook@chromium.org> wrote:
> > On Fri, Jan 11, 2019 at 9:13 AM <mortonm@chromium.org> wrote:
> > > From: Micah Morton <mortonm@chromium.org>
> > > +static void setuid_policy_violation(kuid_t parent, kuid_t child)
> > > +{
> > > +       pr_warn("UID transition (%d -> %d) blocked",
> > > +               __kuid_val(parent),
> > > +               __kuid_val(child));
> > > +        /*
> > > +         * Kill this process to avoid potential security vulnerabilities
> > > +         * that could arise from a missing whitelist entry preventing a
> > > +         * privileged process from dropping to a lesser-privileged one.
> > > +         */
> > > +        do_exit(SIGKILL);
> >
> > I think I asked earlier if this should be an unblockable signal raise
> > instead of a do_exit(). I don't remember if that got answered?
>
> Could you elaborate on this a bit, or share a pointer to some code/doc
> that explains the difference? I don't recall this point being raised
> before (might have missed it), and I'm no expert on the different
> approaches to killing a process in this way.

Sure! So, do_exit() is a big hammer, and skips a lot of clean-up-like
things (for example, this will skip core dumping, if it was desired by
the process, etc). It certainly has its place (and this may be it),
but I think it would be more sensible to use:

force_sig(SIGKILL, current);

and then the regular processing continues after this, and the kernel
will check for pending signals before returning to userspace. Though
please check this with strace to make sure the bad setuid() call never
returns... if it does, then I've got this wrong and do_exit() really
is appropriate here.
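
If reading strace output by hand is a pain, a small userspace harness
could do the same check. The following is only a sketch under my own
assumptions: run_in_child(), kill_self() and return_normally() are
hypothetical helpers, not anything in the patch. It forks a child, runs
one call in it, and reports whether the child died from SIGKILL or the
call returned to userspace.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How the forked child ended. */
enum child_outcome {
	CHILD_RETURNED,		/* action returned, child exited normally */
	CHILD_SIGKILLED,	/* child was terminated by SIGKILL */
	CHILD_OTHER,		/* fork/wait failure or another termination */
};

/* Hypothetical helper: run action(arg) in a child and classify its fate. */
enum child_outcome run_in_child(int (*action)(uid_t), uid_t arg)
{
	int status;
	pid_t pid;

	assert(action != NULL);

	pid = fork();
	if (pid < 0)
		return CHILD_OTHER;
	if (pid == 0) {
		action(arg);
		/* Only reached if the call made it back to userspace. */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return CHILD_OTHER;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
		return CHILD_SIGKILLED;
	if (WIFEXITED(status))
		return CHILD_RETURNED;
	return CHILD_OTHER;
}

/* Control case: the child kills itself with SIGKILL. */
int kill_self(uid_t unused)
{
	(void)unused;
	return raise(SIGKILL);
}

/* Control case: the call simply returns. */
int return_normally(uid_t unused)
{
	(void)unused;
	return 0;
}
```

With a policy loaded for the test UID, run_in_child(setuid, 0) should then
come back as CHILD_SIGKILLED if force_sig() does what I expect, and as
CHILD_RETURNED if the blocked call actually returned.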

-- 
Kees Cook


* [PATCH v4 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 19:44                                       ` Kees Cook
@ 2019-01-15 21:50                                         ` mortonm
  2019-01-15 22:32                                           ` Kees Cook
  2019-01-15 21:58                                         ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: mortonm @ 2019-01-15 21:50 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton


From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
Changes since last patch set: More small mods suggested by Kees, also
changed the do_exit() operation in lsm.c to force_sig() instead.
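
For anyone reviewing the policy semantics before reading lsm.c, here is a
rough userspace model of the two whitelist lookups. It is a sketch, not
the kernel code: struct rule, uid_is_restricted() and transition_allowed()
are illustrative names, a flat array replaces the RCU hash table, and the
model deliberately ignores the rule that switching among a task's own
real/effective/saved UIDs needs no whitelist entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One whitelist entry: 'parent' may set*uid to 'child'. */
struct rule {
	unsigned int parent;
	unsigned int child;
};

/* A UID is restricted as soon as it appears as the parent of any rule. */
bool uid_is_restricted(const struct rule *rules, size_t n,
		       unsigned int parent)
{
	size_t i;

	assert(rules != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (rules[i].parent == parent)
			return true;
	}
	return false;
}

/*
 * Unrestricted UIDs are left to the normal kernel checks (modeled here as
 * "allowed"); restricted UIDs need an exact parent:child match.
 */
bool transition_allowed(const struct rule *rules, size_t n,
			unsigned int parent, unsigned int child)
{
	size_t i;

	if (!uid_is_restricted(rules, n, parent))
		return true;

	for (i = 0; i < n; i++) {
		if (rules[i].parent == parent && rules[i].child == child)
			return true;
	}
	return false;
}
```

So writing '123:456' to add_whitelist_policy both allows 123 -> 456 and
makes every other transition out of 123 subject to the whitelist.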

 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 security/Kconfig                            |   1 +
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  12 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 271 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  30 +++
 security/safesetid/securityfs.c             | 190 ++++++++++++++
 9 files changed, 621 insertions(+)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In absence of file capabilities, processes spawned on a Linux system that need
+to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, because using file capabilities to run a program with elevated
+privileges opens up possible security holes: any user with access to the
+file can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index 9842e21afd4a..a6ba95fbaa9f 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -46,3 +46,4 @@ subdirectories.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/security/Kconfig b/security/Kconfig
index 78dc12b7eeb3..9efc7a5e3280 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
 source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
+source "security/safesetid/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..c598b904938f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..bf89a47ffcc8
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,12 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..c38cab263362
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,271 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      unsigned int opts)
+{
+	if (cap == CAP_SETUID &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		if (!(opts & CAP_OPT_INSETID)) {
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
+				__kuid_val(cred->uid));
+			return -EPERM;
+		}
+	}
+	return 0;
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	pr_warn("UID transition (%u -> %u) blocked\n",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	force_sig(SIGKILL, current);
+	return -EACCES;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	default:
+		WARN_ONCE(1, "Unknown setid state %d\n", flags);
+		force_sig(SIGKILL, current);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	return 0;
+}
+
+DEFINE_LSM(safesetid_security_init) = {
+	.init = safesetid_security_init,
+};
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..bf78af9bf314
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..6c502f6d4fb0
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * In the case the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' will get updated but this
+ * function will return an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NUL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (!first_substring_length || kern_buf[first_substring_length] != ':') {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	switch (file_entry->type) {
+	case SAFESETID_WHITELIST_FLUSH:
+		flush_safesetid_whitelist_entries();
+		break;
+	case SAFESETID_WHITELIST_ADD:
+		ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+								 &child);
+		if (ret)
+			return ret;
+
+		ret = add_safesetid_whitelist_entry(parent, child);
+		if (ret)
+			return ret;
+		break;
+	default:
+		WARN_ONCE(1, "Unknown securityfs file %d\n", file_entry->type);
+		break;
+	}
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.20.1.97.g81188d93c3-goog
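
To illustrate the '<UID>:<UID>' policy string that
parse_safesetid_whitelist_policy() accepts, here is a minimal userspace
sketch of the same parsing steps. It is an illustration only, not the
kernel code: parse_one_uid() and parse_uid_pair() are hypothetical names,
the sketch uses strtoul() rather than the kernel's kstrtol(), and it skips
the user-namespace mapping done by make_kuid().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse the n-byte field at s as a single UID. */
static int parse_one_uid(const char *s, size_t n, uint32_t *out)
{
	char buf[32];
	char *endp;
	unsigned long v;

	if (n == 0 || n >= sizeof(buf))
		return -EINVAL;
	memcpy(buf, s, n);
	buf[n] = '\0';
	/* strtoul() would accept leading whitespace or a sign; reject both. */
	if (buf[0] < '0' || buf[0] > '9')
		return -EINVAL;
	errno = 0;
	v = strtoul(buf, &endp, 0);
	/* (uid_t)-1 is the invalid UID, so reject it along with overflow. */
	if (errno || *endp != '\0' || v >= UINT32_MAX)
		return -EINVAL;
	*out = (uint32_t)v;
	return 0;
}

/* Hypothetical helper: split "<UID>:<UID>" at ':' and parse both halves. */
static int parse_uid_pair(const char *policy, uint32_t *parent, uint32_t *child)
{
	const char *sep = strchr(policy, ':');
	size_t child_len;
	int ret;

	if (!sep)
		return -EINVAL;
	ret = parse_one_uid(policy, (size_t)(sep - policy), parent);
	if (ret)
		return ret;
	child_len = strlen(sep + 1);
	/* Tolerate one trailing newline, as written by echo. */
	if (child_len > 0 && sep[child_len] == '\n')
		child_len--;
	return parse_one_uid(sep + 1, child_len, child);
}
```

With this sketch, parse_uid_pair("123:456", &parent, &child) returns 0 and
fills in both values, while inputs with a missing separator or an empty
field return -EINVAL.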



* Re: [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 19:44                                       ` Kees Cook
  2019-01-15 21:50                                         ` [PATCH v4 " mortonm
@ 2019-01-15 21:58                                         ` Micah Morton
  1 sibling, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-15 21:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 15, 2019 at 11:44 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Jan 15, 2019 at 10:04 AM <mortonm@chromium.org> wrote:
> >
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> > Changes since the last patch set: Pulled out the "no-op" changes that
> > mark setid call sites in kernel/sys.c into a separate patch, and made
> > other small mods proposed by Kees Cook. NOTE: this patch is still using
> > do_exit(SIGKILL) to kill the process in check_uid_transition in lsm.c.
> > This may need to change, pending further discussion.
> >  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
> >  Documentation/admin-guide/LSM/index.rst     |   1 +
> >  security/Kconfig                            |   1 +
> >  security/Makefile                           |   2 +
> >  security/safesetid/Kconfig                  |  12 +
> >  security/safesetid/Makefile                 |   7 +
> >  security/safesetid/lsm.c                    | 266 ++++++++++++++++++++
> >  security/safesetid/lsm.h                    |  30 +++
> >  security/safesetid/securityfs.c             | 185 ++++++++++++++
> >  9 files changed, 611 insertions(+)
> >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >  create mode 100644 security/safesetid/Kconfig
> >  create mode 100644 security/safesetid/Makefile
> >  create mode 100644 security/safesetid/lsm.c
> >  create mode 100644 security/safesetid/lsm.h
> >  create mode 100644 security/safesetid/securityfs.c
> >
> > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > new file mode 100644
> > index 000000000000..ffb64be67f7a
> > --- /dev/null
> > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > @@ -0,0 +1,107 @@
> > +=========
> > +SafeSetID
> > +=========
> > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > +UID/GID transitions from a given UID/GID to only those approved by a
> > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > +allowing a user to set up user namespace UID mappings.
> > +
> > +
> > +Background
> > +==========
> > +In absence of file capabilities, processes spawned on a Linux system that need
> > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > +often preferable to use Linux runtime capabilities rather than file
> > +capabilities, since using file capabilities to run a program with elevated
> > +privileges opens up possible security holes since any user with access to the
> > +file can exec() that program to gain the elevated privileges.
> > +
> > +While it is possible to implement a tree of processes by giving full
> > +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> > +tree of processes under non-root user(s) in the first place. Specifically,
> > +since CAP_SETUID allows changing to any user on the system, including the root
> > +user, it is an overpowered capability for what is needed in this scenario,
> > +especially since programs often only call setuid() to drop privileges to a
> > +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> > +generally feasible way in Linux to restrict the potential UIDs that a user can
> > +switch to through setuid() beyond allowing a switch to any user on the system.
> > +This SafeSetID LSM seeks to provide a solution for restricting setid
> > +capabilities in such a way.
> > +
> > +The main use case for this LSM is to allow a non-root program to transition to
> > +other untrusted uids without full blown CAP_SETUID capabilities. The non-root
> > +program would still need CAP_SETUID to do any kind of transition, but the
> > +additional restrictions imposed by this LSM would mean it is a "safer" version
> > +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> > +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> > +namespace). The higher level goal is to allow for uid-based sandboxing of system
> > +services without having to give out CAP_SETUID all over the place just so that
> > +non-root programs can drop to even-lesser-privileged uids. This is especially
> > +relevant when one non-root daemon on the system should be allowed to spawn other
> > +processes as different uids, but it's undesirable to give the daemon a
> > +basically-root-equivalent CAP_SETUID.
> > +
> > +
> > +Other Approaches Considered
> > +===========================
> > +
> > +Solve this problem in userspace
> > +-------------------------------
> > +For candidate applications that would like to have restricted setid capabilities
> > +as implemented in this LSM, an alternative option would be to simply take away
> > +setid capabilities from the application completely and refactor the process
> > +spawning semantics in the application (e.g. by using a privileged helper program
> > +to do process spawning and UID/GID transitions). Unfortunately, there are a
> > +number of semantics around process spawning that would be affected by this, such
> > +as fork() calls where the program doesn’t immediately call exec() after the
> > +fork(), parent processes specifying custom environment variables or command line
> > +args for spawned child processes, or inheritance of file handles across a
> > +fork()/exec(). Because of this, a solution that uses a privileged helper in
> > +userspace would likely be less appealing to incorporate into existing projects
> > +that rely on certain process-spawning semantics in Linux.
> > +
> > +Use user namespaces
> > +-------------------
> > +Another possible approach would be to run a given process tree in its own user
> > +namespace and give programs in the tree setid capabilities. In this way,
> > +programs in the tree could change to any desired UID/GID in the context of their
> > +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> > +initial system user namespace, effectively preventing privilege escalation.
> > +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> > +without pairing them with other namespace types, which is not always an option.
> > +Linux checks for capabilities based off of the user namespace that “owns” some
> > +entity. For example, Linux has the notion that network namespaces are owned by
> > +the user namespace in which they were created. A consequence of this is that
> > +capability checks for access to a given network namespace are done by checking
> > +whether a task has the given capability in the context of the user namespace
> > +that owns the network namespace -- not necessarily the user namespace under
> > +which the given task runs. Therefore spawning a process in a new user namespace
> > +effectively prevents it from accessing the network namespace owned by the
> > +initial namespace. This is a deal-breaker for any application that expects to
> > +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> > +configurations. Using user namespaces in isolation causes problems regarding
> > +other system interactions, including use of pid namespaces and device creation.
> > +
> > +Use an existing LSM
> > +-------------------
> > +None of the other in-tree LSMs have the capability to gate setid transitions, or
> > +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> > +"Since setuid only affects the current process, and since the SELinux controls
> > +are not based on the Linux identity attributes, SELinux does not need to control
> > +this operation."
> > +
> > +
> > +Directions for use
> > +==================
> > +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> > +applicable restriction policy is in place. Policies are configured through
> > +securityfs by writing to the safesetid/add_whitelist_policy and
> > +safesetid/flush_whitelist_policies files at the location where securityfs is
> > +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> > +numbers, such as '123:456'. To flush the policies, any write to the file is
> > +sufficient. Again, configuring a policy for a UID will prevent that UID from
> > +obtaining auxiliary setid privileges, such as allowing a user to set up user
> > +namespace UID mappings.
> > diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> > index 9842e21afd4a..a6ba95fbaa9f 100644
> > --- a/Documentation/admin-guide/LSM/index.rst
> > +++ b/Documentation/admin-guide/LSM/index.rst
> > @@ -46,3 +46,4 @@ subdirectories.
> >     Smack
> >     tomoyo
> >     Yama
> > +   SafeSetID
> > diff --git a/security/Kconfig b/security/Kconfig
> > index 78dc12b7eeb3..9efc7a5e3280 100644
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
> >  source "security/apparmor/Kconfig"
> >  source "security/loadpin/Kconfig"
> >  source "security/yama/Kconfig"
> > +source "security/safesetid/Kconfig"
> >
> >  source "security/integrity/Kconfig"
> >
> > diff --git a/security/Makefile b/security/Makefile
> > index 4d2d3782ddef..c598b904938f 100644
> > --- a/security/Makefile
> > +++ b/security/Makefile
> > @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
> >  subdir-$(CONFIG_SECURITY_APPARMOR)     += apparmor
> >  subdir-$(CONFIG_SECURITY_YAMA)         += yama
> >  subdir-$(CONFIG_SECURITY_LOADPIN)      += loadpin
> > +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
> >
> >  # always enable default capabilities
> >  obj-y                                  += commoncap.o
> > @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)         += tomoyo/
> >  obj-$(CONFIG_SECURITY_APPARMOR)                += apparmor/
> >  obj-$(CONFIG_SECURITY_YAMA)            += yama/
> >  obj-$(CONFIG_SECURITY_LOADPIN)         += loadpin/
> > +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
> >  obj-$(CONFIG_CGROUP_DEVICE)            += device_cgroup.o
>
> Given the refactoring of the LSM enabling logic, you'll need to do
> some minor merging with the linux-next tree to get this to apply to
> security-next. That would make James's life easier, I think, though
> maybe James can speak to that, since I'm not sure how the trees are
> split right now.

These patches apply cleanly to security-next at the moment (unless I'm
doing something weird -- the last commit I see in the git log is mine
from last week: c1a85a00ea66cb6f0bd0f14e47c28c2b0999799f)

>
> >
> >  # Object integrity file lists
> > diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> > new file mode 100644
> > index 000000000000..bf89a47ffcc8
> > --- /dev/null
> > +++ b/security/safesetid/Kconfig
> > @@ -0,0 +1,12 @@
> > +config SECURITY_SAFESETID
> > +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> > +        default n
> > +        help
> > +          SafeSetID is an LSM module that gates the setid family of syscalls to
> > +          restrict UID/GID transitions from a given UID/GID to only those
> > +          approved by a system-wide whitelist. These restrictions also prohibit
> > +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> > +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> > +          UID mappings.
> > +
> > +          If you are unsure how to answer this question, answer N.
> > diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> > new file mode 100644
> > index 000000000000..6b0660321164
> > --- /dev/null
> > +++ b/security/safesetid/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the safesetid LSM.
> > +#
> > +
> > +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> > +safesetid-y := lsm.o securityfs.o
> > diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> > new file mode 100644
> > index 000000000000..aa7bd3323751
> > --- /dev/null
> > +++ b/security/safesetid/lsm.c
> > @@ -0,0 +1,266 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +
> > +#define pr_fmt(fmt) "SafeSetID: " fmt
> > +
> > +#include <asm/syscall.h>
> > +#include <linux/hashtable.h>
> > +#include <linux/lsm_hooks.h>
> > +#include <linux/module.h>
> > +#include <linux/ptrace.h>
> > +#include <linux/sched/task_stack.h>
> > +#include <linux/security.h>
> > +
> > +#define NUM_BITS 8 /* 256 buckets in hash table */
> > +
> > +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> > +
> > +/*
> > + * Hash table entry to store safesetid policy signifying that 'parent' user
> > + * can setid to 'child' user.
> > + */
> > +struct entry {
> > +       struct hlist_node next;
> > +       struct hlist_node dlist; /* for deletion cleanup */
> > +       uint64_t parent_kuid;
> > +       uint64_t child_kuid;
> > +};
> > +
> > +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> > +
> > +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> > +{
> > +       struct entry *entry;
> > +
> > +       rcu_read_lock();
> > +       hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                  entry, next, __kuid_val(parent)) {
> > +               if (entry->parent_kuid == __kuid_val(parent)) {
> > +                       rcu_read_unlock();
> > +                       return true;
> > +               }
> > +       }
> > +       rcu_read_unlock();
> > +
> > +       return false;
> > +}
> > +
> > +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> > +                                                   kuid_t child)
> > +{
> > +       struct entry *entry;
> > +
> > +       rcu_read_lock();
> > +       hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                  entry, next, __kuid_val(parent)) {
> > +               if (entry->parent_kuid == __kuid_val(parent) &&
> > +                   entry->child_kuid == __kuid_val(child)) {
> > +                       rcu_read_unlock();
> > +                       return true;
> > +               }
> > +       }
> > +       rcu_read_unlock();
> > +
> > +       return false;
> > +}
> > +
> > +static int safesetid_security_capable(const struct cred *cred,
> > +                                     struct user_namespace *ns,
> > +                                     int cap,
> > +                                     unsigned int opts)
> > +{
> > +       if (cap == CAP_SETUID &&
> > +           check_setuid_policy_hashtable_key(cred->uid)) {
> > +               if (!(opts & CAP_OPT_INSETID)) {
> > +                       /*
> > +                        * Deny if we're not in a set*uid() syscall to avoid
> > +                        * giving powers gated by CAP_SETUID that are related
> > +                        * to functionality other than calling set*uid() (e.g.
> > +                        * allowing user to set up userns uid mappings).
> > +                        */
> > +                       pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
> > +                               __kuid_val(cred->uid));
> > +                       return -1;
> > +                }
> > +       }
> > +       return 0;
> > +}
> > +
> > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > +{
> > +       if (check_setuid_policy_hashtable_key_value(parent, child))
> > +               return 0;
> > +       pr_warn("UID transition (%d -> %d) blocked",
> > +               __kuid_val(parent),
> > +               __kuid_val(child));
> > +        /*
> > +         * Kill this process to avoid potential security vulnerabilities
> > +         * that could arise from a missing whitelist entry preventing a
> > +         * privileged process from dropping to a lesser-privileged one.
> > +         */
> > +        do_exit(SIGKILL);
> > +}
>
> This needs double-checking, but I think you want this, to avoid
> missing various process clean-up steps (like performing a core dump if
> desired, etc):
>
> force_sig(SIGKILL, current);
> return -EACCES;
>
> But please double-check that a rejected setuid() syscall never
> completes and the process does die with SIGKILL.

Yep, this looks good. I changed those lines and see the following
strace output from a process that isn't allowed to setuid to root per
the whitelist policies:

...
setgid(0)                               = 0
setuid(0)                               = ?
+++ killed by SIGKILL +++

FWIW, I checked this with the following command on a ChromeOS device
in dev mode:

localhost ~ # strace -ff -o /tmp/strace /sbin/minijail0 -u shill -g shill \
    -c 0xc0 -- /sbin/capsh --user=root -- -c /usr/bin/whoami
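
The same check can be approximated without strace. Below is a minimal
userspace sketch, not part of the patch: fork a child, have it attempt
the transition, and inspect the wait status in the parent. The names
(enum setuid_outcome, classify_wait_status(), try_setuid_in_child()) are
hypothetical, and seeing OUTCOME_KILLED assumes a whitelist policy is
loaded that restricts the calling UID.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch. */
enum setuid_outcome {
	OUTCOME_ALLOWED,	/* setuid() returned 0 in the child */
	OUTCOME_REFUSED,	/* setuid() returned an error, e.g. EPERM */
	OUTCOME_KILLED,		/* child died from SIGKILL inside setuid() */
	OUTCOME_OTHER,		/* anything else */
};

/* Map a waitpid() status to one of the outcomes above. */
static enum setuid_outcome classify_wait_status(int status)
{
	if (WIFSIGNALED(status))
		return WTERMSIG(status) == SIGKILL ?
			OUTCOME_KILLED : OUTCOME_OTHER;
	if (WIFEXITED(status)) {
		if (WEXITSTATUS(status) == 0)
			return OUTCOME_ALLOWED;
		if (WEXITSTATUS(status) == 1)
			return OUTCOME_REFUSED;
	}
	return OUTCOME_OTHER;
}

/* Attempt setuid(target) in a forked child and report what happened. */
static enum setuid_outcome try_setuid_in_child(uid_t target)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return OUTCOME_OTHER;
	if (pid == 0)
		_exit(setuid(target) == 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) != pid)
		return OUTCOME_OTHER;
	return classify_wait_status(status);
}
```

Calling try_setuid_in_child(0) from a restricted UID with no matching
whitelist entry should then report OUTCOME_KILLED, matching the
"+++ killed by SIGKILL +++" line in the strace output above.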

>
> > +
> > +/*
> > + * Check whether there is either an exception for user under old cred struct to
> > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > + * Linux set*uid rules) even without CAP_SETUID.
> > + */
> > +static int safesetid_task_fix_setuid(struct cred *new,
> > +                                    const struct cred *old,
> > +                                    int flags)
> > +{
> > +
> > +       /* Do nothing if there are no setuid restrictions for this UID. */
> > +       if (!check_setuid_policy_hashtable_key(old->uid))
> > +               return 0;
> > +
> > +       switch (flags) {
> > +       case LSM_SETID_RE:
> > +               /*
> > +                * Users for which setuid restrictions exist can only set the
> > +                * real UID to the real UID or the effective UID, unless an
> > +                * explicit whitelist policy allows the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->uid) &&
> > +                       !uid_eq(old->euid, new->uid)) {
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               }
> > +               /*
> > +                * Users for which setuid restrictions exist can only set the
> > +                * effective UID to the real UID, the effective UID, or the
> > +                * saved set-UID, unless an explicit whitelist policy allows
> > +                * the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->euid) &&
> > +                       !uid_eq(old->euid, new->euid) &&
> > +                       !uid_eq(old->suid, new->euid)) {
> > +                       return check_uid_transition(old->euid, new->euid);
> > +               }
> > +               break;
> > +       case LSM_SETID_ID:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * real UID or saved set-UID unless an explicit whitelist
> > +                * policy allows the transition.
> > +                */
> > +               if (!uid_eq(old->uid, new->uid))
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               if (!uid_eq(old->suid, new->suid))
> > +                       return check_uid_transition(old->suid, new->suid);
> > +               break;
> > +       case LSM_SETID_RES:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * real UID, effective UID, or saved set-UID to anything but
> > +                * one of: the current real UID, the current effective UID or
> > +                * the current saved set-user-ID unless an explicit whitelist
> > +                * policy allows the transition.
> > +                */
> > +               if (!uid_eq(new->uid, old->uid) &&
> > +                       !uid_eq(new->uid, old->euid) &&
> > +                       !uid_eq(new->uid, old->suid)) {
> > +                       return check_uid_transition(old->uid, new->uid);
> > +               }
> > +               if (!uid_eq(new->euid, old->uid) &&
> > +                       !uid_eq(new->euid, old->euid) &&
> > +                       !uid_eq(new->euid, old->suid)) {
> > +                       return check_uid_transition(old->euid, new->euid);
> > +               }
> > +               if (!uid_eq(new->suid, old->uid) &&
> > +                       !uid_eq(new->suid, old->euid) &&
> > +                       !uid_eq(new->suid, old->suid)) {
> > +                       return check_uid_transition(old->suid, new->suid);
> > +               }
> > +               break;
> > +       case LSM_SETID_FS:
> > +               /*
> > +                * Users for which setuid restrictions exist cannot change the
> > +                * filesystem UID to anything but one of: the current real UID,
> > +                * the current effective UID or the current saved set-UID
> > +                * unless an explicit whitelist policy allows the transition.
> > +                */
> > +               if (!uid_eq(new->fsuid, old->uid)  &&
> > +                       !uid_eq(new->fsuid, old->euid)  &&
> > +                       !uid_eq(new->fsuid, old->suid) &&
> > +                       !uid_eq(new->fsuid, old->fsuid)) {
> > +                       return check_uid_transition(old->fsuid, new->fsuid);
> > +               }
> > +               break;
>
> As a robustness measure can you add a default case here that will
> "fail closed"? Something like:
>
> default:
>     WARN_ON_ONCE("Unknown setid state %d\n", flags);
>     force_sig(SIGKILL, current);
>     return -EINVAL;

Done.
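
For anyone following along, here is a rough user-space model of the
four cases with the fail-closed default added. It is only a sketch:
the enum, struct ids, check_transition() and the one-entry whitelist
(123 -> 456) are made up for illustration, and it leaves out both the
"does this UID have a policy at all" check and the SIGKILL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for LSM_SETID_ID/RE/RES/FS; values are arbitrary. */
enum setid_kind { KIND_ID = 1, KIND_RE, KIND_RES, KIND_FS };

struct ids { unsigned int uid, euid, suid, fsuid; };

/* Hypothetical one-entry whitelist: UID 123 may become UID 456. */
static int check_transition(unsigned int parent, unsigned int child)
{
	return (parent == 123 && child == 456) ? 0 : -EACCES;
}

/* True if id equals the old real, effective, or saved UID. */
static bool is_old_id(unsigned int id, const struct ids *old_ids)
{
	return id == old_ids->uid || id == old_ids->euid ||
	       id == old_ids->suid;
}

/* Returns 0 if the change is allowed, a negative errno value otherwise. */
int fix_setuid_model(const struct ids *new_ids, const struct ids *old_ids,
		     int kind)
{
	assert(new_ids && old_ids);

	switch (kind) {
	case KIND_RE:
		if (new_ids->uid != old_ids->uid &&
		    new_ids->uid != old_ids->euid)
			return check_transition(old_ids->uid, new_ids->uid);
		if (!is_old_id(new_ids->euid, old_ids))
			return check_transition(old_ids->euid, new_ids->euid);
		break;
	case KIND_ID:
		if (new_ids->uid != old_ids->uid)
			return check_transition(old_ids->uid, new_ids->uid);
		if (new_ids->suid != old_ids->suid)
			return check_transition(old_ids->suid, new_ids->suid);
		break;
	case KIND_RES:
		if (!is_old_id(new_ids->uid, old_ids))
			return check_transition(old_ids->uid, new_ids->uid);
		if (!is_old_id(new_ids->euid, old_ids))
			return check_transition(old_ids->euid, new_ids->euid);
		if (!is_old_id(new_ids->suid, old_ids))
			return check_transition(old_ids->suid, new_ids->suid);
		break;
	case KIND_FS:
		if (!is_old_id(new_ids->fsuid, old_ids) &&
		    new_ids->fsuid != old_ids->fsuid)
			return check_transition(old_ids->fsuid,
						new_ids->fsuid);
		break;
	default:
		/* Fail closed: an unrecognized kind is denied, never allowed. */
		return -EINVAL;
	}
	return 0;
}
```

The point of the default case is that a flag value nobody anticipated
ends in a denial instead of falling out of the switch to "return 0".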

>
> > +       }
> > +       return 0;
> > +}
> > +
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > +{
> > +       struct entry *new;
> > +
> > +       /* Return if entry already exists */
> > +       if (check_setuid_policy_hashtable_key_value(parent, child))
> > +               return 0;
> > +
> > +       new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > +       if (!new)
> > +               return -ENOMEM;
> > +       new->parent_kuid = __kuid_val(parent);
> > +       new->child_kuid = __kuid_val(child);
> > +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +       hash_add_rcu(safesetid_whitelist_hashtable,
> > +                    &new->next,
> > +                    __kuid_val(parent));
> > +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +       return 0;
> > +}
> > +
> > +void flush_safesetid_whitelist_entries(void)
> > +{
> > +       struct entry *entry;
> > +       struct hlist_node *hlist_node;
> > +       unsigned int bkt_loop_cursor;
> > +       HLIST_HEAD(free_list);
> > +
> > +       /*
> > +        * Could probably use hash_for_each_rcu here instead, but this should
> > +        * be fine as well.
> > +        */
> > +       spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +       hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > +                          hlist_node, entry, next) {
> > +               hash_del_rcu(&entry->next);
> > +               hlist_add_head(&entry->dlist, &free_list);
> > +       }
> > +       spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +       synchronize_rcu();
> > +       hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> > +               hlist_del(&entry->dlist);
> > +               kfree(entry);
> > +       }
> > +}
> > +
> > +static struct security_hook_list safesetid_security_hooks[] = {
> > +       LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > +       LSM_HOOK_INIT(capable, safesetid_security_capable)
> > +};
> > +
> > +static int __init safesetid_security_init(void)
> > +{
> > +       security_add_hooks(safesetid_security_hooks,
> > +                          ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > +
> > +       return 0;
> > +}
> > +
> > +DEFINE_LSM(safesetid_security_init) = {
> > +       .init = safesetid_security_init,
> > +};
> > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > new file mode 100644
> > index 000000000000..bf78af9bf314
> > --- /dev/null
> > +++ b/security/safesetid/lsm.h
> > @@ -0,0 +1,30 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#ifndef _SAFESETID_H
> > +#define _SAFESETID_H
> > +
> > +#include <linux/types.h>
> > +
> > +/* Function type. */
> > +enum safesetid_whitelist_file_write_type {
> > +       SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > +       SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > +};
> > +
> > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > +
> > +void flush_safesetid_whitelist_entries(void);
> > +
> > +#endif /* _SAFESETID_H */
> > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > new file mode 100644
> > index 000000000000..c3ce7b63b4af
> > --- /dev/null
> > +++ b/security/safesetid/securityfs.c
> > @@ -0,0 +1,185 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#include <linux/security.h>
> > +#include <linux/cred.h>
> > +
> > +#include "lsm.h"
> > +
> > +static struct dentry *safesetid_policy_dir;
> > +
> > +struct safesetid_file_entry {
> > +       const char *name;
> > +       enum safesetid_whitelist_file_write_type type;
> > +       struct dentry *dentry;
> > +};
> > +
> > +static struct safesetid_file_entry safesetid_files[] = {
> > +       {.name = "add_whitelist_policy",
> > +        .type = SAFESETID_WHITELIST_ADD},
> > +       {.name = "flush_whitelist_policies",
> > +        .type = SAFESETID_WHITELIST_FLUSH},
> > +};
> > +
> > +/*
> > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > + * variables pointed to by 'parent' and 'child' will get updated but this
> > + * function will return an error.
> > + */
> > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > +                                           size_t len,
> > +                                           kuid_t *parent,
> > +                                           kuid_t *child)
> > +{
> > +       char *kern_buf;
> > +       char *parent_buf;
> > +       char *child_buf;
> > +       const char separator[] = ":";
> > +       int ret;
> > +       size_t first_substring_length;
> > +       long parsed_parent;
> > +       long parsed_child;
> > +
> > +       /* Duplicate string from user memory and NULL-terminate */
> > +       kern_buf = memdup_user_nul(buf, len);
> > +       if (IS_ERR(kern_buf))
> > +               return PTR_ERR(kern_buf);
> > +
> > +       /*
> > +        * Format of |buf| string should be <UID>:<UID>.
> > +        * Find location of ":" in kern_buf (copied from |buf|).
> > +        */
> > +       first_substring_length = strcspn(kern_buf, separator);
> > +       if (first_substring_length == 0 || first_substring_length == len) {
> > +               ret = -EINVAL;
> > +               goto free_kern;
> > +       }
> > +
> > +       parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > +       if (!parent_buf) {
> > +               ret = -ENOMEM;
> > +               goto free_kern;
> > +       }
> > +
> > +       ret = kstrtol(parent_buf, 0, &parsed_parent);
> > +       if (ret)
> > +               goto free_both;
> > +
> > +       child_buf = kern_buf + first_substring_length + 1;
> > +       ret = kstrtol(child_buf, 0, &parsed_child);
> > +       if (ret)
> > +               goto free_both;
> > +
> > +       *parent = make_kuid(current_user_ns(), parsed_parent);
> > +       if (!uid_valid(*parent)) {
> > +               ret = -EINVAL;
> > +               goto free_both;
> > +       }
> > +
> > +       *child = make_kuid(current_user_ns(), parsed_child);
> > +       if (!uid_valid(*child)) {
> > +               ret = -EINVAL;
> > +               goto free_both;
> > +       }
> > +
> > +free_both:
> > +       kfree(parent_buf);
> > +free_kern:
> > +       kfree(kern_buf);
> > +       return ret;
> > +}
> > +
> > +static ssize_t safesetid_file_write(struct file *file,
> > +                                   const char __user *buf,
> > +                                   size_t len,
> > +                                   loff_t *ppos)
> > +{
> > +       struct safesetid_file_entry *file_entry =
> > +               file->f_inode->i_private;
> > +       kuid_t parent;
> > +       kuid_t child;
> > +       int ret;
> > +
> > +       if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> > +               return -EPERM;
> > +
> > +       if (*ppos != 0)
> > +               return -EINVAL;
> > +
> > +        switch (file_entry->type) {
> > +        case SAFESETID_WHITELIST_FLUSH:
> > +                flush_safesetid_whitelist_entries();
>
> missing break?

Thanks.

>
> > +        case SAFESETID_WHITELIST_ADD:
> > +                ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > +                                                                 &child);
> > +                if (ret)
> > +                        return ret;
> > +
> > +                ret = add_safesetid_whitelist_entry(parent, child);
> > +                if (ret)
> > +                        return ret;
>
> And add a default here too, something like:
>
> default:
>     WARN_ON_ONCE("Unknown securityfs file %d!?\n", file_entry->type);
>     break;
>

Done.
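
To make the two fixes concrete (the missing break and the default
case), here is a small user-space model of the write dispatch. It is a
sketch under assumptions: write_type, policy_entries, parse_policy()
and handle_write() are illustrative names, the whitelist is reduced to
a counter, and parsing is base 10 only, whereas the patch calls
kstrtol() with base 0. Unknown types are rejected with -EINVAL here,
which is stricter than the warn-and-break suggested above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

enum write_type { WL_ADD, WL_FLUSH };	/* hypothetical names */

unsigned int policy_entries;		/* stands in for the hash table */

/* Parse "<UID>:<UID>" (optional trailing newline); 0 or -EINVAL. */
int parse_policy(const char *s, unsigned long *parent, unsigned long *child)
{
	char *end;

	if (*s < '0' || *s > '9')
		return -EINVAL;
	errno = 0;
	*parent = strtoul(s, &end, 10);
	if (errno || *end != ':')
		return -EINVAL;

	s = end + 1;
	if (*s < '0' || *s > '9')
		return -EINVAL;
	*child = strtoul(s, &end, 10);
	if (*end == '\n')
		end++;
	if (errno || *end != '\0')
		return -EINVAL;
	return 0;
}

/* Returns len on success, a negative errno value on failure. */
long handle_write(enum write_type type, const char *buf, size_t len)
{
	unsigned long parent, child;
	int ret;

	assert(buf);

	switch (type) {
	case WL_FLUSH:
		policy_entries = 0;
		break;	/* without this, a flush falls into the add path */
	case WL_ADD:
		ret = parse_policy(buf, &parent, &child);
		if (ret)
			return ret;
		policy_entries++;
		break;
	default:
		return -EINVAL;
	}
	return (long)len;
}
```

With the break in place a flush no longer tries to parse its input as
a policy, so any write to the flush file succeeds.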

> > +        }
> > +
> > +        /* Return len on success so caller won't keep trying to write */
> > +        return len;
> > +}
> > +
> > +static const struct file_operations safesetid_file_fops = {
> > +       .write = safesetid_file_write,
> > +};
> > +
> > +static void safesetid_shutdown_securityfs(void)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +               struct safesetid_file_entry *entry =
> > +                       &safesetid_files[i];
> > +               securityfs_remove(entry->dentry);
> > +               entry->dentry = NULL;
> > +       }
> > +
> > +       securityfs_remove(safesetid_policy_dir);
> > +       safesetid_policy_dir = NULL;
> > +}
> > +
> > +static int __init safesetid_init_securityfs(void)
> > +{
> > +       int i;
> > +       int ret;
> > +
> > +       safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > +       if (!safesetid_policy_dir) {
> > +               ret = PTR_ERR(safesetid_policy_dir);
> > +               goto error;
> > +       }
> > +
> > +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +               struct safesetid_file_entry *entry =
> > +                       &safesetid_files[i];
> > +               entry->dentry = securityfs_create_file(
> > +                       entry->name, 0200, safesetid_policy_dir,
> > +                       entry, &safesetid_file_fops);
> > +               if (IS_ERR(entry->dentry)) {
> > +                       ret = PTR_ERR(entry->dentry);
> > +                       goto error;
> > +               }
> > +       }
> > +
> > +       return 0;
> > +
> > +error:
> > +       safesetid_shutdown_securityfs();
> > +       return ret;
> > +}
> > +fs_initcall(safesetid_init_securityfs);
> > --
> > 2.20.1.97.g81188d93c3-goog
> >
>
> And if I didn't say it before, thank you for the docs on this too! :)
>
> --
> Kees Cook

* Re: [PATCH v4 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 21:50                                         ` [PATCH v4 " mortonm
@ 2019-01-15 22:32                                           ` Kees Cook
  2019-01-16 15:46                                             ` [PATCH v5 " mortonm
  0 siblings, 1 reply; 88+ messages in thread
From: Kees Cook @ 2019-01-15 22:32 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 15, 2019 at 1:50 PM <mortonm@chromium.org> wrote:
> diff --git a/security/Kconfig b/security/Kconfig
> index 78dc12b7eeb3..9efc7a5e3280 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -236,6 +236,7 @@ source "security/tomoyo/Kconfig"
>  source "security/apparmor/Kconfig"
>  source "security/loadpin/Kconfig"
>  source "security/yama/Kconfig"
> +source "security/safesetid/Kconfig"
>
>  source "security/integrity/Kconfig"
>

In security-next, I'd expect "safesetid" to get added to "config LSM",
something like:

 config LSM
         string "Ordered list of enabled LSMs"
-        default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
+        default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
       help
           A comma-separated list of LSMs, in initialization order.


> diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> new file mode 100644
> index 000000000000..c38cab263362
> --- /dev/null
> +++ b/security/safesetid/lsm.c
> [...]
> +static struct security_hook_list safesetid_security_hooks[] = {
> +       LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> +       LSM_HOOK_INIT(capable, safesetid_security_capable)
> +};
> +
> +static int __init safesetid_security_init(void)
> +{
> +       security_add_hooks(safesetid_security_hooks,
> +                          ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> +
> +       return 0;
> +}

I think you need to add an "did I get initialized?" variable for the
securityfs init to check (see security/apparmor/apparmorfs.c).
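
Roughly what I have in mind, as a user-space model and not a drop-in
patch. The flag name matches what I'd expect (safesetid_initialized);
the two *_model() functions and policy_tree_created are made up for
illustration. My understanding is that the LSM init only runs when
the module is in the enabled list, while the fs_initcall runs
regardless, hence the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by the LSM init path, read by the later securityfs init. */
int safesetid_initialized;

/* Stands in for the securityfs directory and files. */
bool policy_tree_created;

/* Models the LSM init: register hooks, then record that we ran. */
int lsm_init_model(void)
{
	safesetid_initialized = 1;
	return 0;
}

/* Models the fs_initcall that builds the securityfs tree. */
int securityfs_init_model(void)
{
	/* LSM not enabled: skip tree creation, and do not report an error. */
	if (!safesetid_initialized)
		return 0;

	/* The initcall runs once; building the tree twice would be a bug. */
	assert(!policy_tree_created);
	policy_tree_created = true;
	return 0;
}
```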

> diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> new file mode 100644
> index 000000000000..bf78af9bf314
> --- /dev/null
> +++ b/security/safesetid/lsm.h
> [...]
> +static int __init safesetid_init_securityfs(void)
> +{
> +       int i;
> +       int ret;

And the init check would go here to skip tree creation if safesetid
isn't running.

> +
> +       safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> +       if (!safesetid_policy_dir) {
> +               ret = PTR_ERR(safesetid_policy_dir);
> +               goto error;
> +       }
> +
> +       for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +               struct safesetid_file_entry *entry =
> +                       &safesetid_files[i];
> +               entry->dentry = securityfs_create_file(
> +                       entry->name, 0200, safesetid_policy_dir,
> +                       entry, &safesetid_file_fops);
> +               if (IS_ERR(entry->dentry)) {
> +                       ret = PTR_ERR(entry->dentry);
> +                       goto error;
> +               }
> +       }
> +
> +       return 0;
> +
> +error:
> +       safesetid_shutdown_securityfs();
> +       return ret;
> +}
> +fs_initcall(safesetid_init_securityfs);

After that, feel free to include:

Acked-by: Kees Cook <keescook@chromium.org>

Thanks for the updates!

-- 
Kees Cook

* [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-15 22:32                                           ` Kees Cook
@ 2019-01-16 15:46                                             ` mortonm
  2019-01-16 16:10                                               ` Casey Schaufler
  2019-01-25 20:15                                               ` [PATCH v5 2/2] " James Morris
  0 siblings, 2 replies; 88+ messages in thread
From: mortonm @ 2019-01-16 15:46 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

SafeSetID gates the setid family of syscalls to restrict UID/GID
transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given
UIDs/GIDs from obtaining auxiliary privileges associated with
CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
mappings. For now, only gating the set*uid family of syscalls is
supported, with support for set*gid coming in a future patch set.

Signed-off-by: Micah Morton <mortonm@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
---
Changes since last patch:
  - added 'safesetid' to the ordered list of enabled LSMs in
    security/Kconfig.
  - added a "did I get initialized?" variable for the securityfs init to
    check and check that variable in securityfs.c to skip tree creation
    if safesetid isn't running
 Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
 Documentation/admin-guide/LSM/index.rst     |   1 +
 security/Kconfig                            |   3 +-
 security/Makefile                           |   2 +
 security/safesetid/Kconfig                  |  12 +
 security/safesetid/Makefile                 |   7 +
 security/safesetid/lsm.c                    | 277 ++++++++++++++++++++
 security/safesetid/lsm.h                    |  33 +++
 security/safesetid/securityfs.c             | 193 ++++++++++++++
 9 files changed, 634 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
 create mode 100644 security/safesetid/Kconfig
 create mode 100644 security/safesetid/Makefile
 create mode 100644 security/safesetid/lsm.c
 create mode 100644 security/safesetid/lsm.h
 create mode 100644 security/safesetid/securityfs.c
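
For reference, a minimal sketch of how a userspace tool might add a
policy in the '<UID>:<UID>' format described in the "Directions for
use" section of the documentation in this patch. The function names
are illustrative, and the path in the usage note assumes securityfs is
mounted at /sys/kernel/security, which may not hold on every system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Format "<parent>:<child>"; returns the length or -EINVAL if too long. */
int format_policy(char *buf, size_t size, unsigned int parent,
		  unsigned int child)
{
	int len = snprintf(buf, size, "%u:%u", parent, child);

	if (len < 0 || (size_t)len >= size)
		return -EINVAL;
	return len;
}

/* Write one policy to the file at path; 0 or a negative errno value. */
int safesetid_add_policy(const char *path, unsigned int parent,
			 unsigned int child)
{
	char buf[32];
	int len, fd, ret = 0;
	ssize_t n;

	assert(path != NULL);

	len = format_policy(buf, sizeof(buf), parent, child);
	if (len < 0)
		return len;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* One write() per policy: each write is parsed as a single entry. */
	n = write(fd, buf, (size_t)len);
	if (n < 0)
		ret = -errno;
	else if (n != len)
		ret = -EIO;

	close(fd);
	return ret;
}
```

A caller with CAP_MAC_ADMIN would then do something like
safesetid_add_policy("/sys/kernel/security/safesetid/add_whitelist_policy",
123, 456) to allow UID 123 to switch to UID 456.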

diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
new file mode 100644
index 000000000000..ffb64be67f7a
--- /dev/null
+++ b/Documentation/admin-guide/LSM/SafeSetID.rst
@@ -0,0 +1,107 @@
+=========
+SafeSetID
+=========
+SafeSetID is an LSM module that gates the setid family of syscalls to restrict
+UID/GID transitions from a given UID/GID to only those approved by a
+system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
+from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
+allowing a user to set up user namespace UID mappings.
+
+
+Background
+==========
+In absence of file capabilities, processes spawned on a Linux system that need
+to switch to a different user must be spawned with CAP_SETUID privileges.
+CAP_SETUID is granted to programs running as root or those running as a non-root
+user that have been explicitly given the CAP_SETUID runtime capability. It is
+often preferable to use Linux runtime capabilities rather than file
+capabilities, since using file capabilities to run a program with elevated
+privileges opens up possible security holes since any user with access to the
+file can exec() that program to gain the elevated privileges.
+
+While it is possible to implement a tree of processes by giving full
+CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
+tree of processes under non-root user(s) in the first place. Specifically,
+since CAP_SETUID allows changing to any user on the system, including the root
+user, it is an overpowered capability for what is needed in this scenario,
+especially since programs often only call setuid() to drop privileges to a
+lesser-privileged user -- not elevate privileges. Unfortunately, there is no
+generally feasible way in Linux to restrict the potential UIDs that a user can
+switch to through setuid() beyond allowing a switch to any user on the system.
+This SafeSetID LSM seeks to provide a solution for restricting setid
+capabilities in such a way.
+
+The main use case for this LSM is to allow a non-root program to transition to
+other untrusted uids without full blown CAP_SETUID capabilities. The non-root
+program would still need CAP_SETUID to do any kind of transition, but the
+additional restrictions imposed by this LSM would mean it is a "safer" version
+of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
+do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
+namespace). The higher level goal is to allow for uid-based sandboxing of system
+services without having to give out CAP_SETUID all over the place just so that
+non-root programs can drop to even-lesser-privileged uids. This is especially
+relevant when one non-root daemon on the system should be allowed to spawn other
+processes as different uids, but it's undesirable to give the daemon a
+basically-root-equivalent CAP_SETUID.
+
+
+Other Approaches Considered
+===========================
+
+Solve this problem in userspace
+-------------------------------
+For candidate applications that would like to have restricted setid capabilities
+as implemented in this LSM, an alternative option would be to simply take away
+setid capabilities from the application completely and refactor the process
+spawning semantics in the application (e.g. by using a privileged helper program
+to do process spawning and UID/GID transitions). Unfortunately, there are a
+number of semantics around process spawning that would be affected by this, such
+as fork() calls where the program doesn’t immediately call exec() after the
+fork(), parent processes specifying custom environment variables or command line
+args for spawned child processes, or inheritance of file handles across a
+fork()/exec(). Because of this, a solution that uses a privileged helper in
+userspace would likely be less appealing to incorporate into existing projects
+that rely on certain process-spawning semantics in Linux.
+
+Use user namespaces
+-------------------
+Another possible approach would be to run a given process tree in its own user
+namespace and give programs in the tree setid capabilities. In this way,
+programs in the tree could change to any desired UID/GID in the context of their
+own user namespace, and only approved UIDs/GIDs could be mapped back to the
+initial system user namespace, effectively preventing privilege escalation.
+Unfortunately, it is not generally feasible to use user namespaces in isolation,
+without pairing them with other namespace types, which is not always an option.
+Linux checks for capabilities based off of the user namespace that “owns” some
+entity. For example, Linux has the notion that network namespaces are owned by
+the user namespace in which they were created. A consequence of this is that
+capability checks for access to a given network namespace are done by checking
+whether a task has the given capability in the context of the user namespace
+that owns the network namespace -- not necessarily the user namespace under
+which the given task runs. Therefore spawning a process in a new user namespace
+effectively prevents it from accessing the network namespace owned by the
+initial namespace. This is a deal-breaker for any application that expects to
+retain the CAP_NET_ADMIN capability for the purpose of adjusting network
+configurations. Using user namespaces in isolation causes problems regarding
+other system interactions, including use of pid namespaces and device creation.
+
+Use an existing LSM
+-------------------
+None of the other in-tree LSMs have the capability to gate setid transitions, or
+even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
+"Since setuid only affects the current process, and since the SELinux controls
+are not based on the Linux identity attributes, SELinux does not need to control
+this operation."
+
+
+Directions for use
+==================
+This LSM hooks the setid syscalls to make sure transitions are allowed if an
+applicable restriction policy is in place. Policies are configured through
+securityfs by writing to the safesetid/add_whitelist_policy and
+safesetid/flush_whitelist_policies files at the location where securityfs is
+mounted. The format for adding a policy is '<UID>:<UID>', using literal
+numbers, such as '123:456'. To flush the policies, any write to the file is
+sufficient. Again, configuring a policy for a UID will prevent that UID from
+obtaining auxiliary setid privileges, such as allowing a user to set up user
+namespace UID mappings.
diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
index 9842e21afd4a..a6ba95fbaa9f 100644
--- a/Documentation/admin-guide/LSM/index.rst
+++ b/Documentation/admin-guide/LSM/index.rst
@@ -46,3 +46,4 @@ subdirectories.
    Smack
    tomoyo
    Yama
+   SafeSetID
diff --git a/security/Kconfig b/security/Kconfig
index 78dc12b7eeb3..9555f4914492 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -236,12 +236,13 @@ source "security/tomoyo/Kconfig"
 source "security/apparmor/Kconfig"
 source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
+source "security/safesetid/Kconfig"
 
 source "security/integrity/Kconfig"
 
 config LSM
 	string "Ordered list of enabled LSMs"
-	default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
+	default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
 	help
 	  A comma-separated list of LSMs, in initialization order.
 	  Any LSMs left off this list will be ignored. This can be
diff --git a/security/Makefile b/security/Makefile
index 4d2d3782ddef..c598b904938f 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
 subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
+subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
 obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
 obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
+obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
new file mode 100644
index 000000000000..bf89a47ffcc8
--- /dev/null
+++ b/security/safesetid/Kconfig
@@ -0,0 +1,12 @@
+config SECURITY_SAFESETID
+        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
+        default n
+        help
+          SafeSetID is an LSM module that gates the setid family of syscalls to
+          restrict UID/GID transitions from a given UID/GID to only those
+          approved by a system-wide whitelist. These restrictions also prohibit
+          the given UIDs/GIDs from obtaining auxiliary privileges associated
+          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
+          UID mappings.
+
+          If you are unsure how to answer this question, answer N.
diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
new file mode 100644
index 000000000000..6b0660321164
--- /dev/null
+++ b/security/safesetid/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the safesetid LSM.
+#
+
+obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
+safesetid-y := lsm.o securityfs.o
diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
new file mode 100644
index 000000000000..3a2c75ac810c
--- /dev/null
+++ b/security/safesetid/lsm.c
@@ -0,0 +1,277 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#define pr_fmt(fmt) "SafeSetID: " fmt
+
+#include <asm/syscall.h>
+#include <linux/hashtable.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/ptrace.h>
+#include <linux/sched/task_stack.h>
+#include <linux/security.h>
+
+/* Flag indicating whether initialization completed */
+int safesetid_initialized;
+
+#define NUM_BITS 8 /* 256 buckets in hash table */
+
+static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
+
+/*
+ * Hash table entry to store safesetid policy signifying that 'parent' user
+ * can setid to 'child' user.
+ */
+struct entry {
+	struct hlist_node next;
+	struct hlist_node dlist; /* for deletion cleanup */
+	uint64_t parent_kuid;
+	uint64_t child_kuid;
+};
+
+static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
+
+static bool check_setuid_policy_hashtable_key(kuid_t parent)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
+						    kuid_t child)
+{
+	struct entry *entry;
+
+	rcu_read_lock();
+	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
+				   entry, next, __kuid_val(parent)) {
+		if (entry->parent_kuid == __kuid_val(parent) &&
+		    entry->child_kuid == __kuid_val(child)) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
+}
+
+static int safesetid_security_capable(const struct cred *cred,
+				      struct user_namespace *ns,
+				      int cap,
+				      unsigned int opts)
+{
+	if (cap == CAP_SETUID &&
+	    check_setuid_policy_hashtable_key(cred->uid)) {
+		if (!(opts & CAP_OPT_INSETID)) {
+			/*
+			 * Deny if we're not in a set*uid() syscall to avoid
+			 * giving powers gated by CAP_SETUID that are related
+			 * to functionality other than calling set*uid() (e.g.
+			 * allowing user to set up userns uid mappings).
+			 */
+			pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
+				__kuid_val(cred->uid));
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int check_uid_transition(kuid_t parent, kuid_t child)
+{
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+	pr_warn("UID transition (%d -> %d) blocked",
+		__kuid_val(parent),
+		__kuid_val(child));
+	/*
+	 * Kill this process to avoid potential security vulnerabilities
+	 * that could arise from a missing whitelist entry preventing a
+	 * privileged process from dropping to a lesser-privileged one.
+	 */
+	force_sig(SIGKILL, current);
+	return -EACCES;
+}
+
+/*
+ * Check whether there is either an exception for user under old cred struct to
+ * set*uid to user under new cred struct, or the UID transition is allowed (by
+ * Linux set*uid rules) even without CAP_SETUID.
+ */
+static int safesetid_task_fix_setuid(struct cred *new,
+				     const struct cred *old,
+				     int flags)
+{
+
+	/* Do nothing if there are no setuid restrictions for this UID. */
+	if (!check_setuid_policy_hashtable_key(old->uid))
+		return 0;
+
+	switch (flags) {
+	case LSM_SETID_RE:
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * real UID to the real UID or the effective UID, unless an
+		 * explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid) &&
+			!uid_eq(old->euid, new->uid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		/*
+		 * Users for which setuid restrictions exist can only set the
+		 * effective UID to the real UID, the effective UID, or the
+		 * saved set-UID, unless an explicit whitelist policy allows
+		 * the transition.
+		 */
+		if (!uid_eq(old->uid, new->euid) &&
+			!uid_eq(old->euid, new->euid) &&
+			!uid_eq(old->suid, new->euid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		break;
+	case LSM_SETID_ID:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID or saved set-UID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(old->uid, new->uid))
+			return check_uid_transition(old->uid, new->uid);
+		if (!uid_eq(old->suid, new->suid))
+			return check_uid_transition(old->suid, new->suid);
+		break;
+	case LSM_SETID_RES:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * real UID, effective UID, or saved set-UID to anything but
+		 * one of: the current real UID, the current effective UID or
+		 * the current saved set-user-ID unless an explicit whitelist
+		 * policy allows the transition.
+		 */
+		if (!uid_eq(new->uid, old->uid) &&
+			!uid_eq(new->uid, old->euid) &&
+			!uid_eq(new->uid, old->suid)) {
+			return check_uid_transition(old->uid, new->uid);
+		}
+		if (!uid_eq(new->euid, old->uid) &&
+			!uid_eq(new->euid, old->euid) &&
+			!uid_eq(new->euid, old->suid)) {
+			return check_uid_transition(old->euid, new->euid);
+		}
+		if (!uid_eq(new->suid, old->uid) &&
+			!uid_eq(new->suid, old->euid) &&
+			!uid_eq(new->suid, old->suid)) {
+			return check_uid_transition(old->suid, new->suid);
+		}
+		break;
+	case LSM_SETID_FS:
+		/*
+		 * Users for which setuid restrictions exist cannot change the
+		 * filesystem UID to anything but one of: the current real UID,
+		 * the current effective UID or the current saved set-UID
+		 * unless an explicit whitelist policy allows the transition.
+		 */
+		if (!uid_eq(new->fsuid, old->uid)  &&
+			!uid_eq(new->fsuid, old->euid)  &&
+			!uid_eq(new->fsuid, old->suid) &&
+			!uid_eq(new->fsuid, old->fsuid)) {
+			return check_uid_transition(old->fsuid, new->fsuid);
+		}
+		break;
+	default:
+		pr_warn("Unknown setid state %d\n", flags);
+		force_sig(SIGKILL, current);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
+{
+	struct entry *new;
+
+	/* Return if entry already exists */
+	if (check_setuid_policy_hashtable_key_value(parent, child))
+		return 0;
+
+	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	new->parent_kuid = __kuid_val(parent);
+	new->child_kuid = __kuid_val(child);
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_add_rcu(safesetid_whitelist_hashtable,
+		     &new->next,
+		     __kuid_val(parent));
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	return 0;
+}
+
+void flush_safesetid_whitelist_entries(void)
+{
+	struct entry *entry;
+	struct hlist_node *hlist_node;
+	unsigned int bkt_loop_cursor;
+	HLIST_HEAD(free_list);
+
+	/*
+	 * Could probably use hash_for_each_rcu here instead, but this should
+	 * be fine as well.
+	 */
+	spin_lock(&safesetid_whitelist_hashtable_spinlock);
+	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
+			   hlist_node, entry, next) {
+		hash_del_rcu(&entry->next);
+		hlist_add_head(&entry->dlist, &free_list);
+	}
+	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
+	synchronize_rcu();
+	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
+		hlist_del(&entry->dlist);
+		kfree(entry);
+	}
+}
+
+static struct security_hook_list safesetid_security_hooks[] = {
+	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
+	LSM_HOOK_INIT(capable, safesetid_security_capable)
+};
+
+static int __init safesetid_security_init(void)
+{
+	security_add_hooks(safesetid_security_hooks,
+			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
+
+	/* Report that SafeSetID successfully initialized */
+	safesetid_initialized = 1;
+
+	return 0;
+}
+
+DEFINE_LSM(safesetid_security_init) = {
+	.init = safesetid_security_init,
+};
diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
new file mode 100644
index 000000000000..c1ea3c265fcf
--- /dev/null
+++ b/security/safesetid/lsm.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _SAFESETID_H
+#define _SAFESETID_H
+
+#include <linux/types.h>
+
+/* Flag indicating whether initialization completed */
+extern int safesetid_initialized;
+
+/* Function type. */
+enum safesetid_whitelist_file_write_type {
+	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
+	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
+};
+
+/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
+int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
+
+void flush_safesetid_whitelist_entries(void);
+
+#endif /* _SAFESETID_H */
diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
new file mode 100644
index 000000000000..61be4ee459cc
--- /dev/null
+++ b/security/safesetid/securityfs.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SafeSetID Linux Security Module
+ *
+ * Author: Micah Morton <mortonm@chromium.org>
+ *
+ * Copyright (C) 2018 The Chromium OS Authors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ */
+#include <linux/security.h>
+#include <linux/cred.h>
+
+#include "lsm.h"
+
+static struct dentry *safesetid_policy_dir;
+
+struct safesetid_file_entry {
+	const char *name;
+	enum safesetid_whitelist_file_write_type type;
+	struct dentry *dentry;
+};
+
+static struct safesetid_file_entry safesetid_files[] = {
+	{.name = "add_whitelist_policy",
+	 .type = SAFESETID_WHITELIST_ADD},
+	{.name = "flush_whitelist_policies",
+	 .type = SAFESETID_WHITELIST_FLUSH},
+};
+
+/*
+ * If the input buffer contains one or more invalid UIDs, the kuid_t
+ * variables pointed to by 'parent' and 'child' may be updated even though
+ * this function returns an error.
+ */
+static int parse_safesetid_whitelist_policy(const char __user *buf,
+					    size_t len,
+					    kuid_t *parent,
+					    kuid_t *child)
+{
+	char *kern_buf;
+	char *parent_buf;
+	char *child_buf;
+	const char separator[] = ":";
+	int ret;
+	size_t first_substring_length;
+	long parsed_parent;
+	long parsed_child;
+
+	/* Duplicate string from user memory and NUL-terminate */
+	kern_buf = memdup_user_nul(buf, len);
+	if (IS_ERR(kern_buf))
+		return PTR_ERR(kern_buf);
+
+	/*
+	 * Format of |buf| string should be <UID>:<UID>.
+	 * Find location of ":" in kern_buf (copied from |buf|).
+	 */
+	first_substring_length = strcspn(kern_buf, separator);
+	if (first_substring_length == 0 || first_substring_length == len) {
+		ret = -EINVAL;
+		goto free_kern;
+	}
+
+	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
+	if (!parent_buf) {
+		ret = -ENOMEM;
+		goto free_kern;
+	}
+
+	ret = kstrtol(parent_buf, 0, &parsed_parent);
+	if (ret)
+		goto free_both;
+
+	child_buf = kern_buf + first_substring_length + 1;
+	ret = kstrtol(child_buf, 0, &parsed_child);
+	if (ret)
+		goto free_both;
+
+	*parent = make_kuid(current_user_ns(), parsed_parent);
+	if (!uid_valid(*parent)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+	*child = make_kuid(current_user_ns(), parsed_child);
+	if (!uid_valid(*child)) {
+		ret = -EINVAL;
+		goto free_both;
+	}
+
+free_both:
+	kfree(parent_buf);
+free_kern:
+	kfree(kern_buf);
+	return ret;
+}
+
+static ssize_t safesetid_file_write(struct file *file,
+				    const char __user *buf,
+				    size_t len,
+				    loff_t *ppos)
+{
+	struct safesetid_file_entry *file_entry =
+		file->f_inode->i_private;
+	kuid_t parent;
+	kuid_t child;
+	int ret;
+
+	if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
+		return -EPERM;
+
+	if (*ppos != 0)
+		return -EINVAL;
+
+	switch (file_entry->type) {
+	case SAFESETID_WHITELIST_FLUSH:
+		flush_safesetid_whitelist_entries();
+		break;
+	case SAFESETID_WHITELIST_ADD:
+		ret = parse_safesetid_whitelist_policy(buf, len, &parent,
+								 &child);
+		if (ret)
+			return ret;
+
+		ret = add_safesetid_whitelist_entry(parent, child);
+		if (ret)
+			return ret;
+		break;
+	default:
+		pr_warn("Unknown securityfs file %d\n", file_entry->type);
+		break;
+	}
+
+	/* Return len on success so caller won't keep trying to write */
+	return len;
+}
+
+static const struct file_operations safesetid_file_fops = {
+	.write = safesetid_file_write,
+};
+
+static void safesetid_shutdown_securityfs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		securityfs_remove(entry->dentry);
+		entry->dentry = NULL;
+	}
+
+	securityfs_remove(safesetid_policy_dir);
+	safesetid_policy_dir = NULL;
+}
+
+static int __init safesetid_init_securityfs(void)
+{
+	int i;
+	int ret;
+
+	if (!safesetid_initialized)
+		return 0;
+
+	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
+	if (IS_ERR(safesetid_policy_dir)) {
+		ret = PTR_ERR(safesetid_policy_dir);
+		goto error;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
+		struct safesetid_file_entry *entry =
+			&safesetid_files[i];
+		entry->dentry = securityfs_create_file(
+			entry->name, 0200, safesetid_policy_dir,
+			entry, &safesetid_file_fops);
+		if (IS_ERR(entry->dentry)) {
+			ret = PTR_ERR(entry->dentry);
+			goto error;
+		}
+	}
+
+	return 0;
+
+error:
+	safesetid_shutdown_securityfs();
+	return ret;
+}
+fs_initcall(safesetid_init_securityfs);
-- 
2.20.1.321.g9e740568ce-goog



* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-16 15:46                                             ` [PATCH v5 " mortonm
@ 2019-01-16 16:10                                               ` Casey Schaufler
  2019-01-22 20:40                                                 ` Micah Morton
  2019-01-25 20:15                                               ` [PATCH v5 2/2] " James Morris
  1 sibling, 1 reply; 88+ messages in thread
From: Casey Schaufler @ 2019-01-16 16:10 UTC (permalink / raw)
  To: mortonm, jmorris, serge, keescook, sds, linux-security-module

On 1/16/2019 7:46 AM, mortonm@chromium.org wrote:
> From: Micah Morton <mortonm@chromium.org>
>
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> Acked-by: Kees Cook <keescook@chromium.org>

While I have some lesser reservations philosophically, all
direct technical objections have been addressed. 

Acked-by: Casey Schaufler <casey@schaufler-ca.com>

> ---
> Changes since last patch:
>   - added 'safesetid' to the ordered list of enabled LSMs in
>     security/Kconfig.
>   - added a "did I get initialized?" variable for the securityfs init to
>     check and check that variable in securityfs.c to skip tree creation
>     if safesetid isn't running
>  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
>  Documentation/admin-guide/LSM/index.rst     |   1 +
>  security/Kconfig                            |   3 +-
>  security/Makefile                           |   2 +
>  security/safesetid/Kconfig                  |  12 +
>  security/safesetid/Makefile                 |   7 +
>  security/safesetid/lsm.c                    | 277 ++++++++++++++++++++
>  security/safesetid/lsm.h                    |  33 +++
>  security/safesetid/securityfs.c             | 193 ++++++++++++++
>  9 files changed, 634 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
>  create mode 100644 security/safesetid/Kconfig
>  create mode 100644 security/safesetid/Makefile
>  create mode 100644 security/safesetid/lsm.c
>  create mode 100644 security/safesetid/lsm.h
>  create mode 100644 security/safesetid/securityfs.c
>
> diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> new file mode 100644
> index 000000000000..ffb64be67f7a
> --- /dev/null
> +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> @@ -0,0 +1,107 @@
> +=========
> +SafeSetID
> +=========
> +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> +UID/GID transitions from a given UID/GID to only those approved by a
> +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> +allowing a user to set up user namespace UID mappings.
> +
> +
> +Background
> +==========
> +In the absence of file capabilities, processes spawned on a Linux system that need
> +to switch to a different user must be spawned with CAP_SETUID privileges.
> +CAP_SETUID is granted to programs running as root or those running as a non-root
> +user that have been explicitly given the CAP_SETUID runtime capability. It is
> +often preferable to use Linux runtime capabilities rather than file
> +capabilities, because using file capabilities to run a program with elevated
> +privileges opens up possible security holes: any user with access to the
> +file can exec() that program to gain the elevated privileges.
> +
> +While it is possible to implement a tree of processes by giving full
> +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> +tree of processes under non-root user(s) in the first place. Specifically,
> +since CAP_SETUID allows changing to any user on the system, including the root
> +user, it is an overpowered capability for what is needed in this scenario,
> +especially since programs often only call setuid() to drop privileges to a
> +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> +generally feasible way in Linux to restrict the potential UIDs that a user can
> +switch to through setuid() beyond allowing a switch to any user on the system.
> +This SafeSetID LSM seeks to provide a solution for restricting setid
> +capabilities in such a way.
> +
> +The main use case for this LSM is to allow a non-root program to transition to
> +other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
> +program would still need CAP_SETUID to do any kind of transition, but the
> +additional restrictions imposed by this LSM would mean it is a "safer" version
> +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> +namespace). The higher level goal is to allow for uid-based sandboxing of system
> +services without having to give out CAP_SETUID all over the place just so that
> +non-root programs can drop to even-lesser-privileged uids. This is especially
> +relevant when one non-root daemon on the system should be allowed to spawn other
> +processes as different uids, but it's undesirable to give the daemon a
> +basically-root-equivalent CAP_SETUID.
> +
> +
> +Other Approaches Considered
> +===========================
> +
> +Solve this problem in userspace
> +-------------------------------
> +For candidate applications that would like to have restricted setid capabilities
> +as implemented in this LSM, an alternative option would be to simply take away
> +setid capabilities from the application completely and refactor the process
> +spawning semantics in the application (e.g. by using a privileged helper program
> +to do process spawning and UID/GID transitions). Unfortunately, there are a
> +number of semantics around process spawning that would be affected by this, such
> +as fork() calls where the program doesn’t immediately call exec() after the
> +fork(), parent processes specifying custom environment variables or command line
> +args for spawned child processes, or inheritance of file handles across a
> +fork()/exec(). Because of this, a solution that uses a privileged helper in
> +userspace would likely be less appealing to incorporate into existing projects
> +that rely on certain process-spawning semantics in Linux.
> +
> +Use user namespaces
> +-------------------
> +Another possible approach would be to run a given process tree in its own user
> +namespace and give programs in the tree setid capabilities. In this way,
> +programs in the tree could change to any desired UID/GID in the context of their
> +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> +initial system user namespace, effectively preventing privilege escalation.
> +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> +without pairing them with other namespace types, which is not always an option.
> +Linux checks for capabilities based on the user namespace that “owns” some
> +entity. For example, Linux has the notion that network namespaces are owned by
> +the user namespace in which they were created. A consequence of this is that
> +capability checks for access to a given network namespace are done by checking
> +whether a task has the given capability in the context of the user namespace
> +that owns the network namespace -- not necessarily the user namespace under
> +which the given task runs. Therefore spawning a process in a new user namespace
> +effectively prevents it from accessing the network namespace owned by the
> +initial namespace. This is a deal-breaker for any application that expects to
> +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> +configurations. Using user namespaces in isolation causes problems regarding
> +other system interactions, including use of pid namespaces and device creation.
> +
> +Use an existing LSM
> +-------------------
> +None of the other in-tree LSMs have the capability to gate setid transitions, or
> +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> +"Since setuid only affects the current process, and since the SELinux controls
> +are not based on the Linux identity attributes, SELinux does not need to control
> +this operation."
> +
> +
> +Directions for use
> +==================
> +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> +applicable restriction policy is in place. Policies are configured through
> +securityfs by writing to the safesetid/add_whitelist_policy and
> +safesetid/flush_whitelist_policies files at the location where securityfs is
> +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> +numbers, such as '123:456'. To flush the policies, any write to the file is
> +sufficient. Again, configuring a policy for a UID will prevent that UID from
> +obtaining auxiliary setid privileges, such as allowing a user to set up user
> +namespace UID mappings.
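
For readers following along, here is a minimal userspace sketch of adding a
policy in the '<UID>:<UID>' format described above. It is an illustration
only, not part of the patch: write_policy() is a hypothetical helper, and the
path in the usage note assumes securityfs is mounted at /sys/kernel/security.
Each write() is parsed on its own as one complete policy string, so the
helper issues exactly one write per policy. Per the CAP_MAC_ADMIN check in
safesetid_file_write(), the caller needs that capability.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in the patch): format "<parent>:<child>" and
 * write it to the already-existing file at 'path' with a single write().
 * Returns 0 on success, -1 on any failure.
 */
static int write_policy(const char *path, unsigned int parent,
			unsigned int child)
{
	char buf[32];
	ssize_t written;
	int fd, len, rc;

	assert(path != NULL);

	/* Literal decimal UIDs separated by ':' (e.g. "123:456"). */
	len = snprintf(buf, sizeof(buf), "%u:%u", parent, child);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* The policy file is created by the kernel, so no O_CREAT here. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* One write() call carries the whole policy string. */
	written = write(fd, buf, (size_t)len);
	rc = (written == (ssize_t)len) ? 0 : -1;

	if (close(fd) != 0)
		rc = -1;
	return rc;
}
```

Under the mount-point assumption above, allowing UID 123 to switch to UID 456
would be
write_policy("/sys/kernel/security/safesetid/add_whitelist_policy", 123, 456).
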
> diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> index 9842e21afd4a..a6ba95fbaa9f 100644
> --- a/Documentation/admin-guide/LSM/index.rst
> +++ b/Documentation/admin-guide/LSM/index.rst
> @@ -46,3 +46,4 @@ subdirectories.
>     Smack
>     tomoyo
>     Yama
> +   SafeSetID
> diff --git a/security/Kconfig b/security/Kconfig
> index 78dc12b7eeb3..9555f4914492 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -236,12 +236,13 @@ source "security/tomoyo/Kconfig"
>  source "security/apparmor/Kconfig"
>  source "security/loadpin/Kconfig"
>  source "security/yama/Kconfig"
> +source "security/safesetid/Kconfig"
>  
>  source "security/integrity/Kconfig"
>  
>  config LSM
>  	string "Ordered list of enabled LSMs"
> -	default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
> +	default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
>  	help
>  	  A comma-separated list of LSMs, in initialization order.
>  	  Any LSMs left off this list will be ignored. This can be
> diff --git a/security/Makefile b/security/Makefile
> index 4d2d3782ddef..c598b904938f 100644
> --- a/security/Makefile
> +++ b/security/Makefile
> @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
>  subdir-$(CONFIG_SECURITY_APPARMOR)	+= apparmor
>  subdir-$(CONFIG_SECURITY_YAMA)		+= yama
>  subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
> +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
>  
>  # always enable default capabilities
>  obj-y					+= commoncap.o
> @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)		+= tomoyo/
>  obj-$(CONFIG_SECURITY_APPARMOR)		+= apparmor/
>  obj-$(CONFIG_SECURITY_YAMA)		+= yama/
>  obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
> +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
>  obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
>  
>  # Object integrity file lists
> diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> new file mode 100644
> index 000000000000..bf89a47ffcc8
> --- /dev/null
> +++ b/security/safesetid/Kconfig
> @@ -0,0 +1,12 @@
> +config SECURITY_SAFESETID
> +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> +        default n
> +        help
> +          SafeSetID is an LSM module that gates the setid family of syscalls to
> +          restrict UID/GID transitions from a given UID/GID to only those
> +          approved by a system-wide whitelist. These restrictions also prohibit
> +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> +          UID mappings.
> +
> +          If you are unsure how to answer this question, answer N.
> diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> new file mode 100644
> index 000000000000..6b0660321164
> --- /dev/null
> +++ b/security/safesetid/Makefile
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Makefile for the safesetid LSM.
> +#
> +
> +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> +safesetid-y := lsm.o securityfs.o
> diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> new file mode 100644
> index 000000000000..3a2c75ac810c
> --- /dev/null
> +++ b/security/safesetid/lsm.c
> @@ -0,0 +1,277 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +
> +#define pr_fmt(fmt) "SafeSetID: " fmt
> +
> +#include <asm/syscall.h>
> +#include <linux/hashtable.h>
> +#include <linux/lsm_hooks.h>
> +#include <linux/module.h>
> +#include <linux/ptrace.h>
> +#include <linux/sched/task_stack.h>
> +#include <linux/security.h>
> +
> +/* Flag indicating whether initialization completed */
> +int safesetid_initialized;
> +
> +#define NUM_BITS 8 /* 256 buckets in hash table */
> +
> +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> +
> +/*
> + * Hash table entry to store safesetid policy signifying that 'parent' user
> + * can setid to 'child' user.
> + */
> +struct entry {
> +	struct hlist_node next;
> +	struct hlist_node dlist; /* for deletion cleanup */
> +	uint64_t parent_kuid;
> +	uint64_t child_kuid;
> +};
> +
> +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> +
> +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> +{
> +	struct entry *entry;
> +
> +	rcu_read_lock();
> +	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +				   entry, next, __kuid_val(parent)) {
> +		if (entry->parent_kuid == __kuid_val(parent)) {
> +			rcu_read_unlock();
> +			return true;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> +
> +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> +						    kuid_t child)
> +{
> +	struct entry *entry;
> +
> +	rcu_read_lock();
> +	hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> +				   entry, next, __kuid_val(parent)) {
> +		if (entry->parent_kuid == __kuid_val(parent) &&
> +		    entry->child_kuid == __kuid_val(child)) {
> +			rcu_read_unlock();
> +			return true;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return false;
> +}
> +
> +static int safesetid_security_capable(const struct cred *cred,
> +				      struct user_namespace *ns,
> +				      int cap,
> +				      unsigned int opts)
> +{
> +	if (cap == CAP_SETUID &&
> +	    check_setuid_policy_hashtable_key(cred->uid)) {
> +		if (!(opts & CAP_OPT_INSETID)) {
> +			/*
> +			 * Deny if we're not in a set*uid() syscall to avoid
> +			 * giving powers gated by CAP_SETUID that are related
> +			 * to functionality other than calling set*uid() (e.g.
> +			 * allowing user to set up userns uid mappings).
> +			 */
> +			pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
> +				__kuid_val(cred->uid));
> +			return -EPERM;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int check_uid_transition(kuid_t parent, kuid_t child)
> +{
> +	if (check_setuid_policy_hashtable_key_value(parent, child))
> +		return 0;
> +	pr_warn("UID transition (%u -> %u) blocked\n",
> +		__kuid_val(parent),
> +		__kuid_val(child));
> +	/*
> +	 * Kill this process to avoid potential security vulnerabilities
> +	 * that could arise from a missing whitelist entry preventing a
> +	 * privileged process from dropping to a lesser-privileged one.
> +	 */
> +	force_sig(SIGKILL, current);
> +	return -EACCES;
> +}
> +
> +/*
> + * Check whether there is either an exception for user under old cred struct to
> + * set*uid to user under new cred struct, or the UID transition is allowed (by
> + * Linux set*uid rules) even without CAP_SETUID.
> + */
> +static int safesetid_task_fix_setuid(struct cred *new,
> +				     const struct cred *old,
> +				     int flags)
> +{
> +
> +	/* Do nothing if there are no setuid restrictions for this UID. */
> +	if (!check_setuid_policy_hashtable_key(old->uid))
> +		return 0;
> +
> +	switch (flags) {
> +	case LSM_SETID_RE:
> +		/*
> +		 * Users for which setuid restrictions exist can only set the
> +		 * real UID to the real UID or the effective UID, unless an
> +		 * explicit whitelist policy allows the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->uid) &&
> +			!uid_eq(old->euid, new->uid)) {
> +			return check_uid_transition(old->uid, new->uid);
> +		}
> +		/*
> +		 * Users for which setuid restrictions exist can only set the
> +		 * effective UID to the real UID, the effective UID, or the
> +		 * saved set-UID, unless an explicit whitelist policy allows
> +		 * the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->euid) &&
> +			!uid_eq(old->euid, new->euid) &&
> +			!uid_eq(old->suid, new->euid)) {
> +			return check_uid_transition(old->euid, new->euid);
> +		}
> +		break;
> +	case LSM_SETID_ID:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * real UID or saved set-UID unless an explicit whitelist
> +		 * policy allows the transition.
> +		 */
> +		if (!uid_eq(old->uid, new->uid))
> +			return check_uid_transition(old->uid, new->uid);
> +		if (!uid_eq(old->suid, new->suid))
> +			return check_uid_transition(old->suid, new->suid);
> +		break;
> +	case LSM_SETID_RES:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * real UID, effective UID, or saved set-UID to anything but
> +		 * one of: the current real UID, the current effective UID or
> +		 * the current saved set-user-ID unless an explicit whitelist
> +		 * policy allows the transition.
> +		 */
> +		if (!uid_eq(new->uid, old->uid) &&
> +			!uid_eq(new->uid, old->euid) &&
> +			!uid_eq(new->uid, old->suid)) {
> +			return check_uid_transition(old->uid, new->uid);
> +		}
> +		if (!uid_eq(new->euid, old->uid) &&
> +			!uid_eq(new->euid, old->euid) &&
> +			!uid_eq(new->euid, old->suid)) {
> +			return check_uid_transition(old->euid, new->euid);
> +		}
> +		if (!uid_eq(new->suid, old->uid) &&
> +			!uid_eq(new->suid, old->euid) &&
> +			!uid_eq(new->suid, old->suid)) {
> +			return check_uid_transition(old->suid, new->suid);
> +		}
> +		break;
> +	case LSM_SETID_FS:
> +		/*
> +		 * Users for which setuid restrictions exist cannot change the
> +		 * filesystem UID to anything but one of: the current real UID,
> +		 * the current effective UID or the current saved set-UID
> +		 * unless an explicit whitelist policy allows the transition.
> +		 */
> +		if (!uid_eq(new->fsuid, old->uid)  &&
> +			!uid_eq(new->fsuid, old->euid)  &&
> +			!uid_eq(new->fsuid, old->suid) &&
> +			!uid_eq(new->fsuid, old->fsuid)) {
> +			return check_uid_transition(old->fsuid, new->fsuid);
> +		}
> +		break;
> +	default:
> +		pr_warn("Unknown setid state %d\n", flags);
> +		force_sig(SIGKILL, current);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
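
To make the decision logic in the hook above easier to follow, here is a
small userspace model of the LSM_SETID_RES case. It is a sketch under stated
assumptions, not the kernel code: the array-backed table, struct model_cred
and all function names are hypothetical stand-ins for the hash table and
struct cred. It follows the control flow of the code as posted, including
returning as soon as the first out-of-set field has been checked against the
whitelist. It does not model the SIGKILL sent on a blocked transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one whitelist entry (parent may become child). */
struct policy_entry {
	uint32_t parent;
	uint32_t child;
};

/* Hypothetical stand-in for the UID fields of struct cred. */
struct model_cred {
	uint32_t uid;	/* real UID */
	uint32_t euid;	/* effective UID */
	uint32_t suid;	/* saved set-UID */
};

/* Models check_setuid_policy_hashtable_key(): any entry for 'parent'? */
static bool has_policy(const struct policy_entry *tbl, size_t n,
		       uint32_t parent)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].parent == parent)
			return true;
	return false;
}

/* Models check_setuid_policy_hashtable_key_value(). */
static bool pair_allowed(const struct policy_entry *tbl, size_t n,
			 uint32_t parent, uint32_t child)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].parent == parent && tbl[i].child == child)
			return true;
	return false;
}

/* True if 'id' equals the old real, effective or saved UID. */
static bool is_old_id(const struct model_cred *old, uint32_t id)
{
	return id == old->uid || id == old->euid || id == old->suid;
}

/* Models the LSM_SETID_RES case: 0 if allowed, -1 if blocked. */
static int model_setresuid(const struct policy_entry *tbl, size_t n,
			   const struct model_cred *old,
			   const struct model_cred *new_cred)
{
	/* UIDs with no policy entry are not restricted at all. */
	if (!has_policy(tbl, n, old->uid))
		return 0;

	if (!is_old_id(old, new_cred->uid))
		return pair_allowed(tbl, n, old->uid, new_cred->uid) ? 0 : -1;
	if (!is_old_id(old, new_cred->euid))
		return pair_allowed(tbl, n, old->euid, new_cred->euid) ? 0 : -1;
	if (!is_old_id(old, new_cred->suid))
		return pair_allowed(tbl, n, old->suid, new_cred->suid) ? 0 : -1;
	return 0;
}
```

With a single entry {100, 200}, a process running entirely as UID 100 may
move to UID 200 but not to UID 0, while a UID with no entries is unaffected.
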
> +
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> +{
> +	struct entry *new;
> +
> +	/* Return if entry already exists */
> +	if (check_setuid_policy_hashtable_key_value(parent, child))
> +		return 0;
> +
> +	new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> +	if (!new)
> +		return -ENOMEM;
> +	new->parent_kuid = __kuid_val(parent);
> +	new->child_kuid = __kuid_val(child);
> +	spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +	hash_add_rcu(safesetid_whitelist_hashtable,
> +		     &new->next,
> +		     __kuid_val(parent));
> +	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +	return 0;
> +}
> +
> +void flush_safesetid_whitelist_entries(void)
> +{
> +	struct entry *entry;
> +	struct hlist_node *hlist_node;
> +	unsigned int bkt_loop_cursor;
> +	HLIST_HEAD(free_list);
> +
> +	/*
> +	 * Could probably use hash_for_each_rcu here instead, but this should
> +	 * be fine as well.
> +	 */
> +	spin_lock(&safesetid_whitelist_hashtable_spinlock);
> +	hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> +			   hlist_node, entry, next) {
> +		hash_del_rcu(&entry->next);
> +		hlist_add_head(&entry->dlist, &free_list);
> +	}
> +	spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> +	synchronize_rcu();
> +	hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> +		hlist_del(&entry->dlist);
> +		kfree(entry);
> +	}
> +}
> +
> +static struct security_hook_list safesetid_security_hooks[] = {
> +	LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> +	LSM_HOOK_INIT(capable, safesetid_security_capable)
> +};
> +
> +static int __init safesetid_security_init(void)
> +{
> +	security_add_hooks(safesetid_security_hooks,
> +			   ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> +
> +	/* Report that SafeSetID successfully initialized */
> +	safesetid_initialized = 1;
> +
> +	return 0;
> +}
> +
> +DEFINE_LSM(safesetid_security_init) = {
> +	.init = safesetid_security_init,
> +};
> diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> new file mode 100644
> index 000000000000..c1ea3c265fcf
> --- /dev/null
> +++ b/security/safesetid/lsm.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#ifndef _SAFESETID_H
> +#define _SAFESETID_H
> +
> +#include <linux/types.h>
> +
> +/* Flag indicating whether initialization completed */
> +extern int safesetid_initialized;
> +
> +/* Operation performed by a write to a safesetid securityfs file. */
> +enum safesetid_whitelist_file_write_type {
> +	SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> +	SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> +};
> +
> +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> +
> +void flush_safesetid_whitelist_entries(void);
> +
> +#endif /* _SAFESETID_H */
> diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> new file mode 100644
> index 000000000000..61be4ee459cc
> --- /dev/null
> +++ b/security/safesetid/securityfs.c
> @@ -0,0 +1,193 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SafeSetID Linux Security Module
> + *
> + * Author: Micah Morton <mortonm@chromium.org>
> + *
> + * Copyright (C) 2018 The Chromium OS Authors.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2, as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include <linux/security.h>
> +#include <linux/cred.h>
> +
> +#include "lsm.h"
> +
> +static struct dentry *safesetid_policy_dir;
> +
> +struct safesetid_file_entry {
> +	const char *name;
> +	enum safesetid_whitelist_file_write_type type;
> +	struct dentry *dentry;
> +};
> +
> +static struct safesetid_file_entry safesetid_files[] = {
> +	{.name = "add_whitelist_policy",
> +	 .type = SAFESETID_WHITELIST_ADD},
> +	{.name = "flush_whitelist_policies",
> +	 .type = SAFESETID_WHITELIST_FLUSH},
> +};
> +
> +/*
> + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> + * variables pointed to by 'parent' and 'child' will get updated but this
> + * function will return an error.
> + */
> +static int parse_safesetid_whitelist_policy(const char __user *buf,
> +					    size_t len,
> +					    kuid_t *parent,
> +					    kuid_t *child)
> +{
> +	char *kern_buf;
> +	char *parent_buf;
> +	char *child_buf;
> +	const char separator[] = ":";
> +	int ret;
> +	size_t first_substring_length;
> +	long parsed_parent;
> +	long parsed_child;
> +
> +	/* Duplicate string from user memory and NUL-terminate */
> +	kern_buf = memdup_user_nul(buf, len);
> +	if (IS_ERR(kern_buf))
> +		return PTR_ERR(kern_buf);
> +
> +	/*
> +	 * Format of |buf| string should be <UID>:<UID>.
> +	 * Find location of ":" in kern_buf (copied from |buf|).
> +	 */
> +	first_substring_length = strcspn(kern_buf, separator);
> +	if (first_substring_length == 0 || first_substring_length == len) {
> +		ret = -EINVAL;
> +		goto free_kern;
> +	}
> +
> +	parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> +	if (!parent_buf) {
> +		ret = -ENOMEM;
> +		goto free_kern;
> +	}
> +
> +	ret = kstrtol(parent_buf, 0, &parsed_parent);
> +	if (ret)
> +		goto free_both;
> +
> +	child_buf = kern_buf + first_substring_length + 1;
> +	ret = kstrtol(child_buf, 0, &parsed_child);
> +	if (ret)
> +		goto free_both;
> +
> +	*parent = make_kuid(current_user_ns(), parsed_parent);
> +	if (!uid_valid(*parent)) {
> +		ret = -EINVAL;
> +		goto free_both;
> +	}
> +
> +	*child = make_kuid(current_user_ns(), parsed_child);
> +	if (!uid_valid(*child)) {
> +		ret = -EINVAL;
> +		goto free_both;
> +	}
> +
> +free_both:
> +	kfree(parent_buf);
> +free_kern:
> +	kfree(kern_buf);
> +	return ret;
> +}
> +
> +static ssize_t safesetid_file_write(struct file *file,
> +				    const char __user *buf,
> +				    size_t len,
> +				    loff_t *ppos)
> +{
> +	struct safesetid_file_entry *file_entry =
> +		file->f_inode->i_private;
> +	kuid_t parent;
> +	kuid_t child;
> +	int ret;
> +
> +	if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> +		return -EPERM;
> +
> +	if (*ppos != 0)
> +		return -EINVAL;
> +
> +	switch (file_entry->type) {
> +	case SAFESETID_WHITELIST_FLUSH:
> +		flush_safesetid_whitelist_entries();
> +		break;
> +	case SAFESETID_WHITELIST_ADD:
> +		ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> +								 &child);
> +		if (ret)
> +			return ret;
> +
> +		ret = add_safesetid_whitelist_entry(parent, child);
> +		if (ret)
> +			return ret;
> +		break;
> +	default:
> +		pr_warn("Unknown securityfs file %d\n", file_entry->type);
> +		break;
> +	}
> +
> +	/* Return len on success so caller won't keep trying to write */
> +	return len;
> +}
> +
> +static const struct file_operations safesetid_file_fops = {
> +	.write = safesetid_file_write,
> +};
> +
> +static void safesetid_shutdown_securityfs(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +		struct safesetid_file_entry *entry =
> +			&safesetid_files[i];
> +		securityfs_remove(entry->dentry);
> +		entry->dentry = NULL;
> +	}
> +
> +	securityfs_remove(safesetid_policy_dir);
> +	safesetid_policy_dir = NULL;
> +}
> +
> +static int __init safesetid_init_securityfs(void)
> +{
> +	int i;
> +	int ret;
> +
> +	if (!safesetid_initialized)
> +		return 0;
> +
> +	safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> +	if (IS_ERR(safesetid_policy_dir)) {
> +		ret = PTR_ERR(safesetid_policy_dir);
> +		goto error;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> +		struct safesetid_file_entry *entry =
> +			&safesetid_files[i];
> +		entry->dentry = securityfs_create_file(
> +			entry->name, 0200, safesetid_policy_dir,
> +			entry, &safesetid_file_fops);
> +		if (IS_ERR(entry->dentry)) {
> +			ret = PTR_ERR(entry->dentry);
> +			goto error;
> +		}
> +	}
> +
> +	return 0;
> +
> +error:
> +	safesetid_shutdown_securityfs();
> +	return ret;
> +}
> +fs_initcall(safesetid_init_securityfs);
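
For readers following the '<UID>:<UID>' parsing in
parse_safesetid_whitelist_policy() above, here is a minimal userspace sketch
of the same idea. It is not the patch's code: parse_uid() and parse_policy()
are hypothetical names, strtoull() stands in for kstrtol(), and the choice to
reject the value (uid_t)-1 is an assumption that mirrors what uid_valid()
guards against in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse the digits in [s, end) as a UID.
 * Returns 0 on success, -EINVAL on empty, malformed or out-of-range input.
 */
static int parse_uid(const char *s, const char *end, uint32_t *out)
{
	char *stop;
	unsigned long long v;

	/* Reject empty fields, signs and leading whitespace up front. */
	if (s == end || *s < '0' || *s > '9')
		return -EINVAL;
	errno = 0;
	v = strtoull(s, &stop, 0);	/* base 0: decimal, octal or hex */
	/* The number must end exactly at 'end'; 0xffffffff is reserved. */
	if (errno || stop != end || v >= UINT32_MAX)
		return -EINVAL;
	*out = (uint32_t)v;
	return 0;
}

/*
 * Hypothetical userspace model of a "<UID>:<UID>" policy parser.
 * One trailing newline is tolerated, as with 'echo 123:456 > file'.
 * The outputs are written only if both halves parse.
 */
int parse_policy(const char *buf, uint32_t *parent, uint32_t *child)
{
	const char *sep, *end;
	uint32_t p, c;

	assert(buf && parent && child);
	sep = strchr(buf, ':');
	if (!sep)
		return -EINVAL;
	end = buf + strlen(buf);
	if (end > sep + 1 && end[-1] == '\n')
		end--;
	if (parse_uid(buf, sep, &p) || parse_uid(sep + 1, end, &c))
		return -EINVAL;
	*parent = p;
	*child = c;
	return 0;
}
```

One difference from the quoted function is deliberate: the sketch leaves
'parent' and 'child' untouched on failure, whereas the comment above
parse_safesetid_whitelist_policy() notes that they may be updated even when
an error is returned.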


* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-16 16:10                                               ` Casey Schaufler
@ 2019-01-22 20:40                                                 ` Micah Morton
  2019-01-22 22:28                                                   ` James Morris
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-22 20:40 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: James Morris, Serge E. Hallyn, Kees Cook, Stephen Smalley,
	linux-security-module

This has been Acked by Kees and Casey so far. Are there any further comments
on this? If not, it should be ready to merge.
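
As a usage illustration for the "Directions for use" text in the quoted
documentation, a policy could be installed from userspace roughly as in the
sketch below. The helper names format_policy() and add_policy() are
hypothetical, and the path assumes securityfs is mounted at the conventional
/sys/kernel/security location. The caller would also need CAP_MAC_ADMIN,
which the write handler in securityfs.c checks.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed default securityfs mount point; adjust if mounted elsewhere. */
#define SAFESETID_ADD_PATH \
	"/sys/kernel/security/safesetid/add_whitelist_policy"

/* Format one "<parent>:<child>" policy; returns its length or -1. */
int format_policy(char *buf, size_t size, unsigned int parent,
		  unsigned int child)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, size, "%u:%u", parent, child);
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Hypothetical helper: write one policy with a single write(2) on a
 * freshly opened file, so the file offset is 0 as the handler expects.
 * Returns 0 on success or a negative errno value.
 */
int add_policy(const char *path, unsigned int parent, unsigned int child)
{
	char buf[32];
	ssize_t written;
	int fd, n, ret = 0;

	n = format_policy(buf, sizeof(buf), parent, child);
	if (n < 0)
		return -EINVAL;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	written = write(fd, buf, (size_t)n);
	if (written < 0)
		ret = -errno;
	else if (written != n)
		ret = -EIO;
	close(fd);
	return ret;
}
```

A call such as add_policy(SAFESETID_ADD_PATH, 123, 456) would then correspond
to the '123:456' example in the documentation. Each policy is written
separately because the handler parses exactly one '<UID>:<UID>' pair per
write.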


On Wed, Jan 16, 2019 at 8:10 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> On 1/16/2019 7:46 AM, mortonm@chromium.org wrote:
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > Acked-by: Kees Cook <keescook@chromium.org>
>
> While I have some lesser reservations philosophically, all
> direct technical objections have been addressed.
>
> Acked-by: Casey Schaufler <casey@schaufler-ca.com>
>
> > ---
> > Changes since last patch:
> >   - added 'safesetid' to the ordered list of enabled LSMs in
> >     security/Kconfig.
> >   - added a "did I get initialized?" variable for the securityfs init to
> >     check and check that variable in securityfs.c to skip tree creation
> >     if safesetid isn't running
> >  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
> >  Documentation/admin-guide/LSM/index.rst     |   1 +
> >  security/Kconfig                            |   3 +-
> >  security/Makefile                           |   2 +
> >  security/safesetid/Kconfig                  |  12 +
> >  security/safesetid/Makefile                 |   7 +
> >  security/safesetid/lsm.c                    | 277 ++++++++++++++++++++
> >  security/safesetid/lsm.h                    |  33 +++
> >  security/safesetid/securityfs.c             | 193 ++++++++++++++
> >  9 files changed, 634 insertions(+), 1 deletion(-)
> >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> >  create mode 100644 security/safesetid/Kconfig
> >  create mode 100644 security/safesetid/Makefile
> >  create mode 100644 security/safesetid/lsm.c
> >  create mode 100644 security/safesetid/lsm.h
> >  create mode 100644 security/safesetid/securityfs.c
> >
> > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > new file mode 100644
> > index 000000000000..ffb64be67f7a
> > --- /dev/null
> > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > @@ -0,0 +1,107 @@
> > +=========
> > +SafeSetID
> > +=========
> > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > +UID/GID transitions from a given UID/GID to only those approved by a
> > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > +allowing a user to set up user namespace UID mappings.
> > +
> > +
> > +Background
> > +==========
> > +In the absence of file capabilities, processes spawned on a Linux system that
> > +need to switch to a different user must be spawned with CAP_SETUID privileges.
> > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > +often preferable to use Linux runtime capabilities rather than file
> > +capabilities, because using file capabilities to run a program with elevated
> > +privileges opens up possible security holes: any user with access to the
> > +file can exec() that program to gain the elevated privileges.
> > +
> > +While it is possible to implement a tree of processes by giving full
> > +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> > +tree of processes under non-root user(s) in the first place. Specifically,
> > +since CAP_SETUID allows changing to any user on the system, including the root
> > +user, it is an overpowered capability for what is needed in this scenario,
> > +especially since programs often only call setuid() to drop privileges to a
> > +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> > +generally feasible way in Linux to restrict the potential UIDs that a user can
> > +switch to through setuid() beyond allowing a switch to any user on the system.
> > +This SafeSetID LSM seeks to provide a solution for restricting setid
> > +capabilities in such a way.
> > +
> > +The main use case for this LSM is to allow a non-root program to transition to
> > +other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
> > +program would still need CAP_SETUID to do any kind of transition, but the
> > +additional restrictions imposed by this LSM would mean it is a "safer" version
> > +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> > +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> > +namespace). The higher level goal is to allow for uid-based sandboxing of system
> > +services without having to give out CAP_SETUID all over the place just so that
> > +non-root programs can drop to even-lesser-privileged uids. This is especially
> > +relevant when one non-root daemon on the system should be allowed to spawn other
> > +processes as different uids, but it's undesirable to give the daemon a
> > +basically-root-equivalent CAP_SETUID.
> > +
> > +
> > +Other Approaches Considered
> > +===========================
> > +
> > +Solve this problem in userspace
> > +-------------------------------
> > +For candidate applications that would like to have restricted setid capabilities
> > +as implemented in this LSM, an alternative option would be to simply take away
> > +setid capabilities from the application completely and refactor the process
> > +spawning semantics in the application (e.g. by using a privileged helper program
> > +to do process spawning and UID/GID transitions). Unfortunately, there are a
> > +number of semantics around process spawning that would be affected by this, such
> > +as fork() calls where the program doesn’t immediately call exec() after the
> > +fork(), parent processes specifying custom environment variables or command line
> > +args for spawned child processes, or inheritance of file handles across a
> > +fork()/exec(). Because of this, a solution that uses a privileged helper in
> > +userspace would likely be less appealing to incorporate into existing projects
> > +that rely on certain process-spawning semantics in Linux.
> > +
> > +Use user namespaces
> > +-------------------
> > +Another possible approach would be to run a given process tree in its own user
> > +namespace and give programs in the tree setid capabilities. In this way,
> > +programs in the tree could change to any desired UID/GID in the context of their
> > +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> > +initial system user namespace, effectively preventing privilege escalation.
> > +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> > +without pairing them with other namespace types, which is not always an option.
> > +Linux checks for capabilities based on the user namespace that “owns” some
> > +entity. For example, Linux has the notion that network namespaces are owned by
> > +the user namespace in which they were created. A consequence of this is that
> > +capability checks for access to a given network namespace are done by checking
> > +whether a task has the given capability in the context of the user namespace
> > +that owns the network namespace -- not necessarily the user namespace under
> > +which the given task runs. Therefore spawning a process in a new user namespace
> > +effectively prevents it from accessing the network namespace owned by the
> > +initial namespace. This is a deal-breaker for any application that expects to
> > +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> > +configurations. Using user namespaces in isolation causes problems regarding
> > +other system interactions, including use of pid namespaces and device creation.
> > +
> > +Use an existing LSM
> > +-------------------
> > +None of the other in-tree LSMs have the capability to gate setid transitions, or
> > +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> > +"Since setuid only affects the current process, and since the SELinux controls
> > +are not based on the Linux identity attributes, SELinux does not need to control
> > +this operation."
> > +
> > +
> > +Directions for use
> > +==================
> > +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> > +applicable restriction policy is in place. Policies are configured through
> > +securityfs by writing to the safesetid/add_whitelist_policy and
> > +safesetid/flush_whitelist_policies files at the location where securityfs is
> > +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> > +numbers, such as '123:456'. To flush the policies, any write to the
> > +flush_whitelist_policies file is sufficient. Again, configuring a policy for a
> > +UID will prevent that UID from obtaining auxiliary setid privileges, such as
> > +allowing a user to set up user namespace UID mappings.
> > diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> > index 9842e21afd4a..a6ba95fbaa9f 100644
> > --- a/Documentation/admin-guide/LSM/index.rst
> > +++ b/Documentation/admin-guide/LSM/index.rst
> > @@ -46,3 +46,4 @@ subdirectories.
> >     Smack
> >     tomoyo
> >     Yama
> > +   SafeSetID
> > diff --git a/security/Kconfig b/security/Kconfig
> > index 78dc12b7eeb3..9555f4914492 100644
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -236,12 +236,13 @@ source "security/tomoyo/Kconfig"
> >  source "security/apparmor/Kconfig"
> >  source "security/loadpin/Kconfig"
> >  source "security/yama/Kconfig"
> > +source "security/safesetid/Kconfig"
> >
> >  source "security/integrity/Kconfig"
> >
> >  config LSM
> >       string "Ordered list of enabled LSMs"
> > -     default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
> > +     default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
> >       help
> >         A comma-separated list of LSMs, in initialization order.
> >         Any LSMs left off this list will be ignored. This can be
> > diff --git a/security/Makefile b/security/Makefile
> > index 4d2d3782ddef..c598b904938f 100644
> > --- a/security/Makefile
> > +++ b/security/Makefile
> > @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
> >  subdir-$(CONFIG_SECURITY_APPARMOR)   += apparmor
> >  subdir-$(CONFIG_SECURITY_YAMA)               += yama
> >  subdir-$(CONFIG_SECURITY_LOADPIN)    += loadpin
> > +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
> >
> >  # always enable default capabilities
> >  obj-y                                        += commoncap.o
> > @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)               += tomoyo/
> >  obj-$(CONFIG_SECURITY_APPARMOR)              += apparmor/
> >  obj-$(CONFIG_SECURITY_YAMA)          += yama/
> >  obj-$(CONFIG_SECURITY_LOADPIN)               += loadpin/
> > +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
> >  obj-$(CONFIG_CGROUP_DEVICE)          += device_cgroup.o
> >
> >  # Object integrity file lists
> > diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> > new file mode 100644
> > index 000000000000..bf89a47ffcc8
> > --- /dev/null
> > +++ b/security/safesetid/Kconfig
> > @@ -0,0 +1,12 @@
> > +config SECURITY_SAFESETID
> > +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> > +        default n
> > +        help
> > +          SafeSetID is an LSM module that gates the setid family of syscalls to
> > +          restrict UID/GID transitions from a given UID/GID to only those
> > +          approved by a system-wide whitelist. These restrictions also prohibit
> > +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> > +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> > +          UID mappings.
> > +
> > +          If you are unsure how to answer this question, answer N.
> > diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> > new file mode 100644
> > index 000000000000..6b0660321164
> > --- /dev/null
> > +++ b/security/safesetid/Makefile
> > @@ -0,0 +1,7 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the safesetid LSM.
> > +#
> > +
> > +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> > +safesetid-y := lsm.o securityfs.o
> > diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> > new file mode 100644
> > index 000000000000..3a2c75ac810c
> > --- /dev/null
> > +++ b/security/safesetid/lsm.c
> > @@ -0,0 +1,277 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +
> > +#define pr_fmt(fmt) "SafeSetID: " fmt
> > +
> > +#include <asm/syscall.h>
> > +#include <linux/hashtable.h>
> > +#include <linux/lsm_hooks.h>
> > +#include <linux/module.h>
> > +#include <linux/ptrace.h>
> > +#include <linux/sched/task_stack.h>
> > +#include <linux/security.h>
> > +
> > +/* Flag indicating whether initialization completed */
> > +int safesetid_initialized;
> > +
> > +#define NUM_BITS 8 /* 256 buckets in hash table */
> > +
> > +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> > +
> > +/*
> > + * Hash table entry to store safesetid policy signifying that 'parent' user
> > + * can setid to 'child' user.
> > + */
> > +struct entry {
> > +     struct hlist_node next;
> > +     struct hlist_node dlist; /* for deletion cleanup */
> > +     uint64_t parent_kuid;
> > +     uint64_t child_kuid;
> > +};
> > +
> > +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> > +
> > +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> > +{
> > +     struct entry *entry;
> > +
> > +     rcu_read_lock();
> > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                entry, next, __kuid_val(parent)) {
> > +             if (entry->parent_kuid == __kuid_val(parent)) {
> > +                     rcu_read_unlock();
> > +                     return true;
> > +             }
> > +     }
> > +     rcu_read_unlock();
> > +
> > +     return false;
> > +}
> > +
> > +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> > +                                                 kuid_t child)
> > +{
> > +     struct entry *entry;
> > +
> > +     rcu_read_lock();
> > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > +                                entry, next, __kuid_val(parent)) {
> > +             if (entry->parent_kuid == __kuid_val(parent) &&
> > +                 entry->child_kuid == __kuid_val(child)) {
> > +                     rcu_read_unlock();
> > +                     return true;
> > +             }
> > +     }
> > +     rcu_read_unlock();
> > +
> > +     return false;
> > +}
> > +
> > +static int safesetid_security_capable(const struct cred *cred,
> > +                                   struct user_namespace *ns,
> > +                                   int cap,
> > +                                   unsigned int opts)
> > +{
> > +     if (cap == CAP_SETUID &&
> > +         check_setuid_policy_hashtable_key(cred->uid)) {
> > +             if (!(opts & CAP_OPT_INSETID)) {
> > +                     /*
> > +                      * Deny if we're not in a set*uid() syscall to avoid
> > +                      * giving powers gated by CAP_SETUID that are related
> > +                      * to functionality other than calling set*uid() (e.g.
> > +                      * allowing user to set up userns uid mappings).
> > +                      */
> > +                     pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
> > +                             __kuid_val(cred->uid));
> > +                     return -EPERM;
> > +             }
> > +     }
> > +     return 0;
> > +}
> > +
> > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > +{
> > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > +             return 0;
> > +     pr_warn("UID transition (%u -> %u) blocked\n",
> > +             __kuid_val(parent),
> > +             __kuid_val(child));
> > +     /*
> > +      * Kill this process to avoid potential security vulnerabilities
> > +      * that could arise from a missing whitelist entry preventing a
> > +      * privileged process from dropping to a lesser-privileged one.
> > +      */
> > +     force_sig(SIGKILL, current);
> > +     return -EACCES;
> > +}
> > +
> > +/*
> > + * Check whether there is either an exception for user under old cred struct to
> > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > + * Linux set*uid rules) even without CAP_SETUID.
> > + */
> > +static int safesetid_task_fix_setuid(struct cred *new,
> > +                                  const struct cred *old,
> > +                                  int flags)
> > +{
> > +
> > +     /* Do nothing if there are no setuid restrictions for this UID. */
> > +     if (!check_setuid_policy_hashtable_key(old->uid))
> > +             return 0;
> > +
> > +     switch (flags) {
> > +     case LSM_SETID_RE:
> > +             /*
> > +              * Users for which setuid restrictions exist can only set the
> > +              * real UID to the real UID or the effective UID, unless an
> > +              * explicit whitelist policy allows the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->uid) &&
> > +                     !uid_eq(old->euid, new->uid)) {
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             }
> > +             /*
> > +              * Users for which setuid restrictions exist can only set the
> > +              * effective UID to the real UID, the effective UID, or the
> > +              * saved set-UID, unless an explicit whitelist policy allows
> > +              * the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->euid) &&
> > +                     !uid_eq(old->euid, new->euid) &&
> > +                     !uid_eq(old->suid, new->euid)) {
> > +                     return check_uid_transition(old->euid, new->euid);
> > +             }
> > +             break;
> > +     case LSM_SETID_ID:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * real UID or saved set-UID unless an explicit whitelist
> > +              * policy allows the transition.
> > +              */
> > +             if (!uid_eq(old->uid, new->uid))
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             if (!uid_eq(old->suid, new->suid))
> > +                     return check_uid_transition(old->suid, new->suid);
> > +             break;
> > +     case LSM_SETID_RES:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * real UID, effective UID, or saved set-UID to anything but
> > +              * one of: the current real UID, the current effective UID or
> > +              * the current saved set-user-ID unless an explicit whitelist
> > +              * policy allows the transition.
> > +              */
> > +             if (!uid_eq(new->uid, old->uid) &&
> > +                     !uid_eq(new->uid, old->euid) &&
> > +                     !uid_eq(new->uid, old->suid)) {
> > +                     return check_uid_transition(old->uid, new->uid);
> > +             }
> > +             if (!uid_eq(new->euid, old->uid) &&
> > +                     !uid_eq(new->euid, old->euid) &&
> > +                     !uid_eq(new->euid, old->suid)) {
> > +                     return check_uid_transition(old->euid, new->euid);
> > +             }
> > +             if (!uid_eq(new->suid, old->uid) &&
> > +                     !uid_eq(new->suid, old->euid) &&
> > +                     !uid_eq(new->suid, old->suid)) {
> > +                     return check_uid_transition(old->suid, new->suid);
> > +             }
> > +             break;
> > +     case LSM_SETID_FS:
> > +             /*
> > +              * Users for which setuid restrictions exist cannot change the
> > +              * filesystem UID to anything but one of: the current real UID,
> > +              * the current effective UID or the current saved set-UID
> > +              * unless an explicit whitelist policy allows the transition.
> > +              */
> > +             if (!uid_eq(new->fsuid, old->uid)  &&
> > +                     !uid_eq(new->fsuid, old->euid)  &&
> > +                     !uid_eq(new->fsuid, old->suid) &&
> > +                     !uid_eq(new->fsuid, old->fsuid)) {
> > +                     return check_uid_transition(old->fsuid, new->fsuid);
> > +             }
> > +             break;
> > +     default:
> > +             pr_warn("Unknown setid state %d\n", flags);
> > +             force_sig(SIGKILL, current);
> > +             return -EINVAL;
> > +     }
> > +     return 0;
> > +}
> > +
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > +{
> > +     struct entry *new;
> > +
> > +     /* Return if entry already exists */
> > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > +             return 0;
> > +
> > +     new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > +     if (!new)
> > +             return -ENOMEM;
> > +     new->parent_kuid = __kuid_val(parent);
> > +     new->child_kuid = __kuid_val(child);
> > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +     hash_add_rcu(safesetid_whitelist_hashtable,
> > +                  &new->next,
> > +                  __kuid_val(parent));
> > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +     return 0;
> > +}
> > +
> > +void flush_safesetid_whitelist_entries(void)
> > +{
> > +     struct entry *entry;
> > +     struct hlist_node *hlist_node;
> > +     unsigned int bkt_loop_cursor;
> > +     HLIST_HEAD(free_list);
> > +
> > +     /*
> > +      * Could probably use hash_for_each_rcu here instead, but this should
> > +      * be fine as well.
> > +      */
> > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > +     hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > +                        hlist_node, entry, next) {
> > +             hash_del_rcu(&entry->next);
> > +             hlist_add_head(&entry->dlist, &free_list);
> > +     }
> > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > +     synchronize_rcu();
> > +     hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> > +             hlist_del(&entry->dlist);
> > +             kfree(entry);
> > +     }
> > +}
> > +
> > +static struct security_hook_list safesetid_security_hooks[] = {
> > +     LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > +     LSM_HOOK_INIT(capable, safesetid_security_capable)
> > +};
> > +
> > +static int __init safesetid_security_init(void)
> > +{
> > +     security_add_hooks(safesetid_security_hooks,
> > +                        ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > +
> > +     /* Report that SafeSetID successfully initialized */
> > +     safesetid_initialized = 1;
> > +
> > +     return 0;
> > +}
> > +
> > +DEFINE_LSM(safesetid_security_init) = {
> > +     .init = safesetid_security_init,
> > +};
> > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > new file mode 100644
> > index 000000000000..c1ea3c265fcf
> > --- /dev/null
> > +++ b/security/safesetid/lsm.h
> > @@ -0,0 +1,33 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#ifndef _SAFESETID_H
> > +#define _SAFESETID_H
> > +
> > +#include <linux/types.h>
> > +
> > +/* Flag indicating whether initialization completed */
> > +extern int safesetid_initialized;
> > +
> > +/* Type of write handled by a securityfs policy file. */
> > +enum safesetid_whitelist_file_write_type {
> > +     SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > +     SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > +};
> > +
> > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > +
> > +void flush_safesetid_whitelist_entries(void);
> > +
> > +#endif /* _SAFESETID_H */
> > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > new file mode 100644
> > index 000000000000..61be4ee459cc
> > --- /dev/null
> > +++ b/security/safesetid/securityfs.c
> > @@ -0,0 +1,193 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SafeSetID Linux Security Module
> > + *
> > + * Author: Micah Morton <mortonm@chromium.org>
> > + *
> > + * Copyright (C) 2018 The Chromium OS Authors.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2, as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#include <linux/security.h>
> > +#include <linux/cred.h>
> > +
> > +#include "lsm.h"
> > +
> > +static struct dentry *safesetid_policy_dir;
> > +
> > +struct safesetid_file_entry {
> > +     const char *name;
> > +     enum safesetid_whitelist_file_write_type type;
> > +     struct dentry *dentry;
> > +};
> > +
> > +static struct safesetid_file_entry safesetid_files[] = {
> > +     {.name = "add_whitelist_policy",
> > +      .type = SAFESETID_WHITELIST_ADD},
> > +     {.name = "flush_whitelist_policies",
> > +      .type = SAFESETID_WHITELIST_FLUSH},
> > +};
> > +
> > +/*
> > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > + * variables pointed to by 'parent' and 'child' will get updated but this
> > + * function will return an error.
> > + */
> > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > +                                         size_t len,
> > +                                         kuid_t *parent,
> > +                                         kuid_t *child)
> > +{
> > +     char *kern_buf;
> > +     char *parent_buf;
> > +     char *child_buf;
> > +     const char separator[] = ":";
> > +     int ret;
> > +     size_t first_substring_length;
> > +     long parsed_parent;
> > +     long parsed_child;
> > +
> > +     /* Duplicate string from user memory and NUL-terminate */
> > +     kern_buf = memdup_user_nul(buf, len);
> > +     if (IS_ERR(kern_buf))
> > +             return PTR_ERR(kern_buf);
> > +
> > +     /*
> > +      * Format of |buf| string should be <UID>:<UID>.
> > +      * Find location of ":" in kern_buf (copied from |buf|).
> > +      */
> > +     first_substring_length = strcspn(kern_buf, separator);
> > +     if (first_substring_length == 0 || first_substring_length == len) {
> > +             ret = -EINVAL;
> > +             goto free_kern;
> > +     }
> > +
> > +     parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > +     if (!parent_buf) {
> > +             ret = -ENOMEM;
> > +             goto free_kern;
> > +     }
> > +
> > +     ret = kstrtol(parent_buf, 0, &parsed_parent);
> > +     if (ret)
> > +             goto free_both;
> > +
> > +     child_buf = kern_buf + first_substring_length + 1;
> > +     ret = kstrtol(child_buf, 0, &parsed_child);
> > +     if (ret)
> > +             goto free_both;
> > +
> > +     *parent = make_kuid(current_user_ns(), parsed_parent);
> > +     if (!uid_valid(*parent)) {
> > +             ret = -EINVAL;
> > +             goto free_both;
> > +     }
> > +
> > +     *child = make_kuid(current_user_ns(), parsed_child);
> > +     if (!uid_valid(*child)) {
> > +             ret = -EINVAL;
> > +             goto free_both;
> > +     }
> > +
> > +free_both:
> > +     kfree(parent_buf);
> > +free_kern:
> > +     kfree(kern_buf);
> > +     return ret;
> > +}
> > +
> > +static ssize_t safesetid_file_write(struct file *file,
> > +                                 const char __user *buf,
> > +                                 size_t len,
> > +                                 loff_t *ppos)
> > +{
> > +     struct safesetid_file_entry *file_entry =
> > +             file->f_inode->i_private;
> > +     kuid_t parent;
> > +     kuid_t child;
> > +     int ret;
> > +
> > +     if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> > +             return -EPERM;
> > +
> > +     if (*ppos != 0)
> > +             return -EINVAL;
> > +
> > +     switch (file_entry->type) {
> > +     case SAFESETID_WHITELIST_FLUSH:
> > +             flush_safesetid_whitelist_entries();
> > +             break;
> > +     case SAFESETID_WHITELIST_ADD:
> > +             ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > +                                                              &child);
> > +             if (ret)
> > +                     return ret;
> > +
> > +             ret = add_safesetid_whitelist_entry(parent, child);
> > +             if (ret)
> > +                     return ret;
> > +             break;
> > +     default:
> > +             pr_warn("Unknown securityfs file %d\n", file_entry->type);
> > +             break;
> > +     }
> > +
> > +     /* Return len on success so caller won't keep trying to write */
> > +     return len;
> > +}
> > +
> > +static const struct file_operations safesetid_file_fops = {
> > +     .write = safesetid_file_write,
> > +};
> > +
> > +static void safesetid_shutdown_securityfs(void)
> > +{
> > +     int i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +             struct safesetid_file_entry *entry =
> > +                     &safesetid_files[i];
> > +             securityfs_remove(entry->dentry);
> > +             entry->dentry = NULL;
> > +     }
> > +
> > +     securityfs_remove(safesetid_policy_dir);
> > +     safesetid_policy_dir = NULL;
> > +}
> > +
> > +static int __init safesetid_init_securityfs(void)
> > +{
> > +     int i;
> > +     int ret;
> > +
> > +     if (!safesetid_initialized)
> > +             return 0;
> > +
> > +     safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > +     if (IS_ERR(safesetid_policy_dir)) {
> > +             ret = PTR_ERR(safesetid_policy_dir);
> > +             goto error;
> > +     }
> > +
> > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > +             struct safesetid_file_entry *entry =
> > +                     &safesetid_files[i];
> > +             entry->dentry = securityfs_create_file(
> > +                     entry->name, 0200, safesetid_policy_dir,
> > +                     entry, &safesetid_file_fops);
> > +             if (IS_ERR(entry->dentry)) {
> > +                     ret = PTR_ERR(entry->dentry);
> > +                     goto error;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +
> > +error:
> > +     safesetid_shutdown_securityfs();
> > +     return ret;
> > +}
> > +fs_initcall(safesetid_init_securityfs);
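
For readers who want to check a policy string before writing it to
add_whitelist_policy, the '<UID>:<UID>' format handled by
parse_safesetid_whitelist_policy() above can be sketched in userspace as
follows. This is only an illustrative sketch, not part of the patch:
parse_uid() and parse_policy_line() are hypothetical names, and the sketch
assumes decimal-only input (stricter than the kernel's kstrtol() with base 0),
an optional single trailing newline, and that 4294967295 is rejected because
(uid_t)-1 is the invalid UID.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: parse exactly n decimal digits from s into *out.
 * Rejects empty input, non-digits, and values >= UINT32_MAX, since
 * (uid_t)-1 is never a valid UID.
 */
static int parse_uid(const char *s, size_t n, uint32_t *out)
{
	uint64_t val = 0;
	size_t i;

	if (n == 0)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		if (s[i] < '0' || s[i] > '9')
			return -EINVAL;
		val = val * 10 + (uint64_t)(s[i] - '0');
		if (val >= UINT32_MAX)
			return -EINVAL;
	}

	*out = (uint32_t)val;
	return 0;
}

/*
 * Hypothetical helper: split "<parent>:<child>" at the first ':' and
 * parse both halves. As in the kernel parser, *parent may already have
 * been updated when an error is returned for the child half.
 */
static int parse_policy_line(const char *line, uint32_t *parent,
			     uint32_t *child)
{
	const char *sep;
	size_t child_len;
	int ret;

	assert(line && parent && child);

	sep = strchr(line, ':');
	if (!sep)
		return -EINVAL;

	ret = parse_uid(line, (size_t)(sep - line), parent);
	if (ret)
		return ret;

	/* Tolerate one trailing newline, e.g. from "echo 123:456". */
	child_len = strlen(sep + 1);
	if (child_len > 0 && sep[child_len] == '\n')
		child_len--;

	return parse_uid(sep + 1, child_len, child);
}
```

A caller would pass the same string it intends to write to
safesetid/add_whitelist_policy under the securityfs mount point (commonly
/sys/kernel/security, though that depends on the system) and skip the write
if parse_policy_line() returns -EINVAL.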

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-22 20:40                                                 ` Micah Morton
@ 2019-01-22 22:28                                                   ` James Morris
  2019-01-22 22:40                                                     ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: James Morris @ 2019-01-22 22:28 UTC (permalink / raw)
  To: Micah Morton
  Cc: Casey Schaufler, Serge E. Hallyn, Kees Cook, Stephen Smalley,
	linux-security-module

[-- Attachment #1: Type: text/plain, Size: 33668 bytes --]

On Tue, 22 Jan 2019, Micah Morton wrote:

> This has been Acked by Kees and Casey so far. Any further comments on
> this? If not, should be ready to merge?

Did you post a 'v5 1/2' ?

> 
> 
> On Wed, Jan 16, 2019 at 8:10 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >
> > On 1/16/2019 7:46 AM, mortonm@chromium.org wrote:
> > > From: Micah Morton <mortonm@chromium.org>
> > >
> > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > transitions from a given UID/GID to only those approved by a
> > > system-wide whitelist. These restrictions also prohibit the given
> > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > mappings. For now, only gating the set*uid family of syscalls is
> > > supported, with support for set*gid coming in a future patch set.
> > >
> > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > Acked-by: Kees Cook <keescook@chromium.org>
> >
> > While I have some lesser reservations philosophically, all
> > direct technical objections have been addressed.
> >
> > Acked-by: Casey Schaufler <casey@schaufler-ca.com>
> >
> > > ---
> > > Changes since last patch:
> > >   - added 'safesetid' to the ordered list of enabled LSMs in
> > >     security/Kconfig.
> > >   - added a "did I get initialized?" variable for the securityfs init to
> > >     check and check that variable in securityfs.c to skip tree creation
> > >     if safesetid isn't running
> > >  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
> > >  Documentation/admin-guide/LSM/index.rst     |   1 +
> > >  security/Kconfig                            |   3 +-
> > >  security/Makefile                           |   2 +
> > >  security/safesetid/Kconfig                  |  12 +
> > >  security/safesetid/Makefile                 |   7 +
> > >  security/safesetid/lsm.c                    | 277 ++++++++++++++++++++
> > >  security/safesetid/lsm.h                    |  33 +++
> > >  security/safesetid/securityfs.c             | 193 ++++++++++++++
> > >  9 files changed, 634 insertions(+), 1 deletion(-)
> > >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> > >  create mode 100644 security/safesetid/Kconfig
> > >  create mode 100644 security/safesetid/Makefile
> > >  create mode 100644 security/safesetid/lsm.c
> > >  create mode 100644 security/safesetid/lsm.h
> > >  create mode 100644 security/safesetid/securityfs.c
> > >
> > > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > new file mode 100644
> > > index 000000000000..ffb64be67f7a
> > > --- /dev/null
> > > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > @@ -0,0 +1,107 @@
> > > +=========
> > > +SafeSetID
> > > +=========
> > > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > > +UID/GID transitions from a given UID/GID to only those approved by a
> > > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > > +allowing a user to set up user namespace UID mappings.
> > > +
> > > +
> > > +Background
> > > +==========
> > > > +In the absence of file capabilities, processes spawned on a Linux system that need
> > > +to switch to a different user must be spawned with CAP_SETUID privileges.
> > > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > > +often preferable to use Linux runtime capabilities rather than file
> > > +capabilities, since using file capabilities to run a program with elevated
> > > > +privileges opens up possible security holes: any user with access to the
> > > +file can exec() that program to gain the elevated privileges.
> > > +
> > > +While it is possible to implement a tree of processes by giving full
> > > +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> > > +tree of processes under non-root user(s) in the first place. Specifically,
> > > +since CAP_SETUID allows changing to any user on the system, including the root
> > > +user, it is an overpowered capability for what is needed in this scenario,
> > > +especially since programs often only call setuid() to drop privileges to a
> > > +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> > > +generally feasible way in Linux to restrict the potential UIDs that a user can
> > > +switch to through setuid() beyond allowing a switch to any user on the system.
> > > +This SafeSetID LSM seeks to provide a solution for restricting setid
> > > +capabilities in such a way.
> > > +
> > > +The main use case for this LSM is to allow a non-root program to transition to
> > > > +other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
> > > +program would still need CAP_SETUID to do any kind of transition, but the
> > > +additional restrictions imposed by this LSM would mean it is a "safer" version
> > > +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> > > +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> > > +namespace). The higher level goal is to allow for uid-based sandboxing of system
> > > +services without having to give out CAP_SETUID all over the place just so that
> > > +non-root programs can drop to even-lesser-privileged uids. This is especially
> > > +relevant when one non-root daemon on the system should be allowed to spawn other
> > > > +processes as different uids, but it's undesirable to give the daemon a
> > > +basically-root-equivalent CAP_SETUID.
> > > +
> > > +
> > > +Other Approaches Considered
> > > +===========================
> > > +
> > > +Solve this problem in userspace
> > > +-------------------------------
> > > +For candidate applications that would like to have restricted setid capabilities
> > > +as implemented in this LSM, an alternative option would be to simply take away
> > > +setid capabilities from the application completely and refactor the process
> > > +spawning semantics in the application (e.g. by using a privileged helper program
> > > +to do process spawning and UID/GID transitions). Unfortunately, there are a
> > > +number of semantics around process spawning that would be affected by this, such
> > > +as fork() calls where the program doesn’t immediately call exec() after the
> > > +fork(), parent processes specifying custom environment variables or command line
> > > +args for spawned child processes, or inheritance of file handles across a
> > > > +fork()/exec(). Because of this, a solution that uses a privileged helper in
> > > +userspace would likely be less appealing to incorporate into existing projects
> > > +that rely on certain process-spawning semantics in Linux.
> > > +
> > > +Use user namespaces
> > > +-------------------
> > > +Another possible approach would be to run a given process tree in its own user
> > > +namespace and give programs in the tree setid capabilities. In this way,
> > > +programs in the tree could change to any desired UID/GID in the context of their
> > > +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> > > > +initial system user namespace, effectively preventing privilege escalation.
> > > +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> > > +without pairing them with other namespace types, which is not always an option.
> > > +Linux checks for capabilities based off of the user namespace that “owns” some
> > > +entity. For example, Linux has the notion that network namespaces are owned by
> > > +the user namespace in which they were created. A consequence of this is that
> > > +capability checks for access to a given network namespace are done by checking
> > > +whether a task has the given capability in the context of the user namespace
> > > +that owns the network namespace -- not necessarily the user namespace under
> > > +which the given task runs. Therefore spawning a process in a new user namespace
> > > +effectively prevents it from accessing the network namespace owned by the
> > > +initial namespace. This is a deal-breaker for any application that expects to
> > > +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> > > +configurations. Using user namespaces in isolation causes problems regarding
> > > +other system interactions, including use of pid namespaces and device creation.
> > > +
> > > +Use an existing LSM
> > > +-------------------
> > > +None of the other in-tree LSMs have the capability to gate setid transitions, or
> > > +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> > > +"Since setuid only affects the current process, and since the SELinux controls
> > > +are not based on the Linux identity attributes, SELinux does not need to control
> > > +this operation."
> > > +
> > > +
> > > +Directions for use
> > > +==================
> > > +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> > > +applicable restriction policy is in place. Policies are configured through
> > > +securityfs by writing to the safesetid/add_whitelist_policy and
> > > +safesetid/flush_whitelist_policies files at the location where securityfs is
> > > +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> > > +numbers, such as '123:456'. To flush the policies, any write to the file is
> > > +sufficient. Again, configuring a policy for a UID will prevent that UID from
> > > +obtaining auxiliary setid privileges, such as allowing a user to set up user
> > > +namespace UID mappings.
> > > diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> > > index 9842e21afd4a..a6ba95fbaa9f 100644
> > > --- a/Documentation/admin-guide/LSM/index.rst
> > > +++ b/Documentation/admin-guide/LSM/index.rst
> > > @@ -46,3 +46,4 @@ subdirectories.
> > >     Smack
> > >     tomoyo
> > >     Yama
> > > +   SafeSetID
> > > diff --git a/security/Kconfig b/security/Kconfig
> > > index 78dc12b7eeb3..9555f4914492 100644
> > > --- a/security/Kconfig
> > > +++ b/security/Kconfig
> > > @@ -236,12 +236,13 @@ source "security/tomoyo/Kconfig"
> > >  source "security/apparmor/Kconfig"
> > >  source "security/loadpin/Kconfig"
> > >  source "security/yama/Kconfig"
> > > +source "security/safesetid/Kconfig"
> > >
> > >  source "security/integrity/Kconfig"
> > >
> > >  config LSM
> > >       string "Ordered list of enabled LSMs"
> > > -     default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
> > > +     default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
> > >       help
> > >         A comma-separated list of LSMs, in initialization order.
> > >         Any LSMs left off this list will be ignored. This can be
> > > diff --git a/security/Makefile b/security/Makefile
> > > index 4d2d3782ddef..c598b904938f 100644
> > > --- a/security/Makefile
> > > +++ b/security/Makefile
> > > @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
> > >  subdir-$(CONFIG_SECURITY_APPARMOR)   += apparmor
> > >  subdir-$(CONFIG_SECURITY_YAMA)               += yama
> > >  subdir-$(CONFIG_SECURITY_LOADPIN)    += loadpin
> > > +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
> > >
> > >  # always enable default capabilities
> > >  obj-y                                        += commoncap.o
> > > @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)               += tomoyo/
> > >  obj-$(CONFIG_SECURITY_APPARMOR)              += apparmor/
> > >  obj-$(CONFIG_SECURITY_YAMA)          += yama/
> > >  obj-$(CONFIG_SECURITY_LOADPIN)               += loadpin/
> > > +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
> > >  obj-$(CONFIG_CGROUP_DEVICE)          += device_cgroup.o
> > >
> > >  # Object integrity file lists
> > > diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> > > new file mode 100644
> > > index 000000000000..bf89a47ffcc8
> > > --- /dev/null
> > > +++ b/security/safesetid/Kconfig
> > > @@ -0,0 +1,12 @@
> > > +config SECURITY_SAFESETID
> > > +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> > > +        default n
> > > +        help
> > > +          SafeSetID is an LSM module that gates the setid family of syscalls to
> > > +          restrict UID/GID transitions from a given UID/GID to only those
> > > +          approved by a system-wide whitelist. These restrictions also prohibit
> > > +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> > > +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> > > +          UID mappings.
> > > +
> > > +          If you are unsure how to answer this question, answer N.
> > > diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> > > new file mode 100644
> > > index 000000000000..6b0660321164
> > > --- /dev/null
> > > +++ b/security/safesetid/Makefile
> > > @@ -0,0 +1,7 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +#
> > > +# Makefile for the safesetid LSM.
> > > +#
> > > +
> > > +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> > > +safesetid-y := lsm.o securityfs.o
> > > diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> > > new file mode 100644
> > > index 000000000000..3a2c75ac810c
> > > --- /dev/null
> > > +++ b/security/safesetid/lsm.c
> > > @@ -0,0 +1,277 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * SafeSetID Linux Security Module
> > > + *
> > > + * Author: Micah Morton <mortonm@chromium.org>
> > > + *
> > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2, as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + */
> > > +
> > > +#define pr_fmt(fmt) "SafeSetID: " fmt
> > > +
> > > +#include <asm/syscall.h>
> > > +#include <linux/hashtable.h>
> > > +#include <linux/lsm_hooks.h>
> > > +#include <linux/module.h>
> > > +#include <linux/ptrace.h>
> > > +#include <linux/sched/task_stack.h>
> > > +#include <linux/security.h>
> > > +
> > > +/* Flag indicating whether initialization completed */
> > > +int safesetid_initialized;
> > > +
> > > > +#define NUM_BITS 8 /* 256 buckets in hash table */
> > > +
> > > +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> > > +
> > > +/*
> > > + * Hash table entry to store safesetid policy signifying that 'parent' user
> > > + * can setid to 'child' user.
> > > + */
> > > +struct entry {
> > > +     struct hlist_node next;
> > > +     struct hlist_node dlist; /* for deletion cleanup */
> > > +     uint64_t parent_kuid;
> > > +     uint64_t child_kuid;
> > > +};
> > > +
> > > +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> > > +
> > > +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> > > +{
> > > +     struct entry *entry;
> > > +
> > > +     rcu_read_lock();
> > > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > > +                                entry, next, __kuid_val(parent)) {
> > > +             if (entry->parent_kuid == __kuid_val(parent)) {
> > > +                     rcu_read_unlock();
> > > +                     return true;
> > > +             }
> > > +     }
> > > +     rcu_read_unlock();
> > > +
> > > +     return false;
> > > +}
> > > +
> > > +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> > > +                                                 kuid_t child)
> > > +{
> > > +     struct entry *entry;
> > > +
> > > +     rcu_read_lock();
> > > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > > +                                entry, next, __kuid_val(parent)) {
> > > +             if (entry->parent_kuid == __kuid_val(parent) &&
> > > +                 entry->child_kuid == __kuid_val(child)) {
> > > +                     rcu_read_unlock();
> > > +                     return true;
> > > +             }
> > > +     }
> > > +     rcu_read_unlock();
> > > +
> > > +     return false;
> > > +}
> > > +
> > > +static int safesetid_security_capable(const struct cred *cred,
> > > +                                   struct user_namespace *ns,
> > > +                                   int cap,
> > > +                                   unsigned int opts)
> > > +{
> > > +     if (cap == CAP_SETUID &&
> > > +         check_setuid_policy_hashtable_key(cred->uid)) {
> > > +             if (!(opts & CAP_OPT_INSETID)) {
> > > +                     /*
> > > +                      * Deny if we're not in a set*uid() syscall to avoid
> > > +                      * giving powers gated by CAP_SETUID that are related
> > > +                      * to functionality other than calling set*uid() (e.g.
> > > +                      * allowing user to set up userns uid mappings).
> > > +                      */
> > > > +                     pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions\n",
> > > > +                             __kuid_val(cred->uid));
> > > > +                     return -EPERM;
> > > +             }
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > > +{
> > > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > > +             return 0;
> > > > +     pr_warn("UID transition (%u -> %u) blocked\n",
> > > +             __kuid_val(parent),
> > > +             __kuid_val(child));
> > > +     /*
> > > +      * Kill this process to avoid potential security vulnerabilities
> > > +      * that could arise from a missing whitelist entry preventing a
> > > +      * privileged process from dropping to a lesser-privileged one.
> > > +      */
> > > +     force_sig(SIGKILL, current);
> > > +     return -EACCES;
> > > +}
> > > +
> > > +/*
> > > + * Check whether there is either an exception for user under old cred struct to
> > > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > > + * Linux set*uid rules) even without CAP_SETUID.
> > > + */
> > > +static int safesetid_task_fix_setuid(struct cred *new,
> > > +                                  const struct cred *old,
> > > +                                  int flags)
> > > +{
> > > +
> > > +     /* Do nothing if there are no setuid restrictions for this UID. */
> > > +     if (!check_setuid_policy_hashtable_key(old->uid))
> > > +             return 0;
> > > +
> > > +     switch (flags) {
> > > +     case LSM_SETID_RE:
> > > +             /*
> > > +              * Users for which setuid restrictions exist can only set the
> > > +              * real UID to the real UID or the effective UID, unless an
> > > +              * explicit whitelist policy allows the transition.
> > > +              */
> > > +             if (!uid_eq(old->uid, new->uid) &&
> > > +                     !uid_eq(old->euid, new->uid)) {
> > > +                     return check_uid_transition(old->uid, new->uid);
> > > +             }
> > > +             /*
> > > +              * Users for which setuid restrictions exist can only set the
> > > +              * effective UID to the real UID, the effective UID, or the
> > > +              * saved set-UID, unless an explicit whitelist policy allows
> > > +              * the transition.
> > > +              */
> > > +             if (!uid_eq(old->uid, new->euid) &&
> > > +                     !uid_eq(old->euid, new->euid) &&
> > > +                     !uid_eq(old->suid, new->euid)) {
> > > +                     return check_uid_transition(old->euid, new->euid);
> > > +             }
> > > +             break;
> > > +     case LSM_SETID_ID:
> > > +             /*
> > > +              * Users for which setuid restrictions exist cannot change the
> > > +              * real UID or saved set-UID unless an explicit whitelist
> > > +              * policy allows the transition.
> > > +              */
> > > +             if (!uid_eq(old->uid, new->uid))
> > > +                     return check_uid_transition(old->uid, new->uid);
> > > +             if (!uid_eq(old->suid, new->suid))
> > > +                     return check_uid_transition(old->suid, new->suid);
> > > +             break;
> > > +     case LSM_SETID_RES:
> > > +             /*
> > > +              * Users for which setuid restrictions exist cannot change the
> > > +              * real UID, effective UID, or saved set-UID to anything but
> > > +              * one of: the current real UID, the current effective UID or
> > > +              * the current saved set-user-ID unless an explicit whitelist
> > > +              * policy allows the transition.
> > > +              */
> > > +             if (!uid_eq(new->uid, old->uid) &&
> > > +                     !uid_eq(new->uid, old->euid) &&
> > > +                     !uid_eq(new->uid, old->suid)) {
> > > +                     return check_uid_transition(old->uid, new->uid);
> > > +             }
> > > +             if (!uid_eq(new->euid, old->uid) &&
> > > +                     !uid_eq(new->euid, old->euid) &&
> > > +                     !uid_eq(new->euid, old->suid)) {
> > > +                     return check_uid_transition(old->euid, new->euid);
> > > +             }
> > > +             if (!uid_eq(new->suid, old->uid) &&
> > > +                     !uid_eq(new->suid, old->euid) &&
> > > +                     !uid_eq(new->suid, old->suid)) {
> > > +                     return check_uid_transition(old->suid, new->suid);
> > > +             }
> > > +             break;
> > > +     case LSM_SETID_FS:
> > > +             /*
> > > +              * Users for which setuid restrictions exist cannot change the
> > > +              * filesystem UID to anything but one of: the current real UID,
> > > +              * the current effective UID or the current saved set-UID
> > > +              * unless an explicit whitelist policy allows the transition.
> > > +              */
> > > +             if (!uid_eq(new->fsuid, old->uid)  &&
> > > +                     !uid_eq(new->fsuid, old->euid)  &&
> > > +                     !uid_eq(new->fsuid, old->suid) &&
> > > +                     !uid_eq(new->fsuid, old->fsuid)) {
> > > +                     return check_uid_transition(old->fsuid, new->fsuid);
> > > +             }
> > > +             break;
> > > +     default:
> > > +             pr_warn("Unknown setid state %d\n", flags);
> > > +             force_sig(SIGKILL, current);
> > > +             return -EINVAL;
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > > +{
> > > +     struct entry *new;
> > > +
> > > +     /* Return if entry already exists */
> > > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > > +             return 0;
> > > +
> > > +     new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > > +     if (!new)
> > > +             return -ENOMEM;
> > > +     new->parent_kuid = __kuid_val(parent);
> > > +     new->child_kuid = __kuid_val(child);
> > > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > > +     hash_add_rcu(safesetid_whitelist_hashtable,
> > > +                  &new->next,
> > > +                  __kuid_val(parent));
> > > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > > +     return 0;
> > > +}
> > > +
> > > +void flush_safesetid_whitelist_entries(void)
> > > +{
> > > +     struct entry *entry;
> > > +     struct hlist_node *hlist_node;
> > > +     unsigned int bkt_loop_cursor;
> > > +     HLIST_HEAD(free_list);
> > > +
> > > +     /*
> > > +      * Could probably use hash_for_each_rcu here instead, but this should
> > > +      * be fine as well.
> > > +      */
> > > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > > +     hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > > +                        hlist_node, entry, next) {
> > > +             hash_del_rcu(&entry->next);
> > > +             hlist_add_head(&entry->dlist, &free_list);
> > > +     }
> > > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > > +     synchronize_rcu();
> > > +     hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> > > +             hlist_del(&entry->dlist);
> > > +             kfree(entry);
> > > +     }
> > > +}
> > > +
> > > +static struct security_hook_list safesetid_security_hooks[] = {
> > > +     LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > > +     LSM_HOOK_INIT(capable, safesetid_security_capable)
> > > +};
> > > +
> > > +static int __init safesetid_security_init(void)
> > > +{
> > > +     security_add_hooks(safesetid_security_hooks,
> > > +                        ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > > +
> > > +     /* Report that SafeSetID successfully initialized */
> > > +     safesetid_initialized = 1;
> > > +
> > > +     return 0;
> > > +}
> > > +
> > > +DEFINE_LSM(safesetid_security_init) = {
> > > +     .init = safesetid_security_init,
> > > +};
> > > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > > new file mode 100644
> > > index 000000000000..c1ea3c265fcf
> > > --- /dev/null
> > > +++ b/security/safesetid/lsm.h
> > > @@ -0,0 +1,33 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +/*
> > > + * SafeSetID Linux Security Module
> > > + *
> > > + * Author: Micah Morton <mortonm@chromium.org>
> > > + *
> > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2, as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + */
> > > +#ifndef _SAFESETID_H
> > > +#define _SAFESETID_H
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +/* Flag indicating whether initialization completed */
> > > +extern int safesetid_initialized;
> > > +
> > > +/* Function type. */
> > > +enum safesetid_whitelist_file_write_type {
> > > +     SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > > +     SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > > +};
> > > +
> > > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > > +
> > > +void flush_safesetid_whitelist_entries(void);
> > > +
> > > +#endif /* _SAFESETID_H */
> > > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > > new file mode 100644
> > > index 000000000000..61be4ee459cc
> > > --- /dev/null
> > > +++ b/security/safesetid/securityfs.c
> > > @@ -0,0 +1,193 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * SafeSetID Linux Security Module
> > > + *
> > > + * Author: Micah Morton <mortonm@chromium.org>
> > > + *
> > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2, as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + */
> > > +#include <linux/security.h>
> > > +#include <linux/cred.h>
> > > +
> > > +#include "lsm.h"
> > > +
> > > +static struct dentry *safesetid_policy_dir;
> > > +
> > > +struct safesetid_file_entry {
> > > +     const char *name;
> > > +     enum safesetid_whitelist_file_write_type type;
> > > +     struct dentry *dentry;
> > > +};
> > > +
> > > +static struct safesetid_file_entry safesetid_files[] = {
> > > +     {.name = "add_whitelist_policy",
> > > +      .type = SAFESETID_WHITELIST_ADD},
> > > +     {.name = "flush_whitelist_policies",
> > > +      .type = SAFESETID_WHITELIST_FLUSH},
> > > +};
> > > +
> > > +/*
> > > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > > + * variables pointed to by 'parent' and 'child' will get updated but this
> > > + * function will return an error.
> > > + */
> > > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > > +                                         size_t len,
> > > +                                         kuid_t *parent,
> > > +                                         kuid_t *child)
> > > +{
> > > +     char *kern_buf;
> > > +     char *parent_buf;
> > > +     char *child_buf;
> > > +     const char separator[] = ":";
> > > +     int ret;
> > > +     size_t first_substring_length;
> > > +     long parsed_parent;
> > > +     long parsed_child;
> > > +
> > > +     /* Duplicate string from user memory and NUL-terminate */
> > > +     kern_buf = memdup_user_nul(buf, len);
> > > +     if (IS_ERR(kern_buf))
> > > +             return PTR_ERR(kern_buf);
> > > +
> > > +     /*
> > > +      * Format of |buf| string should be <UID>:<UID>.
> > > +      * Find location of ":" in kern_buf (copied from |buf|).
> > > +      */
> > > +     first_substring_length = strcspn(kern_buf, separator);
> > > +     if (first_substring_length == 0 || first_substring_length == len) {
> > > +             ret = -EINVAL;
> > > +             goto free_kern;
> > > +     }
> > > +
> > > +     parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > > +     if (!parent_buf) {
> > > +             ret = -ENOMEM;
> > > +             goto free_kern;
> > > +     }
> > > +
> > > +     ret = kstrtol(parent_buf, 0, &parsed_parent);
> > > +     if (ret)
> > > +             goto free_both;
> > > +
> > > +     child_buf = kern_buf + first_substring_length + 1;
> > > +     ret = kstrtol(child_buf, 0, &parsed_child);
> > > +     if (ret)
> > > +             goto free_both;
> > > +
> > > +     *parent = make_kuid(current_user_ns(), parsed_parent);
> > > +     if (!uid_valid(*parent)) {
> > > +             ret = -EINVAL;
> > > +             goto free_both;
> > > +     }
> > > +
> > > +     *child = make_kuid(current_user_ns(), parsed_child);
> > > +     if (!uid_valid(*child)) {
> > > +             ret = -EINVAL;
> > > +             goto free_both;
> > > +     }
> > > +
> > > +free_both:
> > > +     kfree(parent_buf);
> > > +free_kern:
> > > +     kfree(kern_buf);
> > > +     return ret;
> > > +}
> > > +
> > > +static ssize_t safesetid_file_write(struct file *file,
> > > +                                 const char __user *buf,
> > > +                                 size_t len,
> > > +                                 loff_t *ppos)
> > > +{
> > > +     struct safesetid_file_entry *file_entry =
> > > +             file->f_inode->i_private;
> > > +     kuid_t parent;
> > > +     kuid_t child;
> > > +     int ret;
> > > +
> > > +     if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> > > +             return -EPERM;
> > > +
> > > +     if (*ppos != 0)
> > > +             return -EINVAL;
> > > +
> > > +     switch (file_entry->type) {
> > > +     case SAFESETID_WHITELIST_FLUSH:
> > > +             flush_safesetid_whitelist_entries();
> > > +             break;
> > > +     case SAFESETID_WHITELIST_ADD:
> > > +             ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > > +                                                              &child);
> > > +             if (ret)
> > > +                     return ret;
> > > +
> > > +             ret = add_safesetid_whitelist_entry(parent, child);
> > > +             if (ret)
> > > +                     return ret;
> > > +             break;
> > > +     default:
> > > +             pr_warn("Unknown securityfs file %d\n", file_entry->type);
> > > +             break;
> > > +     }
> > > +
> > > +     /* Return len on success so caller won't keep trying to write */
> > > +     return len;
> > > +}
> > > +
> > > +static const struct file_operations safesetid_file_fops = {
> > > +     .write = safesetid_file_write,
> > > +};
> > > +
> > > +static void safesetid_shutdown_securityfs(void)
> > > +{
> > > +     int i;
> > > +
> > > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > > +             struct safesetid_file_entry *entry =
> > > +                     &safesetid_files[i];
> > > +             securityfs_remove(entry->dentry);
> > > +             entry->dentry = NULL;
> > > +     }
> > > +
> > > +     securityfs_remove(safesetid_policy_dir);
> > > +     safesetid_policy_dir = NULL;
> > > +}
> > > +
> > > +static int __init safesetid_init_securityfs(void)
> > > +{
> > > +     int i;
> > > +     int ret;
> > > +
> > > +     if (!safesetid_initialized)
> > > +             return 0;
> > > +
> > > +     safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > > +     if (IS_ERR(safesetid_policy_dir)) {
> > > +             ret = PTR_ERR(safesetid_policy_dir);
> > > +             goto error;
> > > +     }
> > > +
> > > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > > +             struct safesetid_file_entry *entry =
> > > +                     &safesetid_files[i];
> > > +             entry->dentry = securityfs_create_file(
> > > +                     entry->name, 0200, safesetid_policy_dir,
> > > +                     entry, &safesetid_file_fops);
> > > +             if (IS_ERR(entry->dentry)) {
> > > +                     ret = PTR_ERR(entry->dentry);
> > > +                     goto error;
> > > +             }
> > > +     }
> > > +
> > > +     return 0;
> > > +
> > > +error:
> > > +     safesetid_shutdown_securityfs();
> > > +     return ret;
> > > +}
> > > +fs_initcall(safesetid_init_securityfs);
> 
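
For readers following the securityfs interface quoted above, the
'<UID>:<UID>' format accepted by add_whitelist_policy can be sketched in
userspace roughly as follows. This is an illustration only, not code from
the patch: parse_policy() is a hypothetical name, it accepts base 10 only
(the patch passes base 0 to kstrtol()), and it performs none of the
user-namespace mapping that make_kuid() does in the kernel.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of the patch): parse a policy
 * string of the form "<UID>:<UID>" into two numeric UIDs.
 * Returns 0 on success or -EINVAL on malformed input.
 */
static int parse_policy(const char *s, unsigned int *parent,
			unsigned int *child)
{
	const char *sep;
	char *end;
	unsigned long val;

	assert(s && parent && child);

	/* Both fields must be present and must start with a digit. */
	sep = strchr(s, ':');
	if (!sep || !isdigit((unsigned char)s[0]) ||
	    !isdigit((unsigned char)sep[1]))
		return -EINVAL;

	/* Parent UID: must run exactly up to the separator. */
	errno = 0;
	val = strtoul(s, &end, 10);
	if (errno || end != sep || val > UINT_MAX)
		return -EINVAL;
	*parent = (unsigned int)val;

	/* Child UID: must run to the end of the string. */
	errno = 0;
	val = strtoul(sep + 1, &end, 10);
	/* Tolerate one trailing newline, as written by echo(1). */
	if (*end == '\n')
		end++;
	if (errno || *end != '\0' || val > UINT_MAX)
		return -EINVAL;
	*child = (unsigned int)val;

	return 0;
}
```

With this sketch, "123:456" yields parent 123 and child 456, while
inputs such as "123", ":456" or "123:" are rejected with -EINVAL.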

-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-22 22:28                                                   ` James Morris
@ 2019-01-22 22:40                                                     ` Micah Morton
  2019-01-22 22:42                                                       ` [PATCH v3 1/2] " mortonm
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-22 22:40 UTC (permalink / raw)
  To: James Morris
  Cc: Casey Schaufler, Serge E. Hallyn, Kees Cook, Stephen Smalley,
	linux-security-module

Sorry, I mistakenly named the 1/2 patch differently from the 2/2 patch.
The 1/2 patch that goes along with this is "[PATCH v3 1/2] LSM: mark
all set*uid call sites in kernel/sys.c". I'll resend that patch as a
follow-up message in this thread in case that is easier.

On Tue, Jan 22, 2019 at 2:28 PM James Morris <jmorris@namei.org> wrote:
>
> On Tue, 22 Jan 2019, Micah Morton wrote:
>
> > This has been Acked by Kees and Casey so far. Any further comments on
> > this? If not, should be ready to merge?
>
> Did you post a 'v5 1/2' ?
>
> >
> >
> > On Wed, Jan 16, 2019 at 8:10 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > >
> > > On 1/16/2019 7:46 AM, mortonm@chromium.org wrote:
> > > > From: Micah Morton <mortonm@chromium.org>
> > > >
> > > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > > transitions from a given UID/GID to only those approved by a
> > > > system-wide whitelist. These restrictions also prohibit the given
> > > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > > mappings. For now, only gating the set*uid family of syscalls is
> > > > supported, with support for set*gid coming in a future patch set.
> > > >
> > > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > > Acked-by: Kees Cook <keescook@chromium.org>
> > >
> > > While I have some lesser reservations philosophically, all
> > > direct technical objections have been addressed.
> > >
> > > Acked-by: Casey Schaufler <casey@schaufler-ca.com>
> > >
> > > > ---
> > > > Changes since last patch:
> > > >   - added 'safesetid' to the ordered list of enabled LSMs in
> > > >     security/Kconfig.
> > > >   - added a "did I get initialized?" variable for the securityfs init to
> > > >     check and check that variable in securityfs.c to skip tree creation
> > > >     if safesetid isn't running
> > > >  Documentation/admin-guide/LSM/SafeSetID.rst | 107 ++++++++
> > > >  Documentation/admin-guide/LSM/index.rst     |   1 +
> > > >  security/Kconfig                            |   3 +-
> > > >  security/Makefile                           |   2 +
> > > >  security/safesetid/Kconfig                  |  12 +
> > > >  security/safesetid/Makefile                 |   7 +
> > > >  security/safesetid/lsm.c                    | 277 ++++++++++++++++++++
> > > >  security/safesetid/lsm.h                    |  33 +++
> > > >  security/safesetid/securityfs.c             | 193 ++++++++++++++
> > > >  9 files changed, 634 insertions(+), 1 deletion(-)
> > > >  create mode 100644 Documentation/admin-guide/LSM/SafeSetID.rst
> > > >  create mode 100644 security/safesetid/Kconfig
> > > >  create mode 100644 security/safesetid/Makefile
> > > >  create mode 100644 security/safesetid/lsm.c
> > > >  create mode 100644 security/safesetid/lsm.h
> > > >  create mode 100644 security/safesetid/securityfs.c
> > > >
> > > > diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > new file mode 100644
> > > > index 000000000000..ffb64be67f7a
> > > > --- /dev/null
> > > > +++ b/Documentation/admin-guide/LSM/SafeSetID.rst
> > > > @@ -0,0 +1,107 @@
> > > > +=========
> > > > +SafeSetID
> > > > +=========
> > > > +SafeSetID is an LSM module that gates the setid family of syscalls to restrict
> > > > +UID/GID transitions from a given UID/GID to only those approved by a
> > > > +system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
> > > > +from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
> > > > +allowing a user to set up user namespace UID mappings.
> > > > +
> > > > +
> > > > +Background
> > > > +==========
> > > > > +In the absence of file capabilities, processes spawned on a Linux system that
> > > > > +need to switch to a different user must be spawned with CAP_SETUID privileges.
> > > > +CAP_SETUID is granted to programs running as root or those running as a non-root
> > > > +user that have been explicitly given the CAP_SETUID runtime capability. It is
> > > > +often preferable to use Linux runtime capabilities rather than file
> > > > +capabilities, since using file capabilities to run a program with elevated
> > > > > +privileges opens up possible security holes: any user with access to the
> > > > +file can exec() that program to gain the elevated privileges.
> > > > +
> > > > +While it is possible to implement a tree of processes by giving full
> > > > +CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
> > > > +tree of processes under non-root user(s) in the first place. Specifically,
> > > > +since CAP_SETUID allows changing to any user on the system, including the root
> > > > +user, it is an overpowered capability for what is needed in this scenario,
> > > > +especially since programs often only call setuid() to drop privileges to a
> > > > +lesser-privileged user -- not elevate privileges. Unfortunately, there is no
> > > > +generally feasible way in Linux to restrict the potential UIDs that a user can
> > > > +switch to through setuid() beyond allowing a switch to any user on the system.
> > > > +This SafeSetID LSM seeks to provide a solution for restricting setid
> > > > +capabilities in such a way.
> > > > +
> > > > +The main use case for this LSM is to allow a non-root program to transition to
> > > > > +other untrusted uids without full-blown CAP_SETUID capabilities. The non-root
> > > > +program would still need CAP_SETUID to do any kind of transition, but the
> > > > +additional restrictions imposed by this LSM would mean it is a "safer" version
> > > > +of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
> > > > +do any unapproved actions (e.g. setuid to uid 0 or create/enter new user
> > > > +namespace). The higher level goal is to allow for uid-based sandboxing of system
> > > > +services without having to give out CAP_SETUID all over the place just so that
> > > > +non-root programs can drop to even-lesser-privileged uids. This is especially
> > > > +relevant when one non-root daemon on the system should be allowed to spawn other
> > > > > +processes as different uids, but it's undesirable to give the daemon a
> > > > +basically-root-equivalent CAP_SETUID.
> > > > +
> > > > +
> > > > +Other Approaches Considered
> > > > +===========================
> > > > +
> > > > +Solve this problem in userspace
> > > > +-------------------------------
> > > > +For candidate applications that would like to have restricted setid capabilities
> > > > +as implemented in this LSM, an alternative option would be to simply take away
> > > > +setid capabilities from the application completely and refactor the process
> > > > +spawning semantics in the application (e.g. by using a privileged helper program
> > > > +to do process spawning and UID/GID transitions). Unfortunately, there are a
> > > > +number of semantics around process spawning that would be affected by this, such
> > > > +as fork() calls where the program doesn’t immediately call exec() after the
> > > > +fork(), parent processes specifying custom environment variables or command line
> > > > +args for spawned child processes, or inheritance of file handles across a
> > > > > +fork()/exec(). Because of this, a solution that uses a privileged helper in
> > > > +userspace would likely be less appealing to incorporate into existing projects
> > > > +that rely on certain process-spawning semantics in Linux.
> > > > +
> > > > +Use user namespaces
> > > > +-------------------
> > > > +Another possible approach would be to run a given process tree in its own user
> > > > +namespace and give programs in the tree setid capabilities. In this way,
> > > > +programs in the tree could change to any desired UID/GID in the context of their
> > > > +own user namespace, and only approved UIDs/GIDs could be mapped back to the
> > > > > +initial system user namespace, effectively preventing privilege escalation.
> > > > +Unfortunately, it is not generally feasible to use user namespaces in isolation,
> > > > +without pairing them with other namespace types, which is not always an option.
> > > > +Linux checks for capabilities based off of the user namespace that “owns” some
> > > > +entity. For example, Linux has the notion that network namespaces are owned by
> > > > +the user namespace in which they were created. A consequence of this is that
> > > > +capability checks for access to a given network namespace are done by checking
> > > > +whether a task has the given capability in the context of the user namespace
> > > > +that owns the network namespace -- not necessarily the user namespace under
> > > > +which the given task runs. Therefore spawning a process in a new user namespace
> > > > +effectively prevents it from accessing the network namespace owned by the
> > > > +initial namespace. This is a deal-breaker for any application that expects to
> > > > +retain the CAP_NET_ADMIN capability for the purpose of adjusting network
> > > > +configurations. Using user namespaces in isolation causes problems regarding
> > > > +other system interactions, including use of pid namespaces and device creation.
> > > > +
> > > > +Use an existing LSM
> > > > +-------------------
> > > > +None of the other in-tree LSMs have the capability to gate setid transitions, or
> > > > +even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
> > > > +"Since setuid only affects the current process, and since the SELinux controls
> > > > +are not based on the Linux identity attributes, SELinux does not need to control
> > > > +this operation."
> > > > +
> > > > +
> > > > +Directions for use
> > > > +==================
> > > > +This LSM hooks the setid syscalls to make sure transitions are allowed if an
> > > > +applicable restriction policy is in place. Policies are configured through
> > > > +securityfs by writing to the safesetid/add_whitelist_policy and
> > > > +safesetid/flush_whitelist_policies files at the location where securityfs is
> > > > +mounted. The format for adding a policy is '<UID>:<UID>', using literal
> > > > +numbers, such as '123:456'. To flush the policies, any write to the file is
> > > > +sufficient. Again, configuring a policy for a UID will prevent that UID from
> > > > +obtaining auxiliary setid privileges, such as allowing a user to set up user
> > > > +namespace UID mappings.
> > > > diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst
> > > > index 9842e21afd4a..a6ba95fbaa9f 100644
> > > > --- a/Documentation/admin-guide/LSM/index.rst
> > > > +++ b/Documentation/admin-guide/LSM/index.rst
> > > > @@ -46,3 +46,4 @@ subdirectories.
> > > >     Smack
> > > >     tomoyo
> > > >     Yama
> > > > +   SafeSetID
> > > > diff --git a/security/Kconfig b/security/Kconfig
> > > > index 78dc12b7eeb3..9555f4914492 100644
> > > > --- a/security/Kconfig
> > > > +++ b/security/Kconfig
> > > > @@ -236,12 +236,13 @@ source "security/tomoyo/Kconfig"
> > > >  source "security/apparmor/Kconfig"
> > > >  source "security/loadpin/Kconfig"
> > > >  source "security/yama/Kconfig"
> > > > +source "security/safesetid/Kconfig"
> > > >
> > > >  source "security/integrity/Kconfig"
> > > >
> > > >  config LSM
> > > >       string "Ordered list of enabled LSMs"
> > > > -     default "yama,loadpin,integrity,selinux,smack,tomoyo,apparmor"
> > > > +     default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
> > > >       help
> > > >         A comma-separated list of LSMs, in initialization order.
> > > >         Any LSMs left off this list will be ignored. This can be
> > > > diff --git a/security/Makefile b/security/Makefile
> > > > index 4d2d3782ddef..c598b904938f 100644
> > > > --- a/security/Makefile
> > > > +++ b/security/Makefile
> > > > @@ -10,6 +10,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO)        += tomoyo
> > > >  subdir-$(CONFIG_SECURITY_APPARMOR)   += apparmor
> > > >  subdir-$(CONFIG_SECURITY_YAMA)               += yama
> > > >  subdir-$(CONFIG_SECURITY_LOADPIN)    += loadpin
> > > > +subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
> > > >
> > > >  # always enable default capabilities
> > > >  obj-y                                        += commoncap.o
> > > > @@ -25,6 +26,7 @@ obj-$(CONFIG_SECURITY_TOMOYO)               += tomoyo/
> > > >  obj-$(CONFIG_SECURITY_APPARMOR)              += apparmor/
> > > >  obj-$(CONFIG_SECURITY_YAMA)          += yama/
> > > >  obj-$(CONFIG_SECURITY_LOADPIN)               += loadpin/
> > > > +obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
> > > >  obj-$(CONFIG_CGROUP_DEVICE)          += device_cgroup.o
> > > >
> > > >  # Object integrity file lists
> > > > diff --git a/security/safesetid/Kconfig b/security/safesetid/Kconfig
> > > > new file mode 100644
> > > > index 000000000000..bf89a47ffcc8
> > > > --- /dev/null
> > > > +++ b/security/safesetid/Kconfig
> > > > @@ -0,0 +1,12 @@
> > > > +config SECURITY_SAFESETID
> > > > +        bool "Gate setid transitions to limit CAP_SET{U/G}ID capabilities"
> > > > +        default n
> > > > +        help
> > > > +          SafeSetID is an LSM module that gates the setid family of syscalls to
> > > > +          restrict UID/GID transitions from a given UID/GID to only those
> > > > +          approved by a system-wide whitelist. These restrictions also prohibit
> > > > +          the given UIDs/GIDs from obtaining auxiliary privileges associated
> > > > +          with CAP_SET{U/G}ID, such as allowing a user to set up user namespace
> > > > +          UID mappings.
> > > > +
> > > > +          If you are unsure how to answer this question, answer N.
> > > > diff --git a/security/safesetid/Makefile b/security/safesetid/Makefile
> > > > new file mode 100644
> > > > index 000000000000..6b0660321164
> > > > --- /dev/null
> > > > +++ b/security/safesetid/Makefile
> > > > @@ -0,0 +1,7 @@
> > > > +# SPDX-License-Identifier: GPL-2.0
> > > > +#
> > > > +# Makefile for the safesetid LSM.
> > > > +#
> > > > +
> > > > +obj-$(CONFIG_SECURITY_SAFESETID) := safesetid.o
> > > > +safesetid-y := lsm.o securityfs.o
> > > > diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
> > > > new file mode 100644
> > > > index 000000000000..3a2c75ac810c
> > > > --- /dev/null
> > > > +++ b/security/safesetid/lsm.c
> > > > @@ -0,0 +1,277 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * SafeSetID Linux Security Module
> > > > + *
> > > > + * Author: Micah Morton <mortonm@chromium.org>
> > > > + *
> > > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2, as
> > > > + * published by the Free Software Foundation.
> > > > + *
> > > > + */
> > > > +
> > > > +#define pr_fmt(fmt) "SafeSetID: " fmt
> > > > +
> > > > +#include <asm/syscall.h>
> > > > +#include <linux/hashtable.h>
> > > > +#include <linux/lsm_hooks.h>
> > > > +#include <linux/module.h>
> > > > +#include <linux/ptrace.h>
> > > > +#include <linux/sched/task_stack.h>
> > > > +#include <linux/security.h>
> > > > +
> > > > +/* Flag indicating whether initialization completed */
> > > > +int safesetid_initialized;
> > > > +
> > > > > +#define NUM_BITS 8 /* 256 buckets in hash table */
> > > > +
> > > > +static DEFINE_HASHTABLE(safesetid_whitelist_hashtable, NUM_BITS);
> > > > +
> > > > +/*
> > > > + * Hash table entry to store safesetid policy signifying that 'parent' user
> > > > + * can setid to 'child' user.
> > > > + */
> > > > +struct entry {
> > > > +     struct hlist_node next;
> > > > +     struct hlist_node dlist; /* for deletion cleanup */
> > > > +     uint64_t parent_kuid;
> > > > +     uint64_t child_kuid;
> > > > +};
> > > > +
> > > > +static DEFINE_SPINLOCK(safesetid_whitelist_hashtable_spinlock);
> > > > +
> > > > +static bool check_setuid_policy_hashtable_key(kuid_t parent)
> > > > +{
> > > > +     struct entry *entry;
> > > > +
> > > > +     rcu_read_lock();
> > > > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > > > +                                entry, next, __kuid_val(parent)) {
> > > > +             if (entry->parent_kuid == __kuid_val(parent)) {
> > > > +                     rcu_read_unlock();
> > > > +                     return true;
> > > > +             }
> > > > +     }
> > > > +     rcu_read_unlock();
> > > > +
> > > > +     return false;
> > > > +}
> > > > +
> > > > +static bool check_setuid_policy_hashtable_key_value(kuid_t parent,
> > > > +                                                 kuid_t child)
> > > > +{
> > > > +     struct entry *entry;
> > > > +
> > > > +     rcu_read_lock();
> > > > +     hash_for_each_possible_rcu(safesetid_whitelist_hashtable,
> > > > +                                entry, next, __kuid_val(parent)) {
> > > > +             if (entry->parent_kuid == __kuid_val(parent) &&
> > > > +                 entry->child_kuid == __kuid_val(child)) {
> > > > +                     rcu_read_unlock();
> > > > +                     return true;
> > > > +             }
> > > > +     }
> > > > +     rcu_read_unlock();
> > > > +
> > > > +     return false;
> > > > +}
> > > > +
> > > > +static int safesetid_security_capable(const struct cred *cred,
> > > > +                                   struct user_namespace *ns,
> > > > +                                   int cap,
> > > > +                                   unsigned int opts)
> > > > +{
> > > > +     if (cap == CAP_SETUID &&
> > > > +         check_setuid_policy_hashtable_key(cred->uid)) {
> > > > +             if (!(opts & CAP_OPT_INSETID)) {
> > > > +                     /*
> > > > +                      * Deny if we're not in a set*uid() syscall to avoid
> > > > +                      * giving powers gated by CAP_SETUID that are related
> > > > +                      * to functionality other than calling set*uid() (e.g.
> > > > +                      * allowing user to set up userns uid mappings).
> > > > +                      */
> > > > +                     pr_warn("Operation requires CAP_SETUID, which is not available to UID %u for operations besides approved set*uid transitions",
> > > > +                             __kuid_val(cred->uid));
> > > > +                     return -1;
> > > > +             }
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static int check_uid_transition(kuid_t parent, kuid_t child)
> > > > +{
> > > > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > > > +             return 0;
> > > > +     pr_warn("UID transition (%d -> %d) blocked",
> > > > +             __kuid_val(parent),
> > > > +             __kuid_val(child));
> > > > +     /*
> > > > +      * Kill this process to avoid potential security vulnerabilities
> > > > +      * that could arise from a missing whitelist entry preventing a
> > > > +      * privileged process from dropping to a lesser-privileged one.
> > > > +      */
> > > > +     force_sig(SIGKILL, current);
> > > > +     return -EACCES;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Check whether there is either an exception for user under old cred struct to
> > > > + * set*uid to user under new cred struct, or the UID transition is allowed (by
> > > > + * Linux set*uid rules) even without CAP_SETUID.
> > > > + */
> > > > +static int safesetid_task_fix_setuid(struct cred *new,
> > > > +                                  const struct cred *old,
> > > > +                                  int flags)
> > > > +{
> > > > +
> > > > +     /* Do nothing if there are no setuid restrictions for this UID. */
> > > > +     if (!check_setuid_policy_hashtable_key(old->uid))
> > > > +             return 0;
> > > > +
> > > > +     switch (flags) {
> > > > +     case LSM_SETID_RE:
> > > > +             /*
> > > > +              * Users for which setuid restrictions exist can only set the
> > > > +              * real UID to the real UID or the effective UID, unless an
> > > > +              * explicit whitelist policy allows the transition.
> > > > +              */
> > > > +             if (!uid_eq(old->uid, new->uid) &&
> > > > +                     !uid_eq(old->euid, new->uid)) {
> > > > +                     return check_uid_transition(old->uid, new->uid);
> > > > +             }
> > > > +             /*
> > > > +              * Users for which setuid restrictions exist can only set the
> > > > +              * effective UID to the real UID, the effective UID, or the
> > > > +              * saved set-UID, unless an explicit whitelist policy allows
> > > > +              * the transition.
> > > > +              */
> > > > +             if (!uid_eq(old->uid, new->euid) &&
> > > > +                     !uid_eq(old->euid, new->euid) &&
> > > > +                     !uid_eq(old->suid, new->euid)) {
> > > > +                     return check_uid_transition(old->euid, new->euid);
> > > > +             }
> > > > +             break;
> > > > +     case LSM_SETID_ID:
> > > > +             /*
> > > > +              * Users for which setuid restrictions exist cannot change the
> > > > +              * real UID or saved set-UID unless an explicit whitelist
> > > > +              * policy allows the transition.
> > > > +              */
> > > > +             if (!uid_eq(old->uid, new->uid))
> > > > +                     return check_uid_transition(old->uid, new->uid);
> > > > +             if (!uid_eq(old->suid, new->suid))
> > > > +                     return check_uid_transition(old->suid, new->suid);
> > > > +             break;
> > > > +     case LSM_SETID_RES:
> > > > +             /*
> > > > +              * Users for which setuid restrictions exist cannot change the
> > > > +              * real UID, effective UID, or saved set-UID to anything but
> > > > +              * one of: the current real UID, the current effective UID or
> > > > +              * the current saved set-user-ID unless an explicit whitelist
> > > > +              * policy allows the transition.
> > > > +              */
> > > > +             if (!uid_eq(new->uid, old->uid) &&
> > > > +                     !uid_eq(new->uid, old->euid) &&
> > > > +                     !uid_eq(new->uid, old->suid)) {
> > > > +                     return check_uid_transition(old->uid, new->uid);
> > > > +             }
> > > > +             if (!uid_eq(new->euid, old->uid) &&
> > > > +                     !uid_eq(new->euid, old->euid) &&
> > > > +                     !uid_eq(new->euid, old->suid)) {
> > > > +                     return check_uid_transition(old->euid, new->euid);
> > > > +             }
> > > > +             if (!uid_eq(new->suid, old->uid) &&
> > > > +                     !uid_eq(new->suid, old->euid) &&
> > > > +                     !uid_eq(new->suid, old->suid)) {
> > > > +                     return check_uid_transition(old->suid, new->suid);
> > > > +             }
> > > > +             break;
> > > > +     case LSM_SETID_FS:
> > > > +             /*
> > > > +              * Users for which setuid restrictions exist cannot change the
> > > > +              * filesystem UID to anything but one of: the current real UID,
> > > > +              * the current effective UID or the current saved set-UID
> > > > +              * unless an explicit whitelist policy allows the transition.
> > > > +              */
> > > > +             if (!uid_eq(new->fsuid, old->uid)  &&
> > > > +                     !uid_eq(new->fsuid, old->euid)  &&
> > > > +                     !uid_eq(new->fsuid, old->suid) &&
> > > > +                     !uid_eq(new->fsuid, old->fsuid)) {
> > > > +                     return check_uid_transition(old->fsuid, new->fsuid);
> > > > +             }
> > > > +             break;
> > > > +     default:
> > > > +             pr_warn("Unknown setid state %d\n", flags);
> > > > +             force_sig(SIGKILL, current);
> > > > +             return -EINVAL;
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child)
> > > > +{
> > > > +     struct entry *new;
> > > > +
> > > > +     /* Return if entry already exists */
> > > > +     if (check_setuid_policy_hashtable_key_value(parent, child))
> > > > +             return 0;
> > > > +
> > > > +     new = kzalloc(sizeof(struct entry), GFP_KERNEL);
> > > > +     if (!new)
> > > > +             return -ENOMEM;
> > > > +     new->parent_kuid = __kuid_val(parent);
> > > > +     new->child_kuid = __kuid_val(child);
> > > > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > > > +     hash_add_rcu(safesetid_whitelist_hashtable,
> > > > +                  &new->next,
> > > > +                  __kuid_val(parent));
> > > > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +void flush_safesetid_whitelist_entries(void)
> > > > +{
> > > > +     struct entry *entry;
> > > > +     struct hlist_node *hlist_node;
> > > > +     unsigned int bkt_loop_cursor;
> > > > +     HLIST_HEAD(free_list);
> > > > +
> > > > +     /*
> > > > +      * Could probably use hash_for_each_rcu here instead, but this should
> > > > +      * be fine as well.
> > > > +      */
> > > > +     spin_lock(&safesetid_whitelist_hashtable_spinlock);
> > > > +     hash_for_each_safe(safesetid_whitelist_hashtable, bkt_loop_cursor,
> > > > +                        hlist_node, entry, next) {
> > > > +             hash_del_rcu(&entry->next);
> > > > +             hlist_add_head(&entry->dlist, &free_list);
> > > > +     }
> > > > +     spin_unlock(&safesetid_whitelist_hashtable_spinlock);
> > > > +     synchronize_rcu();
> > > > +     hlist_for_each_entry_safe(entry, hlist_node, &free_list, dlist) {
> > > > +             hlist_del(&entry->dlist);
> > > > +             kfree(entry);
> > > > +     }
> > > > +}
> > > > +
> > > > +static struct security_hook_list safesetid_security_hooks[] = {
> > > > +     LSM_HOOK_INIT(task_fix_setuid, safesetid_task_fix_setuid),
> > > > +     LSM_HOOK_INIT(capable, safesetid_security_capable)
> > > > +};
> > > > +
> > > > +static int __init safesetid_security_init(void)
> > > > +{
> > > > +     security_add_hooks(safesetid_security_hooks,
> > > > +                        ARRAY_SIZE(safesetid_security_hooks), "safesetid");
> > > > +
> > > > +     /* Report that SafeSetID successfully initialized */
> > > > +     safesetid_initialized = 1;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +DEFINE_LSM(safesetid_security_init) = {
> > > > +     .init = safesetid_security_init,
> > > > +};
> > > > diff --git a/security/safesetid/lsm.h b/security/safesetid/lsm.h
> > > > new file mode 100644
> > > > index 000000000000..c1ea3c265fcf
> > > > --- /dev/null
> > > > +++ b/security/safesetid/lsm.h
> > > > @@ -0,0 +1,33 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +/*
> > > > + * SafeSetID Linux Security Module
> > > > + *
> > > > + * Author: Micah Morton <mortonm@chromium.org>
> > > > + *
> > > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2, as
> > > > + * published by the Free Software Foundation.
> > > > + *
> > > > + */
> > > > +#ifndef _SAFESETID_H
> > > > +#define _SAFESETID_H
> > > > +
> > > > +#include <linux/types.h>
> > > > +
> > > > +/* Flag indicating whether initialization completed */
> > > > +extern int safesetid_initialized;
> > > > +
> > > > +/* Function type. */
> > > > +enum safesetid_whitelist_file_write_type {
> > > > +     SAFESETID_WHITELIST_ADD, /* Add whitelist policy. */
> > > > +     SAFESETID_WHITELIST_FLUSH, /* Flush whitelist policies. */
> > > > +};
> > > > +
> > > > +/* Add entry to safesetid whitelist to allow 'parent' to setid to 'child'. */
> > > > +int add_safesetid_whitelist_entry(kuid_t parent, kuid_t child);
> > > > +
> > > > +void flush_safesetid_whitelist_entries(void);
> > > > +
> > > > +#endif /* _SAFESETID_H */
> > > > diff --git a/security/safesetid/securityfs.c b/security/safesetid/securityfs.c
> > > > new file mode 100644
> > > > index 000000000000..61be4ee459cc
> > > > --- /dev/null
> > > > +++ b/security/safesetid/securityfs.c
> > > > @@ -0,0 +1,193 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +/*
> > > > + * SafeSetID Linux Security Module
> > > > + *
> > > > + * Author: Micah Morton <mortonm@chromium.org>
> > > > + *
> > > > + * Copyright (C) 2018 The Chromium OS Authors.
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2, as
> > > > + * published by the Free Software Foundation.
> > > > + *
> > > > + */
> > > > +#include <linux/security.h>
> > > > +#include <linux/cred.h>
> > > > +
> > > > +#include "lsm.h"
> > > > +
> > > > +static struct dentry *safesetid_policy_dir;
> > > > +
> > > > +struct safesetid_file_entry {
> > > > +     const char *name;
> > > > +     enum safesetid_whitelist_file_write_type type;
> > > > +     struct dentry *dentry;
> > > > +};
> > > > +
> > > > +static struct safesetid_file_entry safesetid_files[] = {
> > > > +     {.name = "add_whitelist_policy",
> > > > +      .type = SAFESETID_WHITELIST_ADD},
> > > > +     {.name = "flush_whitelist_policies",
> > > > +      .type = SAFESETID_WHITELIST_FLUSH},
> > > > +};
> > > > +
> > > > +/*
> > > > + * In the case the input buffer contains one or more invalid UIDs, the kuid_t
> > > > + * variables pointed to by 'parent' and 'child' will get updated but this
> > > > + * function will return an error.
> > > > + */
> > > > +static int parse_safesetid_whitelist_policy(const char __user *buf,
> > > > +                                         size_t len,
> > > > +                                         kuid_t *parent,
> > > > +                                         kuid_t *child)
> > > > +{
> > > > +     char *kern_buf;
> > > > +     char *parent_buf;
> > > > +     char *child_buf;
> > > > +     const char separator[] = ":";
> > > > +     int ret;
> > > > +     size_t first_substring_length;
> > > > +     long parsed_parent;
> > > > +     long parsed_child;
> > > > +
> > > > +     /* Duplicate string from user memory and NULL-terminate */
> > > > +     kern_buf = memdup_user_nul(buf, len);
> > > > +     if (IS_ERR(kern_buf))
> > > > +             return PTR_ERR(kern_buf);
> > > > +
> > > > +     /*
> > > > +      * Format of |buf| string should be <UID>:<UID>.
> > > > +      * Find location of ":" in kern_buf (copied from |buf|).
> > > > +      */
> > > > +     first_substring_length = strcspn(kern_buf, separator);
> > > > +     if (first_substring_length == 0 || first_substring_length == len) {
> > > > +             ret = -EINVAL;
> > > > +             goto free_kern;
> > > > +     }
> > > > +
> > > > +     parent_buf = kmemdup_nul(kern_buf, first_substring_length, GFP_KERNEL);
> > > > +     if (!parent_buf) {
> > > > +             ret = -ENOMEM;
> > > > +             goto free_kern;
> > > > +     }
> > > > +
> > > > +     ret = kstrtol(parent_buf, 0, &parsed_parent);
> > > > +     if (ret)
> > > > +             goto free_both;
> > > > +
> > > > +     child_buf = kern_buf + first_substring_length + 1;
> > > > +     ret = kstrtol(child_buf, 0, &parsed_child);
> > > > +     if (ret)
> > > > +             goto free_both;
> > > > +
> > > > +     *parent = make_kuid(current_user_ns(), parsed_parent);
> > > > +     if (!uid_valid(*parent)) {
> > > > +             ret = -EINVAL;
> > > > +             goto free_both;
> > > > +     }
> > > > +
> > > > +     *child = make_kuid(current_user_ns(), parsed_child);
> > > > +     if (!uid_valid(*child)) {
> > > > +             ret = -EINVAL;
> > > > +             goto free_both;
> > > > +     }
> > > > +
> > > > +free_both:
> > > > +     kfree(parent_buf);
> > > > +free_kern:
> > > > +     kfree(kern_buf);
> > > > +     return ret;
> > > > +}
> > > > +
> > > > +static ssize_t safesetid_file_write(struct file *file,
> > > > +                                 const char __user *buf,
> > > > +                                 size_t len,
> > > > +                                 loff_t *ppos)
> > > > +{
> > > > +     struct safesetid_file_entry *file_entry =
> > > > +             file->f_inode->i_private;
> > > > +     kuid_t parent;
> > > > +     kuid_t child;
> > > > +     int ret;
> > > > +
> > > > +     if (!ns_capable(current_user_ns(), CAP_MAC_ADMIN))
> > > > +             return -EPERM;
> > > > +
> > > > +     if (*ppos != 0)
> > > > +             return -EINVAL;
> > > > +
> > > > +     switch (file_entry->type) {
> > > > +     case SAFESETID_WHITELIST_FLUSH:
> > > > +             flush_safesetid_whitelist_entries();
> > > > +             break;
> > > > +     case SAFESETID_WHITELIST_ADD:
> > > > +             ret = parse_safesetid_whitelist_policy(buf, len, &parent,
> > > > +                                                              &child);
> > > > +             if (ret)
> > > > +                     return ret;
> > > > +
> > > > +             ret = add_safesetid_whitelist_entry(parent, child);
> > > > +             if (ret)
> > > > +                     return ret;
> > > > +             break;
> > > > +     default:
> > > > +             pr_warn("Unknown securityfs file %d\n", file_entry->type);
> > > > +             break;
> > > > +     }
> > > > +
> > > > +     /* Return len on success so caller won't keep trying to write */
> > > > +     return len;
> > > > +}
> > > > +
> > > > +static const struct file_operations safesetid_file_fops = {
> > > > +     .write = safesetid_file_write,
> > > > +};
> > > > +
> > > > +static void safesetid_shutdown_securityfs(void)
> > > > +{
> > > > +     int i;
> > > > +
> > > > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > > > +             struct safesetid_file_entry *entry =
> > > > +                     &safesetid_files[i];
> > > > +             securityfs_remove(entry->dentry);
> > > > +             entry->dentry = NULL;
> > > > +     }
> > > > +
> > > > +     securityfs_remove(safesetid_policy_dir);
> > > > +     safesetid_policy_dir = NULL;
> > > > +}
> > > > +
> > > > +static int __init safesetid_init_securityfs(void)
> > > > +{
> > > > +     int i;
> > > > +     int ret;
> > > > +
> > > > +     if (!safesetid_initialized)
> > > > +             return 0;
> > > > +
> > > > +     safesetid_policy_dir = securityfs_create_dir("safesetid", NULL);
> > > > +     if (IS_ERR(safesetid_policy_dir)) {
> > > > +             ret = PTR_ERR(safesetid_policy_dir);
> > > > +             goto error;
> > > > +     }
> > > > +
> > > > +     for (i = 0; i < ARRAY_SIZE(safesetid_files); ++i) {
> > > > +             struct safesetid_file_entry *entry =
> > > > +                     &safesetid_files[i];
> > > > +             entry->dentry = securityfs_create_file(
> > > > +                     entry->name, 0200, safesetid_policy_dir,
> > > > +                     entry, &safesetid_file_fops);
> > > > +             if (IS_ERR(entry->dentry)) {
> > > > +                     ret = PTR_ERR(entry->dentry);
> > > > +                     goto error;
> > > > +             }
> > > > +     }
> > > > +
> > > > +     return 0;
> > > > +
> > > > +error:
> > > > +     safesetid_shutdown_securityfs();
> > > > +     return ret;
> > > > +}
> > > > +fs_initcall(safesetid_init_securityfs);
> >
>
> --
> James Morris
> <jmorris@namei.org>
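
As a rough userspace illustration of the "<UID>:<UID>" policy format and
the whitelist lookup quoted above (this is not the kernel code): the
sketch below parses one policy line and checks a transition against a
flat array. The names uid_pair, parse_uid, parse_policy and
transition_allowed are made up for this example, and the array only
stands in for the RCU hashtable used by the patch. Unlike the quoted
parser, this version writes its output only on success.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for one whitelist entry. */
struct uid_pair {
	unsigned int parent;
	unsigned int child;
};

/* Parse one UID; rejects empty input, signs, trailing garbage and (uid_t)-1. */
static int parse_uid(const char *s, unsigned int *out)
{
	char *end;
	unsigned long v;

	if (!isdigit((unsigned char)*s))
		return -EINVAL;
	errno = 0;
	v = strtoul(s, &end, 0);
	if (errno != 0)
		return -EINVAL;
	if (*end == '\n')	/* tolerate the newline that echo appends */
		end++;
	if (*end != '\0' || v >= 0xffffffffUL)
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}

/* Parse "<parent UID>:<child UID>"; returns 0 on success, -EINVAL otherwise. */
static int parse_policy(const char *line, struct uid_pair *pair)
{
	char buf[64];
	char *sep;
	struct uid_pair tmp;
	size_t len = strlen(line);

	if (len >= sizeof(buf))
		return -EINVAL;
	memcpy(buf, line, len + 1);

	sep = strchr(buf, ':');
	if (!sep)
		return -EINVAL;
	*sep = '\0';

	if (parse_uid(buf, &tmp.parent) || parse_uid(sep + 1, &tmp.child))
		return -EINVAL;
	*pair = tmp;	/* caller's struct is only touched on success */
	return 0;
}

/* Return true if the whitelist contains the exact parent -> child pair. */
static bool transition_allowed(const struct uid_pair *wl, size_t n,
			       unsigned int parent, unsigned int child)
{
	for (size_t i = 0; i < n; i++)
		if (wl[i].parent == parent && wl[i].child == child)
			return true;
	return false;
}
```

A caller would feed each written line to parse_policy(), append the
resulting pair to its table, and consult transition_allowed() only for
UIDs that appear as a parent somewhere in that table.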

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH v3 1/2] LSM: add SafeSetID module that gates setid calls
  2019-01-22 22:40                                                     ` Micah Morton
@ 2019-01-22 22:42                                                       ` mortonm
  2019-01-25 15:51                                                         ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: mortonm @ 2019-01-22 22:42 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This change ensures that the set*uid family of syscalls in kernel/sys.c
(setreuid, setuid, setresuid, setfsuid) all call ns_capable_common with
the CAP_OPT_INSETID flag, so capability checks in the security_capable
hook can know whether they are being called from within a set*uid
syscall. This change is a no-op by itself, but is needed for the
proposed SafeSetID LSM.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
These changes used to be part of the main SafeSetID LSM patch set.

 include/linux/capability.h |  5 +++++
 kernel/capability.c        | 19 +++++++++++++++++++
 kernel/sys.c               | 10 +++++-----
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index f640dcbc880c..c3f9a4d558a0 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -209,6 +209,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
 extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
+extern bool ns_capable_setid(struct user_namespace *ns, int cap);
 #else
 static inline bool has_capability(struct task_struct *t, int cap)
 {
@@ -240,6 +241,10 @@ static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 {
 	return true;
 }
+static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
+{
+	return true;
+}
 #endif /* CONFIG_MULTIUSER */
 extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
diff --git a/kernel/capability.c b/kernel/capability.c
index 7718d7dcadc7..e0734ace5bc2 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -417,6 +417,25 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
 }
 EXPORT_SYMBOL(ns_capable_noaudit);
 
+/**
+ * ns_capable_setid - Determine if the current task has a superior capability
+ * in effect, while signalling that this check is being done from within a
+ * setid syscall.
+ * @ns:  The usernamespace we want the capability in
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool ns_capable_setid(struct user_namespace *ns, int cap)
+{
+	return ns_capable_common(ns, cap, CAP_OPT_INSETID);
+}
+EXPORT_SYMBOL(ns_capable_setid);
+
 /**
  * capable - Determine if the current task has a superior capability in effect
  * @cap: The capability to be tested for
diff --git a/kernel/sys.c b/kernel/sys.c
index a48cbf1414b8..a98061c1a124 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -516,7 +516,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		new->uid = kruid;
 		if (!uid_eq(old->uid, kruid) &&
 		    !uid_eq(old->euid, kruid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_setid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -525,7 +525,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
 		if (!uid_eq(old->uid, keuid) &&
 		    !uid_eq(old->euid, keuid) &&
 		    !uid_eq(old->suid, keuid) &&
-		    !ns_capable(old->user_ns, CAP_SETUID))
+		    !ns_capable_setid(old->user_ns, CAP_SETUID))
 			goto error;
 	}
 
@@ -584,7 +584,7 @@ long __sys_setuid(uid_t uid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (ns_capable(old->user_ns, CAP_SETUID)) {
+	if (ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		new->suid = new->uid = kuid;
 		if (!uid_eq(kuid, old->uid)) {
 			retval = set_user(new);
@@ -646,7 +646,7 @@ long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
 	old = current_cred();
 
 	retval = -EPERM;
-	if (!ns_capable(old->user_ns, CAP_SETUID)) {
+	if (!ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		if (ruid != (uid_t) -1        && !uid_eq(kruid, old->uid) &&
 		    !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid))
 			goto error;
@@ -814,7 +814,7 @@ long __sys_setfsuid(uid_t uid)
 
 	if (uid_eq(kuid, old->uid)  || uid_eq(kuid, old->euid)  ||
 	    uid_eq(kuid, old->suid) || uid_eq(kuid, old->fsuid) ||
-	    ns_capable(old->user_ns, CAP_SETUID)) {
+	    ns_capable_setid(old->user_ns, CAP_SETUID)) {
 		if (!uid_eq(kuid, old->fsuid)) {
 			new->fsuid = kuid;
 			if (security_task_fix_setuid(new, old, LSM_SETID_FS) == 0)
-- 
2.20.1.97.g81188d93c3-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [PATCH v3 1/2] LSM: add SafeSetID module that gates setid calls
  2019-01-22 22:42                                                       ` [PATCH v3 1/2] " mortonm
@ 2019-01-25 15:51                                                         ` Micah Morton
  0 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-25 15:51 UTC (permalink / raw)
  To: James Morris, Serge E. Hallyn, Kees Cook, Casey Schaufler,
	Stephen Smalley, linux-security-module

Patch 1 of 2 was also "Reviewed-by: Kees Cook <keescook@chromium.org>";
I forgot to add that to the commit message above.
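
For readers following along, here is a loose userspace model of how a
capability hook might use the flag that this patch plumbs through.
Everything prefixed demo_/DEMO_ is hypothetical: the constants are
illustrative stand-ins for the kernel definitions, and the table of
restricted UIDs is invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real definitions live in kernel headers. */
#define DEMO_CAP_SETUID		7
#define DEMO_OPT_INSETID	(1u << 2)

/* Hypothetical set of UIDs that have setuid restrictions configured. */
static const unsigned int demo_restricted_uids[] = { 1000, 1001 };

static bool demo_uid_is_restricted(unsigned int uid)
{
	size_t n = sizeof(demo_restricted_uids) / sizeof(demo_restricted_uids[0]);

	for (size_t i = 0; i < n; i++)
		if (demo_restricted_uids[i] == uid)
			return true;
	return false;
}

/*
 * Model of a capability hook: returns 0 to allow, -1 to deny.
 * A restricted UID may use CAP_SETUID only when the check comes from a
 * set*uid syscall; the transition itself is judged by a later hook.
 */
static int demo_capable(unsigned int uid, int cap, unsigned int opts)
{
	if (cap != DEMO_CAP_SETUID)
		return 0;	/* other capabilities are not our concern */
	if (!demo_uid_is_restricted(uid))
		return 0;	/* no policy for this UID */
	if (opts & DEMO_OPT_INSETID)
		return 0;	/* inside set*uid: defer to the transition check */
	return -1;		/* e.g. writing user namespace UID mappings */
}
```

The point of the flag is the last two branches: without it the hook
could not tell a setuid() call apart from any other use of CAP_SETUID.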

On Tue, Jan 22, 2019 at 2:42 PM <mortonm@chromium.org> wrote:
>
> From: Micah Morton <mortonm@chromium.org>
>
> This change ensures that the set*uid family of syscalls in kernel/sys.c
> (setreuid, setuid, setresuid, setfsuid) all call ns_capable_common with
> the CAP_OPT_INSETID flag, so capability checks in the security_capable
> hook can know whether they are being called from within a set*uid
> syscall. This change is a no-op by itself, but is needed for the
> proposed SafeSetID LSM.
>
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> ---
> These changes used to be part of the main SafeSetID LSM patch set.
>
>  include/linux/capability.h |  5 +++++
>  kernel/capability.c        | 19 +++++++++++++++++++
>  kernel/sys.c               | 10 +++++-----
>  3 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index f640dcbc880c..c3f9a4d558a0 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -209,6 +209,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
>  extern bool capable(int cap);
>  extern bool ns_capable(struct user_namespace *ns, int cap);
>  extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
> +extern bool ns_capable_setid(struct user_namespace *ns, int cap);
>  #else
>  static inline bool has_capability(struct task_struct *t, int cap)
>  {
> @@ -240,6 +241,10 @@ static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  {
>         return true;
>  }
> +static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
> +{
> +       return true;
> +}
>  #endif /* CONFIG_MULTIUSER */
>  extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
>  extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
> diff --git a/kernel/capability.c b/kernel/capability.c
> index 7718d7dcadc7..e0734ace5bc2 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -417,6 +417,25 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
>  }
>  EXPORT_SYMBOL(ns_capable_noaudit);
>
> +/**
> + * ns_capable_setid - Determine if the current task has a superior capability
> + * in effect, while signalling that this check is being done from within a
> + * setid syscall.
> + * @ns:  The usernamespace we want the capability in
> + * @cap: The capability to be tested for
> + *
> + * Return true if the current task has the given superior capability currently
> + * available for use, false if not.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool ns_capable_setid(struct user_namespace *ns, int cap)
> +{
> +       return ns_capable_common(ns, cap, CAP_OPT_INSETID);
> +}
> +EXPORT_SYMBOL(ns_capable_setid);
> +
>  /**
>   * capable - Determine if the current task has a superior capability in effect
>   * @cap: The capability to be tested for
> diff --git a/kernel/sys.c b/kernel/sys.c
> index a48cbf1414b8..a98061c1a124 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -516,7 +516,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
>                 new->uid = kruid;
>                 if (!uid_eq(old->uid, kruid) &&
>                     !uid_eq(old->euid, kruid) &&
> -                   !ns_capable(old->user_ns, CAP_SETUID))
> +                   !ns_capable_setid(old->user_ns, CAP_SETUID))
>                         goto error;
>         }
>
> @@ -525,7 +525,7 @@ long __sys_setreuid(uid_t ruid, uid_t euid)
>                 if (!uid_eq(old->uid, keuid) &&
>                     !uid_eq(old->euid, keuid) &&
>                     !uid_eq(old->suid, keuid) &&
> -                   !ns_capable(old->user_ns, CAP_SETUID))
> +                   !ns_capable_setid(old->user_ns, CAP_SETUID))
>                         goto error;
>         }
>
> @@ -584,7 +584,7 @@ long __sys_setuid(uid_t uid)
>         old = current_cred();
>
>         retval = -EPERM;
> -       if (ns_capable(old->user_ns, CAP_SETUID)) {
> +       if (ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 new->suid = new->uid = kuid;
>                 if (!uid_eq(kuid, old->uid)) {
>                         retval = set_user(new);
> @@ -646,7 +646,7 @@ long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
>         old = current_cred();
>
>         retval = -EPERM;
> -       if (!ns_capable(old->user_ns, CAP_SETUID)) {
> +       if (!ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 if (ruid != (uid_t) -1        && !uid_eq(kruid, old->uid) &&
>                     !uid_eq(kruid, old->euid) && !uid_eq(kruid, old->suid))
>                         goto error;
> @@ -814,7 +814,7 @@ long __sys_setfsuid(uid_t uid)
>
>         if (uid_eq(kuid, old->uid)  || uid_eq(kuid, old->euid)  ||
>             uid_eq(kuid, old->suid) || uid_eq(kuid, old->fsuid) ||
> -           ns_capable(old->user_ns, CAP_SETUID)) {
> +           ns_capable_setid(old->user_ns, CAP_SETUID)) {
>                 if (!uid_eq(kuid, old->fsuid)) {
>                         new->fsuid = kuid;
>                         if (security_task_fix_setuid(new, old, LSM_SETID_FS) == 0)
> --
> 2.20.1.97.g81188d93c3-goog
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-16 15:46                                             ` [PATCH v5 " mortonm
  2019-01-16 16:10                                               ` Casey Schaufler
@ 2019-01-25 20:15                                               ` James Morris
  2019-01-25 21:06                                                 ` Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: James Morris @ 2019-01-25 20:15 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, keescook, casey, sds, linux-security-module

On Wed, 16 Jan 2019, mortonm@chromium.org wrote:

> From: Micah Morton <mortonm@chromium.org>
> 
> SafeSetID gates the setid family of syscalls to restrict UID/GID
> transitions from a given UID/GID to only those approved by a
> system-wide whitelist. These restrictions also prohibit the given
> UIDs/GIDs from obtaining auxiliary privileges associated with
> CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> mappings. For now, only gating the set*uid family of syscalls is
> supported, with support for set*gid coming in a future patch set.
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>
> Acked-by: Kees Cook <keescook@chromium.org>

Both applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general


-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-25 20:15                                               ` [PATCH v5 2/2] " James Morris
@ 2019-01-25 21:06                                                 ` Micah Morton
  2019-01-28 19:47                                                   ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-25 21:06 UTC (permalink / raw)
  To: James Morris
  Cc: Serge E. Hallyn, Kees Cook, Casey Schaufler, Stephen Smalley,
	linux-security-module

Thanks!

On Fri, Jan 25, 2019 at 12:15 PM James Morris <jmorris@namei.org> wrote:
>
> On Wed, 16 Jan 2019, mortonm@chromium.org wrote:
>
> > From: Micah Morton <mortonm@chromium.org>
> >
> > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > transitions from a given UID/GID to only those approved by a
> > system-wide whitelist. These restrictions also prohibit the given
> > UIDs/GIDs from obtaining auxiliary privileges associated with
> > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > mappings. For now, only gating the set*uid family of syscalls is
> > supported, with support for set*gid coming in a future patch set.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > Acked-by: Kees Cook <keescook@chromium.org>
>
> Both applied to
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
>
>
> --
> James Morris
> <jmorris@namei.org>
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-25 21:06                                                 ` Micah Morton
@ 2019-01-28 19:47                                                   ` Micah Morton
  2019-01-28 19:56                                                     ` Kees Cook
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-28 19:47 UTC (permalink / raw)
  To: James Morris
  Cc: Serge E. Hallyn, Kees Cook, Casey Schaufler, Stephen Smalley,
	linux-security-module

I'm getting the following crash at boot with a kernel compiled with this
LSM enabled, so I'll have to figure out what is going on. All the "core"
functionality of this LSM has been tested thoroughly (we're already
using this LSM on ChromeOS), but it looks like the initialization still
needs some debugging.

[    0.174285] LSM: Security Framework initializing
[    0.175277] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[    0.176272] #PF error: [normal kernel read fault]
[    0.176272] PGD 0 P4D 0
[    0.176272] Oops: 0000 [#1] SMP PTI
[    0.176272] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc3+ #5
[    0.176272] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[    0.176272] RIP: 0010:strcmp+0x4/0x20
[    0.176272] Code: 09 48 83 c2 01 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed f3 c3 0f 1f 80 00 00 00 00 48 83 c7 01 <0f> b6 47 ff 48 83 c6 01 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0
[    0.176272] RSP: 0000:ffffffff88a03eb0 EFLAGS: 00010202
[    0.176272] RAX: 00000000ffffffff RBX: ffffffff89079bb0 RCX: 0000000000000000
[    0.176272] RDX: ffffa3f087411ec5 RSI: ffffa3f087411ec0 RDI: 0000000000000001
[    0.176272] RBP: ffffffff88815d93 R08: 000000000000002c R09: ffffa3f087411ec4
[    0.176272] R10: 000000000000002c R11: 00726f6d72617070 R12: ffffa3f087411ec0
[    0.176272] R13: ffffa3f087411ec0 R14: 0000000000000000 R15: 0000000000000000
[    0.176272] FS:  0000000000000000(0000) GS:ffffa3f087800000(0000) knlGS:0000000000000000
[    0.176272] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.176272] CR2: 0000000000000000 CR3: 0000000005c0e000 CR4: 00000000000006b0
[    0.176272] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.176272] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.176272] Call Trace:
[    0.176272]  ordered_lsm_parse+0x112/0x20b
[    0.176272]  security_init+0x9b/0x3ab
[    0.176272]  start_kernel+0x413/0x479
[    0.176272]  secondary_startup_64+0xa4/0xb0
[    0.176272] Modules linked in:
[    0.176272] CR2: 0000000000000000
[    0.176272] ---[ end trace f2a8342a377681d5 ]---
[    0.176272] RIP: 0010:strcmp+0x4/0x20
[    0.176272] Code: 09 48 83 c2 01 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed f3 c3 0f 1f 80 00 00 00 00 48 83 c7 01 <0f> b6 47 ff 48 83 c6 01 3a 46 ff 75 07 84 c0 75 eb 31 c0 c3 19 c0
[    0.176272] RSP: 0000:ffffffff88a03eb0 EFLAGS: 00010202
[    0.176272] RAX: 00000000ffffffff RBX: ffffffff89079bb0 RCX: 0000000000000000
[    0.176272] RDX: ffffa3f087411ec5 RSI: ffffa3f087411ec0 RDI: 0000000000000001
[    0.176272] RBP: ffffffff88815d93 R08: 000000000000002c R09: ffffa3f087411ec4
[    0.176272] R10: 000000000000002c R11: 00726f6d72617070 R12: ffffa3f087411ec0
[    0.176272] R13: ffffa3f087411ec0 R14: 0000000000000000 R15: 0000000000000000
[    0.176272] FS:  0000000000000000(0000) GS:ffffa3f087800000(0000) knlGS:0000000000000000
[    0.176272] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.176272] CR2: 0000000000000000 CR3: 0000000005c0e000 CR4: 00000000000006b0
[    0.176272] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.176272] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.176272] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.176272] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

On Fri, Jan 25, 2019 at 1:06 PM Micah Morton <mortonm@chromium.org> wrote:
>
> Thanks!
>
> On Fri, Jan 25, 2019 at 12:15 PM James Morris <jmorris@namei.org> wrote:
> >
> > On Wed, 16 Jan 2019, mortonm@chromium.org wrote:
> >
> > > From: Micah Morton <mortonm@chromium.org>
> > >
> > > SafeSetID gates the setid family of syscalls to restrict UID/GID
> > > transitions from a given UID/GID to only those approved by a
> > > system-wide whitelist. These restrictions also prohibit the given
> > > UIDs/GIDs from obtaining auxiliary privileges associated with
> > > CAP_SET{U/G}ID, such as allowing a user to set up user namespace UID
> > > mappings. For now, only gating the set*uid family of syscalls is
> > > supported, with support for set*gid coming in a future patch set.
> > >
> > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > Acked-by: Kees Cook <keescook@chromium.org>
> >
> > Both applied to
> > git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general
> >
> >
> > --
> > James Morris
> > <jmorris@namei.org>
> >

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-28 19:47                                                   ` Micah Morton
@ 2019-01-28 19:56                                                     ` Kees Cook
  2019-01-28 20:09                                                       ` James Morris
  2019-01-28 20:19                                                       ` Micah Morton
  0 siblings, 2 replies; 88+ messages in thread
From: Kees Cook @ 2019-01-28 19:56 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
>
> I'm getting the following crash when booting after compiling a kernel
> with this LSM enabled, so I'll have to figure out what is going on.
> All the "core" functionality of this LSM has been tested thoroughly
> (we're already using this LSM on ChromeOS), but looks like there's
> some debugging of the initialization that still needs to be done.


+DEFINE_LSM(safesetid_security_init) = {
+       .init = safesetid_security_init,
+};

I think this is from not having:

.name = "safesetid",

I missed that in the review, sorry!

-- 
Kees Cook
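
A userspace sketch of why the missing name matters, assuming the LSM
ordering code compares each registered entry's name with strcmp() (which
the strcmp and ordered_lsm_parse frames in the trace suggest). The names
lsm_info_sketch, dummy_init() and find_lsm() are hypothetical stand-ins,
not kernel code, and the NULL check exists only so the sketch can report
the problem instead of faulting:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's LSM descriptor; two fields only. */
struct lsm_info_sketch {
	const char *name;
	int (*init)(void);
};

/* Placeholder init hook so the entries below have something to point at. */
int dummy_init(void)
{
	return 0;
}

/*
 * Return the index of the entry called `wanted`, -1 if it is absent, or -2
 * if an entry with a NULL name is reached first. Without the NULL check,
 * strcmp() would dereference the NULL name, as in the oops above.
 */
int find_lsm(const struct lsm_info_sketch *lsms, int count, const char *wanted)
{
	int i;

	for (i = 0; i < count; i++) {
		if (lsms[i].name == NULL)
			return -2;
		if (strcmp(lsms[i].name, wanted) == 0)
			return i;
	}
	return -1;
}
```

A designated initializer leaves every member it does not mention zeroed,
so an entry written as { .init = dummy_init } has a NULL name and makes
find_lsm() return -2, while { .init = dummy_init, .name = "safesetid" }
is found at index 0.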

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-28 19:56                                                     ` Kees Cook
@ 2019-01-28 20:09                                                       ` James Morris
  2019-01-28 20:19                                                       ` Micah Morton
  1 sibling, 0 replies; 88+ messages in thread
From: James Morris @ 2019-01-28 20:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Micah Morton, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Tue, 29 Jan 2019, Kees Cook wrote:

> On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
> >
> > I'm getting the following crash when booting after compiling a kernel
> > with this LSM enabled, so I'll have to figure out what is going on.
> > All the "core" functionality of this LSM has been tested thoroughly
> > (we're already using this LSM on ChromeOS), but looks like there's
> > some debugging of the initialization that still needs to be done.
> 
> 
> +DEFINE_LSM(safesetid_security_init) = {
> +       .init = safesetid_security_init,
> +};
> 
> I think this is from not having:
> 
> .name = "safesetid",
> 
> I missed that in the review, sorry!

Weird, I booted my system with safesetid stacked and it seemed to work.

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-28 19:56                                                     ` Kees Cook
  2019-01-28 20:09                                                       ` James Morris
@ 2019-01-28 20:19                                                       ` Micah Morton
  2019-01-28 20:30                                                         ` [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM mortonm
  2019-01-28 22:33                                                         ` [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
  1 sibling, 2 replies; 88+ messages in thread
From: Micah Morton @ 2019-01-28 20:19 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Mon, Jan 28, 2019 at 11:56 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
> >
> > I'm getting the following crash when booting after compiling a kernel
> > with this LSM enabled, so I'll have to figure out what is going on.
> > All the "core" functionality of this LSM has been tested thoroughly
> > (we're already using this LSM on ChromeOS), but looks like there's
> > some debugging of the initialization that still needs to be done.
>
>
> +DEFINE_LSM(safesetid_security_init) = {
> +       .init = safesetid_security_init,
> +};
>
> I think this is from not having:
>
> .name = "safesetid",

That fixed it for me! Thanks

>
> I missed that in the review, sorry!
>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM
  2019-01-28 20:19                                                       ` Micah Morton
@ 2019-01-28 20:30                                                         ` mortonm
  2019-01-28 22:12                                                           ` James Morris
  2019-01-28 22:33                                                         ` [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
  1 sibling, 1 reply; 88+ messages in thread
From: mortonm @ 2019-01-28 20:30 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

Without this, system boot was crashing with:

[0.174285] LSM: Security Framework initializing
[0.175277] BUG: unable to handle kernel NULL pointer dereference
...
[0.176272] Call Trace:
[0.176272]  ordered_lsm_parse+0x112/0x20b
[0.176272]  security_init+0x9b/0x3ab
[0.176272]  start_kernel+0x413/0x479
[0.176272]  secondary_startup_64+0xa4/0xb0

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
 security/safesetid/lsm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/safesetid/lsm.c b/security/safesetid/lsm.c
index 3a2c75ac810c..282a242beb86 100644
--- a/security/safesetid/lsm.c
+++ b/security/safesetid/lsm.c
@@ -274,4 +274,5 @@ static int __init safesetid_security_init(void)
 
 DEFINE_LSM(safesetid_security_init) = {
 	.init = safesetid_security_init,
+	.name = "safesetid",
 };
-- 
2.20.1.495.gaa96b0ce6b-goog


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM
  2019-01-28 20:30                                                         ` [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM mortonm
@ 2019-01-28 22:12                                                           ` James Morris
  0 siblings, 0 replies; 88+ messages in thread
From: James Morris @ 2019-01-28 22:12 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, keescook, casey, sds, linux-security-module

On Mon, 28 Jan 2019, mortonm@chromium.org wrote:

> From: Micah Morton <mortonm@chromium.org>
> 
> Without this, system boot was crashing with:
> 
> [0.174285] LSM: Security Framework initializing
> [0.175277] BUG: unable to handle kernel NULL pointer dereference
> ...
> [0.176272] Call Trace:
> [0.176272]  ordered_lsm_parse+0x112/0x20b
> [0.176272]  security_init+0x9b/0x3ab
> [0.176272]  start_kernel+0x413/0x479
> [0.176272]  secondary_startup_64+0xa4/0xb0
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general


-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-28 20:19                                                       ` Micah Morton
  2019-01-28 20:30                                                         ` [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM mortonm
@ 2019-01-28 22:33                                                         ` Micah Morton
  2019-01-29 17:25                                                           ` James Morris
  1 sibling, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-28 22:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

FWIW, I've now done a manual test of this LSM's functionality on a
Linux VM built from the next-general branch. Adding policies, policy
enforcement by the LSM, and flushing policies all worked as intended.

So there hopefully won't be any more surprises.

On Mon, Jan 28, 2019 at 12:19 PM Micah Morton <mortonm@chromium.org> wrote:
>
> On Mon, Jan 28, 2019 at 11:56 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
> > >
> > > I'm getting the following crash when booting after compiling a kernel
> > > with this LSM enabled, so I'll have to figure out what is going on.
> > > All the "core" functionality of this LSM has been tested thoroughly
> > > (we're already using this LSM on ChromeOS), but looks like there's
> > > some debugging of the initialization that still needs to be done.
> >
> >
> > +DEFINE_LSM(safesetid_security_init) = {
> > +       .init = safesetid_security_init,
> > +};
> >
> > I think this is from not having:
> >
> > .name = "safesetid",
>
> That fixed it for me! Thanks
>
> >
> > I missed that in the review, sorry!
> >
> > --
> > Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-28 22:33                                                         ` [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
@ 2019-01-29 17:25                                                           ` James Morris
  2019-01-29 21:14                                                             ` Micah Morton
  0 siblings, 1 reply; 88+ messages in thread
From: James Morris @ 2019-01-29 17:25 UTC (permalink / raw)
  To: Micah Morton
  Cc: Kees Cook, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Mon, 28 Jan 2019, Micah Morton wrote:

> FWIW, I've now done a manual test of this LSMs functionality on a
> Linux VM built from the next-general branch. Adding policies, policy
> enforcement by the LSM, and flushing policies all worked as intended.
> 
> So there hopefully won't be any more surprises.

It would be useful to publish these as a testsuite, or include a test 
script in the kernel tree.


> 
> On Mon, Jan 28, 2019 at 12:19 PM Micah Morton <mortonm@chromium.org> wrote:
> >
> > On Mon, Jan 28, 2019 at 11:56 AM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
> > > >
> > > > I'm getting the following crash when booting after compiling a kernel
> > > > with this LSM enabled, so I'll have to figure out what is going on.
> > > > All the "core" functionality of this LSM has been tested thoroughly
> > > > (we're already using this LSM on ChromeOS), but looks like there's
> > > > some debugging of the initialization that still needs to be done.
> > >
> > >
> > > +DEFINE_LSM(safesetid_security_init) = {
> > > +       .init = safesetid_security_init,
> > > +};
> > >
> > > I think this is from not having:
> > >
> > > .name = "safesetid",
> >
> > That fixed it for me! Thanks
> >
> > >
> > > I missed that in the review, sorry!
> > >
> > > --
> > > Kees Cook
> 

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-29 17:25                                                           ` James Morris
@ 2019-01-29 21:14                                                             ` Micah Morton
  2019-01-30  7:15                                                               ` Kees Cook
  0 siblings, 1 reply; 88+ messages in thread
From: Micah Morton @ 2019-01-29 21:14 UTC (permalink / raw)
  To: James Morris
  Cc: Kees Cook, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

testsuite meaning Linux Test Project / Autotest? We have a ChromeOS
Autotest for this already
(https://chromium.googlesource.com/chromiumos/third_party/autotest/+/master/client/site_tests/security_ProcessManagementPolicy/security_ProcessManagementPolicy.py)
but it would at least need some adaptation for configuring/flushing
policies during the test. Not sure how different Linux Autotests are
from ChromeOS, if they are used at all.

Also, could you point me at the directory that holds such test scripts
in the kernel tree? Shouldn't be too difficult to port that ChromeOS
autotest to a script if we want to go that route.

On Tue, Jan 29, 2019 at 9:25 AM James Morris <jmorris@namei.org> wrote:
>
> On Mon, 28 Jan 2019, Micah Morton wrote:
>
> > FWIW, I've now done a manual test of this LSMs functionality on a
> > Linux VM built from the next-general branch. Adding policies, policy
> > enforcement by the LSM, and flushing policies all worked as intended.
> >
> > So there hopefully won't be any more surprises.
>
> It would be useful to publish these as a testsuite, or include a test
> script in the kernel tree.
>
>
> >
> > On Mon, Jan 28, 2019 at 12:19 PM Micah Morton <mortonm@chromium.org> wrote:
> > >
> > > On Mon, Jan 28, 2019 at 11:56 AM Kees Cook <keescook@chromium.org> wrote:
> > > >
> > > > On Tue, Jan 29, 2019 at 8:47 AM Micah Morton <mortonm@chromium.org> wrote:
> > > > >
> > > > > I'm getting the following crash when booting after compiling a kernel
> > > > > with this LSM enabled, so I'll have to figure out what is going on.
> > > > > All the "core" functionality of this LSM has been tested thoroughly
> > > > > (we're already using this LSM on ChromeOS), but looks like there's
> > > > > some debugging of the initialization that still needs to be done.
> > > >
> > > >
> > > > +DEFINE_LSM(safesetid_security_init) = {
> > > > +       .init = safesetid_security_init,
> > > > +};
> > > >
> > > > I think this is from not having:
> > > >
> > > > .name = "safesetid",
> > >
> > > That fixed it for me! Thanks
> > >
> > > >
> > > > I missed that in the review, sorry!
> > > >
> > > > --
> > > > Kees Cook
> >
>
> --
> James Morris
> <jmorris@namei.org>
>

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls
  2019-01-29 21:14                                                             ` Micah Morton
@ 2019-01-30  7:15                                                               ` Kees Cook
  2019-02-06 19:03                                                                 ` [PATCH] LSM: SafeSetID: add selftest mortonm
  0 siblings, 1 reply; 88+ messages in thread
From: Kees Cook @ 2019-01-30  7:15 UTC (permalink / raw)
  To: Micah Morton
  Cc: James Morris, Serge E. Hallyn, Casey Schaufler, Stephen Smalley,
	linux-security-module

On Wed, Jan 30, 2019 at 10:14 AM Micah Morton <mortonm@chromium.org> wrote:
>
> testsuite meaning Linux Test Project / Autotest? We have a ChromeOS
> Autotest for this already
> (https://chromium.googlesource.com/chromiumos/third_party/autotest/+/master/client/site_tests/security_ProcessManagementPolicy/security_ProcessManagementPolicy.py)
> but it would at least need some adaptation for configuring/flushing
> policies during the test. Not sure how different Linux Autotests are
> from ChromeOS, if they are used at all.
>
> Also, could you point me at the directory that holds such test scripts
> in the kernel tree? Shouldn't be too difficult to port that ChromeOS
> autotest to a script if we want to go that route.

The common place is tools/testing/selftests/name-here (for example,
see the "seccomp/" directory).

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 88+ messages in thread

* [PATCH] LSM: SafeSetID: add selftest
  2019-01-30  7:15                                                               ` Kees Cook
@ 2019-02-06 19:03                                                                 ` mortonm
  2019-02-06 19:26                                                                   ` Edwin Zimmerman
  2019-02-12 19:01                                                                   ` James Morris
  0 siblings, 2 replies; 88+ messages in thread
From: mortonm @ 2019-02-06 19:03 UTC (permalink / raw)
  To: jmorris, serge, keescook, casey, sds, linux-security-module; +Cc: Micah Morton

From: Micah Morton <mortonm@chromium.org>

This patch adds a selftest for the SafeSetID LSM. The test requires
mounting securityfs if it isn't mounted, creating test users in
/etc/passwd, and configuring policies for the SafeSetID LSM through
writes to securityfs.

Signed-off-by: Micah Morton <mortonm@chromium.org>
---
This test is reasonably robust for demonstrating the functionality of
the LSM, but is no masterpiece by any means. I'm not totally sure how
these tests are used. Are they incorporated into testing frameworks for
the Linux kernel that are run regularly or just PoC binaries that sit in
this directory more or less as documentation? If it's the former, this
code probably needs some more cleanup and better organization. Beyond
coding style, the test doesn't bother to clean up users that were added
in /etc/passwd for testing purposes, nor does it flush policies that were
configured for the LSM relating to those users. Should it?

 tools/testing/selftests/safesetid/.gitignore  |   1 +
 tools/testing/selftests/safesetid/Makefile    |   8 +
 tools/testing/selftests/safesetid/config      |   2 +
 .../selftests/safesetid/safesetid-test.c      | 334 ++++++++++++++++++
 .../selftests/safesetid/safesetid-test.sh     |  26 ++
 5 files changed, 371 insertions(+)
 create mode 100644 tools/testing/selftests/safesetid/.gitignore
 create mode 100644 tools/testing/selftests/safesetid/Makefile
 create mode 100644 tools/testing/selftests/safesetid/config
 create mode 100644 tools/testing/selftests/safesetid/safesetid-test.c
 create mode 100755 tools/testing/selftests/safesetid/safesetid-test.sh

diff --git a/tools/testing/selftests/safesetid/.gitignore b/tools/testing/selftests/safesetid/.gitignore
new file mode 100644
index 000000000000..9c1a629bca01
--- /dev/null
+++ b/tools/testing/selftests/safesetid/.gitignore
@@ -0,0 +1 @@
+safesetid-test
diff --git a/tools/testing/selftests/safesetid/Makefile b/tools/testing/selftests/safesetid/Makefile
new file mode 100644
index 000000000000..98da7a504737
--- /dev/null
+++ b/tools/testing/selftests/safesetid/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for SafeSetID selftests.
+CFLAGS = -Wall -O2
+LDLIBS = -lcap
+TEST_PROGS := safesetid-test.sh
+TEST_GEN_FILES := safesetid-test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/safesetid/config b/tools/testing/selftests/safesetid/config
new file mode 100644
index 000000000000..9d44e5c2e096
--- /dev/null
+++ b/tools/testing/selftests/safesetid/config
@@ -0,0 +1,2 @@
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
diff --git a/tools/testing/selftests/safesetid/safesetid-test.c b/tools/testing/selftests/safesetid/safesetid-test.c
new file mode 100644
index 000000000000..892c8e8b1b8b
--- /dev/null
+++ b/tools/testing/selftests/safesetid/safesetid-test.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <pwd.h>
+#include <string.h>
+#include <syscall.h>
+#include <sys/capability.h>
+#include <sys/types.h>
+#include <sys/mount.h>
+#include <sys/prctl.h>
+#include <sys/wait.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <stdbool.h>
+#include <stdarg.h>
+
+#ifndef CLONE_NEWUSER
+# define CLONE_NEWUSER 0x10000000
+#endif
+
+#define ROOT_USER 0
+#define RESTRICTED_PARENT 1
+#define ALLOWED_CHILD1 2
+#define ALLOWED_CHILD2 3
+#define NO_POLICY_USER 4
+
+char* add_whitelist_policy_file = "/sys/kernel/security/safesetid/add_whitelist_policy";
+
+static void die(char *fmt, ...)
+{
+	va_list ap;
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+	exit(EXIT_FAILURE);
+}
+
+static bool vmaybe_write_file(bool enoent_ok, char *filename, char *fmt, va_list ap)
+{
+	char buf[4096];
+	int fd;
+	ssize_t written;
+	int buf_len;
+
+	buf_len = vsnprintf(buf, sizeof(buf), fmt, ap);
+	if (buf_len < 0) {
+		printf("vsnprintf failed: %s\n",
+		    strerror(errno));
+		return false;
+	}
+	if (buf_len >= sizeof(buf)) {
+		printf("vsnprintf output truncated\n");
+		return false;
+	}
+
+	fd = open(filename, O_WRONLY);
+	if (fd < 0) {
+		if ((errno == ENOENT) && enoent_ok)
+			return true;
+		return false;
+	}
+	written = write(fd, buf, buf_len);
+	if (written != buf_len) {
+		if (written >= 0) {
+			printf("short write to %s\n", filename);
+			return false;
+		} else {
+			printf("write to %s failed: %s\n",
+				filename, strerror(errno));
+			return false;
+		}
+	}
+	if (close(fd) != 0) {
+		printf("close of %s failed: %s\n",
+			filename, strerror(errno));
+		return false;
+	}
+	return true;
+}
+
+static bool write_file(char *filename, char *fmt, ...)
+{
+	va_list ap;
+	bool ret;
+
+	va_start(ap, fmt);
+	ret = vmaybe_write_file(false, filename, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+
+static void ensure_user_exists(uid_t uid)
+{
+	struct passwd p;
+
+	FILE *fd;
+	char name_str[10];
+
+	if (getpwuid(uid) == NULL) {
+		memset(&p,0x00,sizeof(p));
+		fd=fopen("/etc/passwd","a");
+		if (fd == NULL)
+			die("couldn't open file\n");
+		if (fseek(fd, 0, SEEK_END))
+			die("couldn't fseek\n");
+		snprintf(name_str, 10, "%d", uid);
+		p.pw_name=name_str;
+		p.pw_uid=uid;
+		p.pw_gecos="Test account";
+		p.pw_dir="/dev/null";
+		p.pw_shell="/bin/false";
+		int value = putpwent(&p,fd);
+		if (value != 0)
+			die("putpwent failed\n");
+		if (fclose(fd))
+			die("fclose failed\n");
+	}
+}
+
+static void ensure_securityfs_mounted(void)
+{
+	int fd = open(add_whitelist_policy_file, O_WRONLY);
+	if (fd < 0) {
+		if (errno == ENOENT) {
+			// Need to mount securityfs
+			if (mount("securityfs", "/sys/kernel/security",
+						"securityfs", 0, NULL) < 0)
+				die("mounting securityfs failed\n");
+		} else {
+			die("couldn't find securityfs for unknown reason\n");
+		}
+	} else {
+		if (close(fd) != 0) {
+			die("close of %s failed: %s\n",
+				add_whitelist_policy_file, strerror(errno));
+		}
+	}
+}
+
+static void write_policies(void)
+{
+	ssize_t written;
+	int fd;
+
+	fd = open(add_whitelist_policy_file, O_WRONLY);
+	if (fd < 0)
+		die("can't open add_whitelist_policy file\n");
+	written = write(fd, "1:2", strlen("1:2"));
+	if (written != strlen("1:2")) {
+		if (written >= 0) {
+			die("short write to %s\n", add_whitelist_policy_file);
+		} else {
+			die("write to %s failed: %s\n",
+				add_whitelist_policy_file, strerror(errno));
+		}
+	}
+	written = write(fd, "1:3", strlen("1:3"));
+	if (written != strlen("1:3")) {
+		if (written >= 0) {
+			die("short write to %s\n", add_whitelist_policy_file);
+		} else {
+			die("write to %s failed: %s\n",
+				add_whitelist_policy_file, strerror(errno));
+		}
+	}
+	if (close(fd) != 0) {
+		die("close of %s failed: %s\n",
+			add_whitelist_policy_file, strerror(errno));
+	}
+}
+
+static bool test_userns(bool expect_success)
+{
+	uid_t uid;
+	char map_file_name[32];
+	size_t sz = sizeof(map_file_name);
+	pid_t cpid;
+	bool success;
+
+	uid = getuid();
+
+	int clone_flags = CLONE_NEWUSER;
+	cpid = syscall(SYS_clone, clone_flags, NULL);
+	if (cpid == -1) {
+		printf("clone failed\n");
+		return false;
+	}
+
+	if (cpid == 0) {	/* Code executed by child */
+		// Give parent 1 second to write map file
+		sleep(1);
+		exit(EXIT_SUCCESS);
+	} else {		/* Code executed by parent */
+		if(snprintf(map_file_name, sz, "/proc/%d/uid_map", cpid) < 0) {
+			printf("preparing file name string failed");
+			return false;
+		}
+		success = write_file(map_file_name, "0 0 1", uid);
+		return success == expect_success;
+	}
+
+	printf("should not reach here");
+	return false;
+}
+
+static void test_setuid(uid_t child_uid, bool expect_success)
+{
+	pid_t cpid, w;
+	int wstatus;
+
+	cpid = fork();
+	if (cpid == -1) {
+		die("fork\n");
+	}
+
+	if (cpid == 0) {	    /* Code executed by child */
+		setuid(child_uid);
+		if (getuid() == child_uid)
+			exit(EXIT_SUCCESS);
+		else
+			exit(EXIT_FAILURE);
+	} else {		 /* Code executed by parent */
+		do {
+			w = waitpid(cpid, &wstatus, WUNTRACED | WCONTINUED);
+			if (w == -1) {
+				die("waitpid\n");
+			}
+
+			if (WIFEXITED(wstatus)) {
+				if (WEXITSTATUS(wstatus) == EXIT_SUCCESS) {
+					if (expect_success) {
+						return;
+					} else {
+						die("unexpected success\n");
+					}
+				} else {
+					if (expect_success) {
+						die("unexpected failure\n");
+					} else {
+						return;
+					}
+				}
+			} else if (WIFSIGNALED(wstatus)) {
+				if (WTERMSIG(wstatus) == 9) {
+					if (expect_success)
+						die("killed unexpectedly\n");
+					else
+						return;
+				} else {
+					die("unexpected signal: %d\n", WTERMSIG(wstatus));
+				}
+			} else {
+				die("unexpected status: %d\n", wstatus);
+			}
+		} while (!WIFEXITED(wstatus) && !WIFSIGNALED(wstatus));
+	}
+
+	die("should not reach here\n");
+}
+
+static void ensure_users_exist(void)
+{
+	ensure_user_exists(ROOT_USER);
+	ensure_user_exists(RESTRICTED_PARENT);
+	ensure_user_exists(ALLOWED_CHILD1);
+	ensure_user_exists(ALLOWED_CHILD2);
+	ensure_user_exists(NO_POLICY_USER);
+}
+
+static void drop_caps(bool setid_retained)
+{
+	cap_value_t cap_values[] = {CAP_SETUID, CAP_SETGID};
+	cap_t caps;
+
+	caps = cap_get_proc();
+	if (setid_retained)
+		cap_set_flag(caps, CAP_EFFECTIVE, 2, cap_values, CAP_SET);
+	else
+		cap_clear(caps);
+	cap_set_proc(caps);
+	cap_free(caps);
+}
+
+int main(int argc, char **argv)
+{
+	ensure_users_exist();
+	ensure_securityfs_mounted();
+	write_policies();
+
+	if (prctl(PR_SET_KEEPCAPS, 1L))
+		die("Error with set keepcaps\n");
+
+	// First test to make sure we can write userns mappings from a user
+	// that doesn't have any restrictions (as long as it has CAP_SETUID);
+	setuid(NO_POLICY_USER);
+	setgid(NO_POLICY_USER);
+
+	// Take away all but setid caps
+	drop_caps(true);
+
+	// Need PR_SET_DUMPABLE flag set so we can write /proc/[pid]/uid_map
+	// from non-root parent process.
+	if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0))
+		die("Error with set dumpable\n");
+
+	if (!test_userns(true)) {
+		die("test_userns failed when it should work\n");
+	}
+
+	setuid(RESTRICTED_PARENT);
+	setgid(RESTRICTED_PARENT);
+
+	test_setuid(ROOT_USER, false);
+	test_setuid(ALLOWED_CHILD1, true);
+	test_setuid(ALLOWED_CHILD2, true);
+	test_setuid(NO_POLICY_USER, false);
+
+	if (!test_userns(false)) {
+		die("test_userns worked when it should fail\n");
+	}
+
+	// Now take away all caps
+	drop_caps(false);
+	test_setuid(2, false);
+	test_setuid(3, false);
+	test_setuid(4, false);
+
+	// NOTE: this test doesn't clean up users that were created in
+	// /etc/passwd or flush policies that were added to the LSM.
+	return EXIT_SUCCESS;
+}
diff --git a/tools/testing/selftests/safesetid/safesetid-test.sh b/tools/testing/selftests/safesetid/safesetid-test.sh
new file mode 100755
index 000000000000..e4fdce675c54
--- /dev/null
+++ b/tools/testing/selftests/safesetid/safesetid-test.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+TCID="safesetid-test.sh"
+errcode=0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+check_root()
+{
+	uid=$(id -u)
+	if [ $uid -ne 0 ]; then
+		echo $TCID: must be run as root >&2
+		exit $ksft_skip
+	fi
+}
+
+main_function()
+{
+  check_root
+  ./safesetid-test
+}
+
+main_function || errcode=$?
+echo "$TCID: done"
+exit $errcode
-- 
2.20.1.611.gfbb209baf1-goog
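
The policy writes in write_policies() above can be sketched as a small
helper, assuming only what the test itself shows: each entry is the text
"<parent uid>:<child uid>" (the test writes "1:2" and "1:3") written to
the add_whitelist_policy file. format_policy() and write_policy() are
hypothetical names, and opening the file once per entry is a choice made
for this sketch rather than something the LSM is known to require:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Format one whitelist entry as "<parent>:<child>". Returns the string
 * length, or -1 if the result does not fit in buf.
 */
int format_policy(char *buf, size_t len, uid_t parent, uid_t child)
{
	int n = snprintf(buf, len, "%u:%u", (unsigned int)parent,
			 (unsigned int)child);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Write one entry to the policy file at `path`. Returns 0 on success and
 * -1 if formatting, open, write (including a short write) or close fails.
 */
int write_policy(const char *path, uid_t parent, uid_t child)
{
	char buf[32];
	ssize_t written;
	int len, fd;

	len = format_policy(buf, sizeof(buf), parent, child);
	if (len < 0)
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, buf, len);
	if (close(fd) != 0 || written != len)
		return -1;
	return 0;
}
```

With this, the two writes in the test would become write_policy(path, 1, 2)
and write_policy(path, 1, 3), with one error path instead of two copies.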


^ permalink raw reply related	[flat|nested] 88+ messages in thread

* RE: [PATCH] LSM: SafeSetID: add selftest
  2019-02-06 19:03                                                                 ` [PATCH] LSM: SafeSetID: add selftest mortonm
@ 2019-02-06 19:26                                                                   ` Edwin Zimmerman
  2019-02-07 21:54                                                                     ` Micah Morton
  2019-02-12 19:01                                                                   ` James Morris
  1 sibling, 1 reply; 88+ messages in thread
From: Edwin Zimmerman @ 2019-02-06 19:26 UTC (permalink / raw)
  To: mortonm, jmorris, serge, keescook, casey, sds, linux-security-module

> On Wednesday, February 06, 2019 2:03 PM Micah Morton wrote:
> > This patch adds a selftest for the SafeSetID LSM. The test requires
> > mounting securityfs if it isn't mounted, creating test users in
> > /etc/passwd, and configuring policies for the SafeSetID LSM through
> > writes to securityfs.
> >
> > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > ---
> > This test is reasonably robust for demonstrating the functionality of
> > the LSM, but is no masterpiece by any means. I'm not totally sure how
> > these tests are used. Are they incorporated into testing frameworks for
> > the Linux kernel that are run regularly or just PoC binaries that sit in
> > this directory more or less as documentation? If its the former, this
> > code probably needs some more cleanup and better organization. Beyond
> > coding style, the test doesn't bother to clean up users that were added
> > in /etc/passwd for testing purposes nor flushes policies that were
> > configured for the LSM relating to those users. Should it?
> 
> No good reason to leave the users, so I would suggest cleaning them up.
> All it would take would be several deluser commands
> in safesetid-test.sh.  Very simple.


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH] LSM: SafeSetID: add selftest
  2019-02-06 19:26                                                                   ` Edwin Zimmerman
@ 2019-02-07 21:54                                                                     ` Micah Morton
  0 siblings, 0 replies; 88+ messages in thread
From: Micah Morton @ 2019-02-07 21:54 UTC (permalink / raw)
  To: Edwin Zimmerman
  Cc: James Morris, Serge E. Hallyn, Kees Cook, Casey Schaufler,
	Stephen Smalley, linux-security-module

Yeah, that would be simple, although maybe someone is counting on
those users to exist later. We could create special users on the
system for the purpose of this test that didn't exist before the test
(and delete them afterward), but then there are other setup/cleanup
questions, like:

- Do we unmount securityfs after the test? What if something was
counting on it being mounted or not mounted?
- Do we flush the SafeSetID LSM policies after the test? Note that the
LSM doesn't currently have the functionality to flush individual
policies, so what happens if something was counting on certain
policies (for other users) being configured for the LSM and we flush
those after running our test?

These questions were the reason I was hoping to get more info on the
kind of environment in which these selftests run. If the norm is to
boot up a VM, run one of these tests, then reboot/shutdown, most of
these questions don't need to be answered (we would still probably
want to fix user creation/deletion since that is persistent across
reboots).
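
For illustration, here is a minimal sketch of the "undo only what the
test itself changed" approach to the questions above. It is not taken
from the posted safesetid-test.sh: the user names, the DRY_RUN switch,
and the helper functions are all hypothetical. It records whether the
test mounted securityfs and which users it created, and reverts only
those. It deliberately leaves LSM policies alone, since there is
currently no way to flush individual policies.

```shell
#!/bin/sh
# Sketch only: every name below is an illustrative assumption, not part
# of the posted selftest.

SECURITYFS=/sys/kernel/security
TEST_USERS="safesetid_test1 safesetid_test2"	# hypothetical dedicated test accounts
CREATED_USERS=""	# users this run actually created
MOUNTED_BY_US=0		# set to 1 only if this run mounted securityfs

# Run a command, or just print it when DRY_RUN=1 (no root needed).
run() {
	if [ "${DRY_RUN:-0}" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

setup() {
	# Mount securityfs only if it is not already mounted.
	if ! grep -qs " $SECURITYFS securityfs " /proc/mounts; then
		run mount -t securityfs securityfs "$SECURITYFS" && MOUNTED_BY_US=1
	fi

	# Create only the users that do not exist yet, and remember them.
	for u in $TEST_USERS; do
		if ! id "$u" >/dev/null 2>&1; then
			run useradd --no-create-home "$u" &&
				CREATED_USERS="$CREATED_USERS $u"
		fi
	done
}

cleanup() {
	# Remove only the users created by setup().
	for u in $CREATED_USERS; do
		run userdel "$u"
	done

	# Unmount securityfs only if setup() mounted it.
	if [ "$MOUNTED_BY_US" = 1 ]; then
		run umount "$SECURITYFS"
	fi
}
```

A test script along these lines would call setup, install
"trap cleanup EXIT", and then run the test binary, so pre-existing
users and a pre-existing securityfs mount are left untouched.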

On Wed, Feb 6, 2019 at 11:26 AM Edwin Zimmerman <edwin@211mainstreet.net> wrote:
>
> > On Wednesday, February 06, 2019 2:03 PM Micah Morton wrote:
> > > This patch adds a selftest for the SafeSetID LSM. The test requires
> > > mounting securityfs if it isn't mounted, creating test users in
> > > /etc/passwd, and configuring policies for the SafeSetID LSM through
> > > writes to securityfs.
> > >
> > > Signed-off-by: Micah Morton <mortonm@chromium.org>
> > > ---
> > > This test is reasonably robust for demonstrating the functionality of
> > > the LSM, but is no masterpiece by any means. I'm not totally sure how
> > > these tests are used. Are they incorporated into testing frameworks for
> > > the Linux kernel that are run regularly or just PoC binaries that sit in
> > > this directory more or less as documentation? If it's the former, this
> > > code probably needs some more cleanup and better organization. Beyond
> > > coding style, the test doesn't bother to clean up users that were added
> > > in /etc/passwd for testing purposes, nor does it flush policies that were
> > > configured for the LSM relating to those users. Should it?
> >
> > No good reason to leave the users, so I would suggest cleaning them up.
> > All it would take would be several deluser commands
> > in safesetid-test.sh.  Very simple.
>


* Re: [PATCH] LSM: SafeSetID: add selftest
  2019-02-06 19:03                                                                 ` [PATCH] LSM: SafeSetID: add selftest mortonm
  2019-02-06 19:26                                                                   ` Edwin Zimmerman
@ 2019-02-12 19:01                                                                   ` James Morris
  1 sibling, 0 replies; 88+ messages in thread
From: James Morris @ 2019-02-12 19:01 UTC (permalink / raw)
  To: Micah Morton; +Cc: serge, keescook, casey, sds, linux-security-module

On Wed, 6 Feb 2019, mortonm@chromium.org wrote:

> From: Micah Morton <mortonm@chromium.org>
> 
> This patch adds a selftest for the SafeSetID LSM. The test requires
> mounting securityfs if it isn't mounted, creating test users in
> /etc/passwd, and configuring policies for the SafeSetID LSM through
> writes to securityfs.
> 
> Signed-off-by: Micah Morton <mortonm@chromium.org>

Great!

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-general


-- 
James Morris
<jmorris@namei.org>



end of thread

Thread overview: 88+ messages
2018-10-31 15:28 [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
2018-10-31 21:02 ` Serge E. Hallyn
2018-10-31 21:57   ` Kees Cook
2018-10-31 22:37     ` Casey Schaufler
2018-11-01  1:12       ` Micah Morton
2018-11-01  6:13         ` Serge E. Hallyn
2018-11-01 15:39           ` Casey Schaufler
2018-11-01 15:56             ` Serge E. Hallyn
2018-11-01 16:18             ` Micah Morton
2018-11-01  6:07   ` Serge E. Hallyn
2018-11-01 16:11     ` Micah Morton
2018-11-01 16:22       ` Micah Morton
2018-11-01 16:41       ` Micah Morton
2018-11-01 17:08       ` Casey Schaufler
2018-11-01 19:52         ` Micah Morton
2018-11-02 16:05           ` Casey Schaufler
2018-11-02 17:12             ` Micah Morton
2018-11-02 18:19               ` Casey Schaufler
2018-11-02 18:30                 ` Serge E. Hallyn
2018-11-02 19:02                   ` Casey Schaufler
2018-11-02 19:22                     ` Serge E. Hallyn
2018-11-08 20:53                       ` Micah Morton
2018-11-08 21:34                         ` Casey Schaufler
2018-11-09  0:30                           ` Micah Morton
2018-11-09 23:21                             ` [PATCH] LSM: generalize flag passing to security_capable mortonm
2018-11-21 16:54                             ` [PATCH] LSM: add SafeSetID module that gates setid calls mortonm
2018-12-06  0:08                               ` Kees Cook
2018-12-06 17:51                                 ` Micah Morton
2019-01-11 17:13                                 ` [PATCH v2] " mortonm
2019-01-15  0:38                                   ` Kees Cook
2019-01-15 18:04                                     ` [PATCH v3 1/2] LSM: mark all set*uid call sites in kernel/sys.c mortonm
2019-01-15 19:34                                       ` Kees Cook
2019-01-15 18:04                                     ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls mortonm
2019-01-15 19:44                                       ` Kees Cook
2019-01-15 21:50                                         ` [PATCH v4 " mortonm
2019-01-15 22:32                                           ` Kees Cook
2019-01-16 15:46                                             ` [PATCH v5 " mortonm
2019-01-16 16:10                                               ` Casey Schaufler
2019-01-22 20:40                                                 ` Micah Morton
2019-01-22 22:28                                                   ` James Morris
2019-01-22 22:40                                                     ` Micah Morton
2019-01-22 22:42                                                       ` [PATCH v3 1/2] " mortonm
2019-01-25 15:51                                                         ` Micah Morton
2019-01-25 20:15                                               ` [PATCH v5 2/2] " James Morris
2019-01-25 21:06                                                 ` Micah Morton
2019-01-28 19:47                                                   ` Micah Morton
2019-01-28 19:56                                                     ` Kees Cook
2019-01-28 20:09                                                       ` James Morris
2019-01-28 20:19                                                       ` Micah Morton
2019-01-28 20:30                                                         ` [PATCH] LSM: Add 'name' field for SafeSetID in DEFINE_LSM mortonm
2019-01-28 22:12                                                           ` James Morris
2019-01-28 22:33                                                         ` [PATCH v5 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
2019-01-29 17:25                                                           ` James Morris
2019-01-29 21:14                                                             ` Micah Morton
2019-01-30  7:15                                                               ` Kees Cook
2019-02-06 19:03                                                                 ` [PATCH] LSM: SafeSetID: add selftest mortonm
2019-02-06 19:26                                                                   ` Edwin Zimmerman
2019-02-07 21:54                                                                     ` Micah Morton
2019-02-12 19:01                                                                   ` James Morris
2019-01-15 21:58                                         ` [PATCH v3 2/2] LSM: add SafeSetID module that gates setid calls Micah Morton
2019-01-15 19:49                                     ` [PATCH v2] " Micah Morton
2019-01-15 19:53                                       ` Kees Cook
2019-01-15  4:07                                   ` James Morris
2019-01-15 19:42                                     ` Micah Morton
2018-11-02 19:28                 ` [PATCH] " Micah Morton
2018-11-06 19:09                 ` [PATCH v2] " mortonm
2018-11-06 20:59       ` [PATCH] " James Morris
2018-11-06 21:21         ` [PATCH v3] " mortonm
2018-11-02 18:07 ` [PATCH] " Stephen Smalley
2018-11-02 19:13   ` Micah Morton
2018-11-19 18:54   ` [PATCH] [PATCH] LSM: generalize flag passing to security_capable mortonm
2018-12-13 22:29     ` Micah Morton
2018-12-13 23:09       ` Casey Schaufler
2018-12-14  0:05         ` Micah Morton
2018-12-18 22:37         ` [PATCH v2] " mortonm
2019-01-07 17:55           ` Micah Morton
2019-01-07 18:16             ` Casey Schaufler
2019-01-07 18:36               ` Micah Morton
2019-01-07 18:46                 ` Casey Schaufler
2019-01-07 19:02                   ` Micah Morton
2019-01-07 22:57                     ` [PATCH v3] " mortonm
2019-01-07 23:13           ` [PATCH v2] " Kees Cook
2019-01-08  0:10             ` [PATCH v4] " mortonm
2019-01-08  0:20               ` Kees Cook
2019-01-09 18:39                 ` Micah Morton
2019-01-10 22:31               ` James Morris
2019-01-10 23:03                 ` Micah Morton
2019-01-08  0:10             ` [PATCH v2] " Micah Morton
