* [RFC 00/10] Introduce a SELinux namespace
@ 2017-10-02 15:58 Stephen Smalley
  2017-10-02 15:58 ` [RFC 01/10] selinux: introduce a selinux namespace Stephen Smalley
                   ` (9 more replies)
  0 siblings, 10 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

I normally wouldn't post these patches at this stage of development,
but several people have requested them, so here they are.  Note that
they are very incomplete and unsafe and should not be used on any
production systems.  The first four patches should actually be safe,
since they merely lay the groundwork of enabling selinux state to be
namespaced, but the rest are not; specific known issues with each of them
are noted in the patch descriptions.  It isn't until the next-to-last patch
that the facility is even exposed to userspace, and that patch description
explains sample usage (as well as summarizing known issues).  I am
intentionally only sending this to the selinux list at the moment
because I don't think it is ready for wider consumption and expect much
of it to change or be completely replaced.  I had some other patches in
the works as well, but they were lost in a recent hardware failure so it
will take some time to recover those.

Motivating use cases for a SELinux namespace include:
1. Enabling one to apply SELinux confinement within a container on
a host that is itself using SELinux to enforce container isolation
and confinement to host resources (svirt).  For example, one might wish
to isolate multiple services running within a container, or to
enforce a W^X policy for a service running within a container.
Today one is forced to treat the entire container as a single
context and from within the container it appears that SELinux
is disabled.

2. Supporting the ChromeOS use case of running an Android SELinux
container when the host itself is not using SELinux.  My impression
is that the ChromeOS developers first tried hacking support for
a per-pid-namespace SELinux enforcing mode into the kernel, and then
later resorted to essentially running the ChromeOS processes in
an unconfined or permissive domain while running the Android
processes in their usual contexts; I don't know how this could have
passed Android CTS, however, since the full policy would have been
exposed to the Android instance via the single selinuxfs instance.

3. Running multiple Android instances on a single host, each with
their own SELinux policy and enforcing mode, as in the Cells/Cellrox
virtual smartphone platform.

4. Running Fedora or other SELinux-enabled systems with SELinux
confinement enabled in containers on non-SELinux hosts.

It should be noted that in their current form, these patches do not
yet support any of these use cases.

You can also find these patches in the following tree:
https://github.com/stephensmalley/selinux-kernel/tree/selinuxns

Use at your own risk.  Enjoy!

Stephen Smalley (10):
  selinux: introduce a selinux namespace
  selinux: support multiple selinuxfs instances
  selinux: move the AVC into the selinux namespace
  netns,selinux: create the selinux netlink socket per network namespace
  selinux: support per-task/cred selinux namespace
  selinux: introduce cred_selinux_ns() and use it
  selinux: support per-namespace inode security structures
  selinux: support per-namespace superblock security structures
  selinux: add a selinuxfs interface to unshare selinux namespace
  selinuxfs: restrict write operations to the same selinux namespace

 include/net/net_namespace.h            |    3 +
 security/selinux/avc.c                 |  290 ++++----
 security/selinux/hooks.c               |  884 ++++++++++++++++++-------
 security/selinux/ibpkey.c              |    3 +-
 security/selinux/include/avc.h         |   38 +-
 security/selinux/include/avc_ss.h      |    9 +-
 security/selinux/include/classmap.h    |    3 +-
 security/selinux/include/conditional.h |   11 +-
 security/selinux/include/objsec.h      |   18 +-
 security/selinux/include/security.h    |  231 +++++--
 security/selinux/netif.c               |    2 +-
 security/selinux/netlabel.c            |   14 +-
 security/selinux/netlink.c             |   31 +-
 security/selinux/netnode.c             |    4 +-
 security/selinux/netport.c             |    2 +-
 security/selinux/selinuxfs.c           |  627 ++++++++++++------
 security/selinux/ss/avtab.c            |    9 +-
 security/selinux/ss/avtab.h            |    3 -
 security/selinux/ss/ebitmap.c          |    7 +-
 security/selinux/ss/ebitmap.h          |    3 -
 security/selinux/ss/hashtab.c          |    8 +-
 security/selinux/ss/hashtab.h          |    4 -
 security/selinux/ss/mls.c              |   72 +-
 security/selinux/ss/mls.h              |   38 +-
 security/selinux/ss/services.c         | 1126 ++++++++++++++++++--------------
 security/selinux/ss/services.h         |   23 +-
 security/selinux/ss/status.c           |   47 +-
 security/selinux/xfrm.c                |   23 +-
 28 files changed, 2289 insertions(+), 1244 deletions(-)

-- 
2.9.5


* [RFC 01/10] selinux: introduce a selinux namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2018-02-06 22:18   ` Paul Moore
  2017-10-02 15:58 ` [RFC 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Define a selinux namespace structure (struct selinux_ns)
for SELinux state and pass it explicitly to all security server
functions.  The public portion of the structure contains state
that is used throughout the SELinux code, such as the enforcing mode.
The structure also contains a pointer to a selinux_ss structure whose
definition is private to the security server and contains security
server-specific state such as the policy database and SID table.
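
As a rough illustration of the split described above, here is a minimal
userspace sketch, not kernel code.  The struct names mirror the patch, but
the fields of selinux_ss and the helpers ns_create(), ns_destroy() and
security_load_policy_sketch() are hypothetical stand-ins.  In a real layout
the selinux_ss definition would live in its own .c file so that callers
only ever see the forward declaration.

```c
/* Userspace sketch only; not the kernel implementation. */
#include <stdbool.h>
#include <stdlib.h>

/* Private part: in a real layout this definition is hidden from callers. */
struct selinux_ss {
	int policy_loaded;	/* stand-in for the policy database and SID table */
};

/* Public part: state used throughout the SELinux code. */
struct selinux_ns {
	bool enforcing;
	bool initialized;
	struct selinux_ss *ss;	/* opaque to everything but the security server */
};

/* Hypothetical helper: allocate a namespace together with its private state. */
static struct selinux_ns *ns_create(void)
{
	struct selinux_ns *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->ss = calloc(1, sizeof(*ns->ss));
	if (!ns->ss) {
		free(ns);
		return NULL;
	}
	return ns;
}

/* Hypothetical helper: release a namespace and its private state. */
static void ns_destroy(struct selinux_ns *ns)
{
	if (!ns)
		return;
	free(ns->ss);
	free(ns);
}

/*
 * Security server entry point: the namespace is an explicit argument,
 * so no global state is consulted or modified.
 */
static int security_load_policy_sketch(struct selinux_ns *ns)
{
	ns->ss->policy_loaded = 1;
	ns->initialized = true;
	return 0;
}
```

Because every entry point takes the namespace as its first argument,
loading a policy into one instance leaves any other instance untouched,
which is the property the later patches in the series depend on.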

This change allocates a single selinux namespace, the init_selinux_ns.
It defines and passes a symbol for the current selinux namespace
(current_selinux_ns) as a placeholder for future changes where
multiple selinux namespaces will be supported, but in this change
the current selinux namespace is always the init selinux namespace.
Note that passing the current selinux namespace is not correct for
all hooks; some hooks will need to be adjusted to pass the selinux
namespace associated with an open file, a network namespace or socket,
etc., since not all hooks are invoked in process context and some
hooks operate in the context of a cred that may differ from current's
cred.  Fixing all of these cases is left to future changes, once
we introduce the support for multiple selinux namespaces.

This change by itself should have no effect on SELinux behavior or
APIs (userspace or LSM).  It merely wraps SELinux state and passes it
explicitly as needed.
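
The diff also adds get/put helpers and a release path that walks up the
parent chain.  The following userspace sketch shows that reference-counting
pattern in simplified form.  It assumes a single thread and a plain integer
counter, whereas the patch uses refcount_t and defers the free to a work
item; struct ns, ns_new(), ns_get(), ns_put() and the ns_freed counter are
hypothetical names used only for illustration.

```c
/* Userspace sketch only; single-threaded, no atomics, no deferred work. */
#include <stdlib.h>

struct ns {
	unsigned int count;	/* references held on this namespace */
	struct ns *parent;	/* pinned for as long as this namespace exists */
};

static int ns_freed;		/* number of frees so far, for illustration */

static struct ns *ns_get(struct ns *ns)
{
	ns->count++;
	return ns;
}

static struct ns *ns_new(struct ns *parent)
{
	struct ns *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->count = 1;
	if (parent)
		ns->parent = ns_get(parent);	/* child holds a reference */
	return ns;
}

/* Free ns, then drop the reference it held on each ancestor in turn. */
static void ns_free_chain(struct ns *ns)
{
	struct ns *parent;

	do {
		parent = ns->parent;
		free(ns);
		ns_freed++;
		ns = parent;
	} while (ns && --ns->count == 0);
}

static void ns_put(struct ns *ns)
{
	if (ns && --ns->count == 0)
		ns_free_chain(ns);
}
```

The loop avoids recursion: dropping the last reference to a child may
release its parent, and so on up the chain, without growing the stack.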

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/avc.c                 |   13 +-
 security/selinux/hooks.c               |  226 +++++--
 security/selinux/ibpkey.c              |    3 +-
 security/selinux/include/avc.h         |    6 -
 security/selinux/include/avc_ss.h      |    6 -
 security/selinux/include/conditional.h |   11 +-
 security/selinux/include/objsec.h      |    2 -
 security/selinux/include/security.h    |  213 +++++--
 security/selinux/netif.c               |    2 +-
 security/selinux/netlabel.c            |   11 +-
 security/selinux/netnode.c             |    4 +-
 security/selinux/netport.c             |    2 +-
 security/selinux/selinuxfs.c           |  122 ++--
 security/selinux/ss/avtab.c            |    9 +-
 security/selinux/ss/avtab.h            |    3 -
 security/selinux/ss/ebitmap.c          |    7 +-
 security/selinux/ss/ebitmap.h          |    3 -
 security/selinux/ss/hashtab.c          |    8 +-
 security/selinux/ss/hashtab.h          |    4 -
 security/selinux/ss/mls.c              |   72 ++-
 security/selinux/ss/mls.h              |   38 +-
 security/selinux/ss/services.c         | 1087 ++++++++++++++++++--------------
 security/selinux/ss/services.h         |   23 +-
 security/selinux/ss/status.c           |   45 +-
 security/selinux/xfrm.c                |    6 +-
 25 files changed, 1163 insertions(+), 763 deletions(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 2380b8d..a5a4d05a 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -149,7 +149,8 @@ static void avc_dump_query(struct audit_buffer *ab, u32 ssid, u32 tsid, u16 tcla
 	char *scontext;
 	u32 scontext_len;
 
-	rc = security_sid_to_context(ssid, &scontext, &scontext_len);
+	rc = security_sid_to_context(current_selinux_ns, ssid,
+				     &scontext, &scontext_len);
 	if (rc)
 		audit_log_format(ab, "ssid=%d", ssid);
 	else {
@@ -157,7 +158,8 @@ static void avc_dump_query(struct audit_buffer *ab, u32 ssid, u32 tsid, u16 tcla
 		kfree(scontext);
 	}
 
-	rc = security_sid_to_context(tsid, &scontext, &scontext_len);
+	rc = security_sid_to_context(current_selinux_ns, tsid,
+				     &scontext, &scontext_len);
 	if (rc)
 		audit_log_format(ab, " tsid=%d", tsid);
 	else {
@@ -969,7 +971,8 @@ static noinline struct avc_node *avc_compute_av(u32 ssid, u32 tsid,
 {
 	rcu_read_unlock();
 	INIT_LIST_HEAD(&xp_node->xpd_head);
-	security_compute_av(ssid, tsid, tclass, avd, &xp_node->xp);
+	security_compute_av(current_selinux_ns, ssid, tsid, tclass,
+			    avd, &xp_node->xp);
 	rcu_read_lock();
 	return avc_insert(ssid, tsid, tclass, avd, xp_node);
 }
@@ -1043,8 +1046,8 @@ int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
 			goto decision;
 		}
 		rcu_read_unlock();
-		security_compute_xperms_decision(ssid, tsid, tclass, driver,
-						&local_xpd);
+		security_compute_xperms_decision(current_selinux_ns, ssid, tsid,
+						 tclass, driver, &local_xpd);
 		rcu_read_lock();
 		avc_update_node(AVC_CALLBACK_ADD_XPERMS, requested, driver, xperm,
 				ssid, tsid, tclass, avd.seqno, &local_xpd, 0);
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index f5d3047..9eb48a1 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -97,20 +97,24 @@
 #include "audit.h"
 #include "avc_ss.h"
 
+struct selinux_ns *init_selinux_ns;
+
 /* SECMARK reference count */
 static atomic_t selinux_secmark_refcount = ATOMIC_INIT(0);
 
 #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
-int selinux_enforcing;
+static int selinux_enforcing_boot;
 
 static int __init enforcing_setup(char *str)
 {
 	unsigned long enforcing;
 	if (!kstrtoul(str, 0, &enforcing))
-		selinux_enforcing = enforcing ? 1 : 0;
+		selinux_enforcing_boot = enforcing ? 1 : 0;
 	return 1;
 }
 __setup("enforcing=", enforcing_setup);
+#else
+#define selinux_enforcing_boot 1
 #endif
 
 #ifdef CONFIG_SECURITY_SELINUX_BOOTPARAM
@@ -128,6 +132,19 @@ __setup("selinux=", selinux_enabled_setup);
 int selinux_enabled = 1;
 #endif
 
+static unsigned int selinux_checkreqprot_boot =
+	CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE;
+
+static int __init checkreqprot_setup(char *str)
+{
+	unsigned long checkreqprot;
+
+	if (!kstrtoul(str, 0, &checkreqprot))
+		selinux_checkreqprot_boot = checkreqprot ? 1 : 0;
+	return 1;
+}
+__setup("checkreqprot=", checkreqprot_setup);
+
 static struct kmem_cache *sel_inode_cache;
 static struct kmem_cache *file_security_cache;
 
@@ -616,21 +633,25 @@ static int selinux_get_mnt_opts(const struct super_block *sb,
 
 	i = 0;
 	if (sbsec->flags & FSCONTEXT_MNT) {
-		rc = security_sid_to_context(sbsec->sid, &context, &len);
+		rc = security_sid_to_context(current_selinux_ns, sbsec->sid,
+					     &context, &len);
 		if (rc)
 			goto out_free;
 		opts->mnt_opts[i] = context;
 		opts->mnt_opts_flags[i++] = FSCONTEXT_MNT;
 	}
 	if (sbsec->flags & CONTEXT_MNT) {
-		rc = security_sid_to_context(sbsec->mntpoint_sid, &context, &len);
+		rc = security_sid_to_context(current_selinux_ns,
+					     sbsec->mntpoint_sid,
+					     &context, &len);
 		if (rc)
 			goto out_free;
 		opts->mnt_opts[i] = context;
 		opts->mnt_opts_flags[i++] = CONTEXT_MNT;
 	}
 	if (sbsec->flags & DEFCONTEXT_MNT) {
-		rc = security_sid_to_context(sbsec->def_sid, &context, &len);
+		rc = security_sid_to_context(current_selinux_ns, sbsec->def_sid,
+					     &context, &len);
 		if (rc)
 			goto out_free;
 		opts->mnt_opts[i] = context;
@@ -640,7 +661,8 @@ static int selinux_get_mnt_opts(const struct super_block *sb,
 		struct dentry *root = sbsec->sb->s_root;
 		struct inode_security_struct *isec = backing_inode_security(root);
 
-		rc = security_sid_to_context(isec->sid, &context, &len);
+		rc = security_sid_to_context(current_selinux_ns, isec->sid,
+					     &context, &len);
 		if (rc)
 			goto out_free;
 		opts->mnt_opts[i] = context;
@@ -749,7 +771,9 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 
 		if (flags[i] == SBLABEL_MNT)
 			continue;
-		rc = security_context_str_to_sid(mount_options[i], &sid, GFP_KERNEL);
+		rc = security_context_str_to_sid(current_selinux_ns,
+						 mount_options[i], &sid,
+						 GFP_KERNEL);
 		if (rc) {
 			printk(KERN_WARNING "SELinux: security_context_str_to_sid"
 			       "(%s) failed for (dev %s, type %s) errno=%d\n",
@@ -825,7 +849,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 		 * Determine the labeling behavior to use for this
 		 * filesystem type.
 		 */
-		rc = security_fs_use(sb);
+		rc = security_fs_use(current_selinux_ns, sb);
 		if (rc) {
 			printk(KERN_WARNING
 				"%s: security_fs_use(%s) returned %d\n",
@@ -850,7 +874,9 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 		}
 		if (sbsec->behavior == SECURITY_FS_USE_XATTR) {
 			sbsec->behavior = SECURITY_FS_USE_MNTPOINT;
-			rc = security_transition_sid(current_sid(), current_sid(),
+			rc = security_transition_sid(current_selinux_ns,
+						     current_sid(),
+						     current_sid(),
 						     SECCLASS_FILE, NULL,
 						     &sbsec->mntpoint_sid);
 			if (rc)
@@ -1013,7 +1039,7 @@ static int selinux_sb_clone_mnt_opts(const struct super_block *oldsb,
 
 	if (newsbsec->behavior == SECURITY_FS_USE_NATIVE &&
 		!(kern_flags & SECURITY_LSM_NATIVE_LABELS) && !set_context) {
-		rc = security_fs_use(newsb);
+		rc = security_fs_use(current_selinux_ns, newsb);
 		if (rc)
 			goto out;
 	}
@@ -1470,7 +1496,8 @@ static int selinux_genfs_get_sid(struct dentry *dentry,
 				path++;
 			}
 		}
-		rc = security_genfs_sid(sb->s_type->name, path, tclass, sid);
+		rc = security_genfs_sid(current_selinux_ns, sb->s_type->name,
+					path, tclass, sid);
 	}
 	free_page((unsigned long)buffer);
 	return rc;
@@ -1588,7 +1615,8 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
 			sid = sbsec->def_sid;
 			rc = 0;
 		} else {
-			rc = security_context_to_sid_default(context, rc, &sid,
+			rc = security_context_to_sid_default(current_selinux_ns,
+							     context, rc, &sid,
 							     sbsec->def_sid,
 							     GFP_NOFS);
 			if (rc) {
@@ -1621,7 +1649,8 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
 		sid = sbsec->sid;
 
 		/* Try to obtain a transition SID. */
-		rc = security_transition_sid(task_sid, sid, sclass, NULL, &sid);
+		rc = security_transition_sid(current_selinux_ns, task_sid, sid,
+					     sclass, NULL, &sid);
 		if (rc)
 			goto out;
 		break;
@@ -1872,7 +1901,8 @@ selinux_determine_inode_label(const struct task_security_struct *tsec,
 		*_new_isid = tsec->create_sid;
 	} else {
 		const struct inode_security_struct *dsec = inode_security(dir);
-		return security_transition_sid(tsec->sid, dsec->sid, tclass,
+		return security_transition_sid(current_selinux_ns, tsec->sid,
+					       dsec->sid, tclass,
 					       name, _new_isid);
 	}
 
@@ -2351,7 +2381,8 @@ static int check_nnp_nosuid(const struct linux_binprm *bprm,
 	 * i.e. SIDs that are guaranteed to only be allowed a subset
 	 * of the permissions of the current SID.
 	 */
-	rc = security_bounded_transition(old_tsec->sid, new_tsec->sid);
+	rc = security_bounded_transition(current_selinux_ns, old_tsec->sid,
+					 new_tsec->sid);
 	if (!rc)
 		return 0;
 
@@ -2403,8 +2434,8 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 			return rc;
 	} else {
 		/* Check for a default transition on this program. */
-		rc = security_transition_sid(old_tsec->sid, isec->sid,
-					     SECCLASS_PROCESS, NULL,
+		rc = security_transition_sid(current_selinux_ns, old_tsec->sid,
+					     isec->sid, SECCLASS_PROCESS, NULL,
 					     &new_tsec->sid);
 		if (rc)
 			return rc;
@@ -2762,7 +2793,9 @@ static int selinux_sb_remount(struct super_block *sb, void *data)
 
 		if (flags[i] == SBLABEL_MNT)
 			continue;
-		rc = security_context_str_to_sid(mount_options[i], &sid, GFP_KERNEL);
+		rc = security_context_str_to_sid(current_selinux_ns,
+						 mount_options[i], &sid,
+						 GFP_KERNEL);
 		if (rc) {
 			printk(KERN_WARNING "SELinux: security_context_str_to_sid"
 			       "(%s) failed for (dev %s, type %s) errno=%d\n",
@@ -2887,7 +2920,8 @@ static int selinux_dentry_init_security(struct dentry *dentry, int mode,
 	if (rc)
 		return rc;
 
-	return security_sid_to_context(newsid, (char **)ctx, ctxlen);
+	return security_sid_to_context(current_selinux_ns, newsid, (char **)ctx,
+				       ctxlen);
 }
 
 static int selinux_dentry_create_files_as(struct dentry *dentry, int mode,
@@ -2949,7 +2983,8 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 		*name = XATTR_SELINUX_SUFFIX;
 
 	if (value && len) {
-		rc = security_sid_to_context_force(newsid, &context, &clen);
+		rc = security_sid_to_context_force(current_selinux_ns, newsid,
+						   &context, &clen);
 		if (rc)
 			return rc;
 		*value = context;
@@ -3186,7 +3221,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (rc)
 		return rc;
 
-	rc = security_context_to_sid(value, size, &newsid, GFP_KERNEL);
+	rc = security_context_to_sid(current_selinux_ns, value, size, &newsid,
+				     GFP_KERNEL);
 	if (rc == -EINVAL) {
 		if (!has_cap_mac_admin(true)) {
 			struct audit_buffer *ab;
@@ -3212,7 +3248,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 
 			return rc;
 		}
-		rc = security_context_to_sid_force(value, size, &newsid);
+		rc = security_context_to_sid_force(current_selinux_ns, value,
+						   size, &newsid);
 	}
 	if (rc)
 		return rc;
@@ -3222,8 +3259,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (rc)
 		return rc;
 
-	rc = security_validate_transition(isec->sid, newsid, sid,
-					  isec->sclass);
+	rc = security_validate_transition(current_selinux_ns, isec->sid, newsid,
+					  sid, isec->sclass);
 	if (rc)
 		return rc;
 
@@ -3248,7 +3285,8 @@ static void selinux_inode_post_setxattr(struct dentry *dentry, const char *name,
 		return;
 	}
 
-	rc = security_context_to_sid_force(value, size, &newsid);
+	rc = security_context_to_sid_force(current_selinux_ns, value, size,
+					   &newsid);
 	if (rc) {
 		printk(KERN_ERR "SELinux:  unable to map context to SID"
 		       "for (%s, %lu), rc=%d\n",
@@ -3316,10 +3354,12 @@ static int selinux_inode_getsecurity(struct inode *inode, const char *name, void
 	 */
 	isec = inode_security(inode);
 	if (has_cap_mac_admin(false))
-		error = security_sid_to_context_force(isec->sid, &context,
+		error = security_sid_to_context_force(current_selinux_ns,
+						      isec->sid, &context,
 						      &size);
 	else
-		error = security_sid_to_context(isec->sid, &context, &size);
+		error = security_sid_to_context(current_selinux_ns, isec->sid,
+						&context, &size);
 	if (error)
 		return error;
 	error = size;
@@ -3345,7 +3385,8 @@ static int selinux_inode_setsecurity(struct inode *inode, const char *name,
 	if (!value || !size)
 		return -EACCES;
 
-	rc = security_context_to_sid(value, size, &newsid, GFP_KERNEL);
+	rc = security_context_to_sid(current_selinux_ns, value, size, &newsid,
+				     GFP_KERNEL);
 	if (rc)
 		return rc;
 
@@ -4279,7 +4320,8 @@ static int selinux_skb_peerlbl_sid(struct sk_buff *skb, u16 family, u32 *sid)
 	if (unlikely(err))
 		return -EACCES;
 
-	err = security_net_peersid_resolve(nlbl_sid, nlbl_type, xfrm_sid, sid);
+	err = security_net_peersid_resolve(current_selinux_ns, nlbl_sid,
+					   nlbl_type, xfrm_sid, sid);
 	if (unlikely(err)) {
 		printk(KERN_WARNING
 		       "SELinux: failure in selinux_skb_peerlbl_sid(),"
@@ -4307,7 +4349,8 @@ static int selinux_conn_sid(u32 sk_sid, u32 skb_sid, u32 *conn_sid)
 	int err = 0;
 
 	if (skb_sid != SECSID_NULL)
-		err = security_sid_mls_copy(sk_sid, skb_sid, conn_sid);
+		err = security_sid_mls_copy(current_selinux_ns, sk_sid, skb_sid,
+					    conn_sid);
 	else
 		*conn_sid = sk_sid;
 
@@ -4324,8 +4367,8 @@ static int socket_sockcreate_sid(const struct task_security_struct *tsec,
 		return 0;
 	}
 
-	return security_transition_sid(tsec->sid, tsec->sid, secclass, NULL,
-				       socksid);
+	return security_transition_sid(current_selinux_ns, tsec->sid, tsec->sid,
+				       secclass, NULL, socksid);
 }
 
 static int sock_has_perm(struct sock *sk, u32 perms)
@@ -4660,8 +4703,8 @@ static int selinux_socket_unix_stream_connect(struct sock *sock,
 
 	/* server child socket */
 	sksec_new->peer_sid = sksec_sock->sid;
-	err = security_sid_mls_copy(sksec_other->sid, sksec_sock->sid,
-				    &sksec_new->sid);
+	err = security_sid_mls_copy(current_selinux_ns, sksec_other->sid,
+				    sksec_sock->sid, &sksec_new->sid);
 	if (err)
 		return err;
 
@@ -4827,7 +4870,8 @@ static int selinux_socket_getpeersec_stream(struct socket *sock, char __user *op
 	if (peer_sid == SECSID_NULL)
 		return -ENOPROTOOPT;
 
-	err = security_sid_to_context(peer_sid, &scontext, &scontext_len);
+	err = security_sid_to_context(current_selinux_ns, peer_sid, &scontext,
+				      &scontext_len);
 	if (err)
 		return err;
 
@@ -5112,7 +5156,8 @@ static int selinux_nlmsg_perm(struct sock *sk, struct sk_buff *skb)
 			       sk->sk_protocol, nlh->nlmsg_type,
 			       secclass_map[sksec->sclass - 1].name,
 			       task_pid_nr(current), current->comm);
-			if (!selinux_enforcing || security_get_allow_unknown())
+			if (!selinux_enforcing ||
+			    security_get_allow_unknown(current_selinux_ns))
 				err = 0;
 		}
 
@@ -5617,8 +5662,8 @@ static int selinux_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg,
 		 * Compute new sid based on current process and
 		 * message queue this message will be stored in
 		 */
-		rc = security_transition_sid(sid, isec->sid, SECCLASS_MSG,
-					     NULL, &msec->sid);
+		rc = security_transition_sid(current_selinux_ns, sid, isec->sid,
+					     SECCLASS_MSG, NULL, &msec->sid);
 		if (rc)
 			return rc;
 	}
@@ -5927,7 +5972,7 @@ static int selinux_getprocattr(struct task_struct *p,
 	if (!sid)
 		return 0;
 
-	error = security_sid_to_context(sid, value, &len);
+	error = security_sid_to_context(current_selinux_ns, sid, value, &len);
 	if (error)
 		return error;
 	return len;
@@ -5974,7 +6019,8 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 			str[size-1] = 0;
 			size--;
 		}
-		error = security_context_to_sid(value, size, &sid, GFP_KERNEL);
+		error = security_context_to_sid(current_selinux_ns, value, size,
+						&sid, GFP_KERNEL);
 		if (error == -EINVAL && !strcmp(name, "fscreate")) {
 			if (!has_cap_mac_admin(true)) {
 				struct audit_buffer *ab;
@@ -5993,8 +6039,9 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 
 				return error;
 			}
-			error = security_context_to_sid_force(value, size,
-							      &sid);
+			error = security_context_to_sid_force(
+						      current_selinux_ns,
+						      value, size, &sid);
 		}
 		if (error)
 			return error;
@@ -6031,7 +6078,8 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 		/* Only allow single threaded processes to change context */
 		error = -EPERM;
 		if (!current_is_single_threaded()) {
-			error = security_bounded_transition(tsec->sid, sid);
+			error = security_bounded_transition(current_selinux_ns,
+							    tsec->sid, sid);
 			if (error)
 				goto abort_change;
 		}
@@ -6073,12 +6121,14 @@ static int selinux_ismaclabel(const char *name)
 
 static int selinux_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
 {
-	return security_sid_to_context(secid, secdata, seclen);
+	return security_sid_to_context(current_selinux_ns, secid,
+				       secdata, seclen);
 }
 
 static int selinux_secctx_to_secid(const char *secdata, u32 seclen, u32 *secid)
 {
-	return security_context_to_sid(secdata, seclen, secid, GFP_KERNEL);
+	return security_context_to_sid(current_selinux_ns, secdata, seclen,
+				       secid, GFP_KERNEL);
 }
 
 static void selinux_release_secctx(char *secdata, u32 seclen)
@@ -6180,7 +6230,8 @@ static int selinux_key_getsecurity(struct key *key, char **_buffer)
 	unsigned len;
 	int rc;
 
-	rc = security_sid_to_context(ksec->sid, &context, &len);
+	rc = security_sid_to_context(current_selinux_ns, ksec->sid,
+				     &context, &len);
 	if (!rc)
 		rc = len;
 	*_buffer = context;
@@ -6219,7 +6270,8 @@ static int selinux_ib_endport_manage_subnet(void *ib_sec, const char *dev_name,
 	struct ib_security_struct *sec = ib_sec;
 	struct lsm_ibendport_audit ibendport;
 
-	err = security_ib_endport_sid(dev_name, port_num, &sid);
+	err = security_ib_endport_sid(current_selinux_ns, dev_name, port_num,
+				      &sid);
 
 	if (err)
 		return err;
@@ -6473,6 +6525,52 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 #endif
 };
 
+static void selinux_ns_free(struct work_struct *work);
+
+int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
+{
+	struct selinux_ns *newns;
+	int rc;
+
+	newns = kzalloc(sizeof(*newns), GFP_KERNEL);
+	if (!newns)
+		return -ENOMEM;
+
+	refcount_set(&newns->count, 1);
+	INIT_WORK(&newns->work, selinux_ns_free);
+
+	rc = selinux_ss_create(&newns->ss);
+	if (rc)
+		goto err;
+
+	if (parent)
+		newns->parent = get_selinux_ns(parent);
+
+	*ns = newns;
+	return 0;
+err:
+	kfree(newns);
+	return rc;
+}
+
+static void selinux_ns_free(struct work_struct *work)
+{
+	struct selinux_ns *parent, *ns =
+		container_of(work, struct selinux_ns, work);
+
+	do {
+		parent = ns->parent;
+		selinux_ss_free(ns->ss);
+		kfree(ns);
+		ns = parent;
+	} while (ns && refcount_dec_and_test(&ns->count));
+}
+
+void __put_selinux_ns(struct selinux_ns *ns)
+{
+	schedule_work(&ns->work);
+}
+
 static __init int selinux_init(void)
 {
 	if (!security_module_enable("selinux")) {
@@ -6487,6 +6585,11 @@ static __init int selinux_init(void)
 
 	printk(KERN_INFO "SELinux:  Initializing.\n");
 
+	if (selinux_ns_create(NULL, &init_selinux_ns))
+		panic("SELinux: Could not create initial namespace\n");
+
+	set_ns_enforcing(init_selinux_ns, selinux_enforcing_boot);
+
 	/* Set the security state for the initial task. */
 	cred_init_security();
 
@@ -6500,6 +6603,12 @@ static __init int selinux_init(void)
 					    0, SLAB_PANIC, NULL);
 	avc_init();
 
+	avtab_cache_init();
+
+	ebitmap_cache_init();
+
+	hashtab_cache_init();
+
 	security_add_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks), "selinux");
 
 	if (avc_add_callback(selinux_netcache_avc_callback, AVC_CALLBACK_RESET))
@@ -6508,7 +6617,7 @@ static __init int selinux_init(void)
 	if (avc_add_callback(selinux_lsm_notifier_avc_callback, AVC_CALLBACK_RESET))
 		panic("SELinux: Unable to register AVC LSM notifier callback\n");
 
-	if (selinux_enforcing)
+	if (selinux_enforcing_boot)
 		printk(KERN_DEBUG "SELinux:  Starting in enforcing mode\n");
 	else
 		printk(KERN_DEBUG "SELinux:  Starting in permissive mode\n");
@@ -6629,23 +6738,32 @@ static void selinux_nf_ip_exit(void)
 #endif /* CONFIG_NETFILTER */
 
 #ifdef CONFIG_SECURITY_SELINUX_DISABLE
-static int selinux_disabled;
-
-int selinux_disable(void)
+int selinux_disable(struct selinux_ns *ns)
 {
-	if (ss_initialized) {
+	if (ns->initialized) {
 		/* Not permitted after initial policy load. */
 		return -EINVAL;
 	}
 
-	if (selinux_disabled) {
+	if (ns->disabled) {
 		/* Only do this once. */
 		return -EINVAL;
 	}
 
+	ns->disabled = 1;
+
+	/*
+	 * Disable of a non-init ns does not disable SELinux in the host.
+	 * We simply let the disable succeed, and init will then
+	 * unmount its selinuxfs instance and subsequent userspace
+	 * within the ns will interpret the absence of a selinuxfs mount
+	 * as SELinux being disabled.
+	 */
+	if (ns != init_selinux_ns)
+		return 0;
+
 	printk(KERN_INFO "SELinux:  Disabled at runtime.\n");
 
-	selinux_disabled = 1;
 	selinux_enabled = 0;
 
 	security_delete_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks));
diff --git a/security/selinux/ibpkey.c b/security/selinux/ibpkey.c
index e3614ee..1bbc636 100644
--- a/security/selinux/ibpkey.c
+++ b/security/selinux/ibpkey.c
@@ -152,7 +152,8 @@ static int sel_ib_pkey_sid_slow(u64 subnet_prefix, u16 pkey_num, u32 *sid)
 		return 0;
 	}
 
-	ret = security_ib_pkey_sid(subnet_prefix, pkey_num, sid);
+	ret = security_ib_pkey_sid(current_selinux_ns, subnet_prefix, pkey_num,
+				   sid);
 	if (ret)
 		goto out;
 
diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
index a5004e9..8fd09f7 100644
--- a/security/selinux/include/avc.h
+++ b/security/selinux/include/avc.h
@@ -19,12 +19,6 @@
 #include "av_permissions.h"
 #include "security.h"
 
-#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
-extern int selinux_enforcing;
-#else
-#define selinux_enforcing 1
-#endif
-
 /*
  * An entry in the AVC.
  */
diff --git a/security/selinux/include/avc_ss.h b/security/selinux/include/avc_ss.h
index 37d57da..7fef2fd 100644
--- a/security/selinux/include/avc_ss.h
+++ b/security/selinux/include/avc_ss.h
@@ -18,11 +18,5 @@ struct security_class_mapping {
 
 extern struct security_class_mapping secclass_map[];
 
-/*
- * The security server must be initialized before
- * any labeling or access decisions can be provided.
- */
-extern int ss_initialized;
-
 #endif /* _SELINUX_AVC_SS_H_ */
 
diff --git a/security/selinux/include/conditional.h b/security/selinux/include/conditional.h
index ff4fddc..13ffb38 100644
--- a/security/selinux/include/conditional.h
+++ b/security/selinux/include/conditional.h
@@ -13,10 +13,15 @@
 #ifndef _SELINUX_CONDITIONAL_H_
 #define _SELINUX_CONDITIONAL_H_
 
-int security_get_bools(int *len, char ***names, int **values);
+#include "security.h"
 
-int security_set_bools(int len, int *values);
+int security_get_bools(struct selinux_ns *ns,
+		       int *len, char ***names, int **values);
 
-int security_get_bool_value(int index);
+int security_set_bools(struct selinux_ns *ns,
+		       int len, int *values);
+
+int security_get_bool_value(struct selinux_ns *ns,
+			    int index);
 
 #endif
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 1649cd1..42d2dbb 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -150,6 +150,4 @@ struct pkey_security_struct {
 	u32	sid;	/* SID of pkey */
 };
 
-extern unsigned int selinux_checkreqprot;
-
 #endif /* _SELINUX_OBJSEC_H_ */
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 28dfb2f..b70d1dd 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -12,6 +12,8 @@
 #include <linux/dcache.h>
 #include <linux/magic.h>
 #include <linux/types.h>
+#include <linux/refcount.h>
+#include <linux/workqueue.h>
 #include "flask.h"
 
 #define SECSID_NULL			0x00000000 /* unspecified SID */
@@ -80,13 +82,6 @@ enum {
 
 extern char *selinux_policycap_names[__POLICYDB_CAPABILITY_MAX];
 
-extern int selinux_policycap_netpeer;
-extern int selinux_policycap_openperm;
-extern int selinux_policycap_extsockclass;
-extern int selinux_policycap_alwaysnetwork;
-extern int selinux_policycap_cgroupseclabel;
-extern int selinux_policycap_nnp_nosuid_transition;
-
 /*
  * type_datum properties
  * available at the kernel policy version >= POLICYDB_VERSION_BOUNDARY
@@ -97,13 +92,80 @@ extern int selinux_policycap_nnp_nosuid_transition;
 /* limitation of boundary depth  */
 #define POLICYDB_BOUNDS_MAXDEPTH	4
 
-int security_mls_enabled(void);
+struct selinux_ss;
+
+struct selinux_ns {
+	refcount_t count;
+	struct work_struct work;
+	bool disabled;
+#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
+	bool enforcing;
+#endif
+	bool checkreqprot;
+	bool initialized;
+	bool policycap[__POLICYDB_CAPABILITY_MAX];
+	struct selinux_ss *ss;
+	struct selinux_ns *parent;
+};
+
+int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns);
+void __put_selinux_ns(struct selinux_ns *ns);
+
+int selinux_ss_create(struct selinux_ss **ss);
+void selinux_ss_free(struct selinux_ss *ss);
 
-int security_load_policy(void *data, size_t len);
-int security_read_policy(void **data, size_t *len);
-size_t security_policydb_len(void);
+static inline void put_selinux_ns(struct selinux_ns *ns)
+{
+	if (ns && refcount_dec_and_test(&ns->count))
+		__put_selinux_ns(ns);
+}
 
-int security_policycap_supported(unsigned int req_cap);
+static inline struct selinux_ns *get_selinux_ns(struct selinux_ns *ns)
+{
+	refcount_inc(&ns->count);
+	return ns;
+}
+
+extern struct selinux_ns *init_selinux_ns;
+
+#define current_selinux_ns (init_selinux_ns)
+
+#define ss_initialized (current_selinux_ns->initialized)
+
+#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
+#define selinux_enforcing (current_selinux_ns->enforcing)
+#define ns_enforcing(ns) ((ns)->enforcing)
+#define set_ns_enforcing(ns, value) ((ns)->enforcing = value)
+#else
+#define selinux_enforcing 1
+#define ns_enforcing(ns) 1
+#define set_ns_enforcing(ns, value)
+#endif
+
+#define selinux_checkreqprot (current_selinux_ns->checkreqprot)
+
+#define selinux_policycap_netpeer \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_NETPEER])
+#define selinux_policycap_openperm \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_OPENPERM])
+#define selinux_policycap_extsockclass \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_EXTSOCKCLASS])
+#define selinux_policycap_alwaysnetwork \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_ALWAYSNETWORK])
+#define selinux_policycap_cgroupseclabel \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_CGROUPSECLABEL])
+#define selinux_policycap_nnp_nosuid_transition \
+	(current_selinux_ns->policycap[POLICYDB_CAPABILITY_NNP_NOSUID_TRANSITION])
+
+int security_mls_enabled(struct selinux_ns *ns);
+int security_load_policy(struct selinux_ns *ns,
+			 void *data, size_t len);
+int security_read_policy(struct selinux_ns *ns,
+			 void **data, size_t *len);
+size_t security_policydb_len(struct selinux_ns *ns);
+
+int security_policycap_supported(struct selinux_ns *ns,
+				 unsigned int req_cap);
 
 #define SEL_VEC_MAX 32
 struct av_decision {
@@ -140,76 +202,100 @@ struct extended_perms {
 /* definitions of av_decision.flags */
 #define AVD_FLAGS_PERMISSIVE	0x0001
 
-void security_compute_av(u32 ssid, u32 tsid,
+void security_compute_av(struct selinux_ns *ns,
+			 u32 ssid, u32 tsid,
 			 u16 tclass, struct av_decision *avd,
 			 struct extended_perms *xperms);
 
-void security_compute_xperms_decision(u32 ssid, u32 tsid, u16 tclass,
-			 u8 driver, struct extended_perms_decision *xpermd);
+void security_compute_xperms_decision(struct selinux_ns *ns,
+				      u32 ssid, u32 tsid, u16 tclass,
+				      u8 driver,
+				      struct extended_perms_decision *xpermd);
 
-void security_compute_av_user(u32 ssid, u32 tsid,
-			     u16 tclass, struct av_decision *avd);
+void security_compute_av_user(struct selinux_ns *ns,
+			      u32 ssid, u32 tsid,
+			      u16 tclass, struct av_decision *avd);
 
-int security_transition_sid(u32 ssid, u32 tsid, u16 tclass,
+int security_transition_sid(struct selinux_ns *ns,
+			    u32 ssid, u32 tsid, u16 tclass,
 			    const struct qstr *qstr, u32 *out_sid);
 
-int security_transition_sid_user(u32 ssid, u32 tsid, u16 tclass,
+int security_transition_sid_user(struct selinux_ns *ns,
+				 u32 ssid, u32 tsid, u16 tclass,
 				 const char *objname, u32 *out_sid);
 
-int security_member_sid(u32 ssid, u32 tsid,
-	u16 tclass, u32 *out_sid);
+int security_member_sid(struct selinux_ns *ns, u32 ssid, u32 tsid,
+			u16 tclass, u32 *out_sid);
 
-int security_change_sid(u32 ssid, u32 tsid,
-	u16 tclass, u32 *out_sid);
+int security_change_sid(struct selinux_ns *ns, u32 ssid, u32 tsid,
+			u16 tclass, u32 *out_sid);
 
-int security_sid_to_context(u32 sid, char **scontext,
-	u32 *scontext_len);
+int security_sid_to_context(struct selinux_ns *ns, u32 sid,
+			    char **scontext, u32 *scontext_len);
 
-int security_sid_to_context_force(u32 sid, char **scontext, u32 *scontext_len);
+int security_sid_to_context_force(struct selinux_ns *ns,
+				  u32 sid, char **scontext, u32 *scontext_len);
 
-int security_context_to_sid(const char *scontext, u32 scontext_len,
+int security_context_to_sid(struct selinux_ns *ns,
+			    const char *scontext, u32 scontext_len,
 			    u32 *out_sid, gfp_t gfp);
 
-int security_context_str_to_sid(const char *scontext, u32 *out_sid, gfp_t gfp);
+int security_context_str_to_sid(struct selinux_ns *ns,
+				const char *scontext, u32 *out_sid, gfp_t gfp);
 
-int security_context_to_sid_default(const char *scontext, u32 scontext_len,
+int security_context_to_sid_default(struct selinux_ns *ns,
+				    const char *scontext, u32 scontext_len,
 				    u32 *out_sid, u32 def_sid, gfp_t gfp_flags);
 
-int security_context_to_sid_force(const char *scontext, u32 scontext_len,
+int security_context_to_sid_force(struct selinux_ns *ns,
+				  const char *scontext, u32 scontext_len,
 				  u32 *sid);
 
-int security_get_user_sids(u32 callsid, char *username,
+int security_get_user_sids(struct selinux_ns *ns,
+			   u32 callsid, char *username,
 			   u32 **sids, u32 *nel);
 
-int security_port_sid(u8 protocol, u16 port, u32 *out_sid);
+int security_port_sid(struct selinux_ns *ns,
+		      u8 protocol, u16 port, u32 *out_sid);
 
-int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid);
+int security_ib_pkey_sid(struct selinux_ns *ns,
+			 u64 subnet_prefix, u16 pkey_num, u32 *out_sid);
 
-int security_ib_endport_sid(const char *dev_name, u8 port_num, u32 *out_sid);
+int security_ib_endport_sid(struct selinux_ns *ns,
+			    const char *dev_name, u8 port_num, u32 *out_sid);
 
-int security_netif_sid(char *name, u32 *if_sid);
+int security_netif_sid(struct selinux_ns *ns,
+		       char *name, u32 *if_sid);
 
-int security_node_sid(u16 domain, void *addr, u32 addrlen,
-	u32 *out_sid);
+int security_node_sid(struct selinux_ns *ns,
+		      u16 domain, void *addr, u32 addrlen,
+		      u32 *out_sid);
 
-int security_validate_transition(u32 oldsid, u32 newsid, u32 tasksid,
+int security_validate_transition(struct selinux_ns *ns,
+				 u32 oldsid, u32 newsid, u32 tasksid,
 				 u16 tclass);
 
-int security_validate_transition_user(u32 oldsid, u32 newsid, u32 tasksid,
+int security_validate_transition_user(struct selinux_ns *ns,
+				      u32 oldsid, u32 newsid, u32 tasksid,
 				      u16 tclass);
 
-int security_bounded_transition(u32 oldsid, u32 newsid);
+int security_bounded_transition(struct selinux_ns *ns,
+				u32 oldsid, u32 newsid);
 
-int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid);
+int security_sid_mls_copy(struct selinux_ns *ns,
+			  u32 sid, u32 mls_sid, u32 *new_sid);
 
-int security_net_peersid_resolve(u32 nlbl_sid, u32 nlbl_type,
+int security_net_peersid_resolve(struct selinux_ns *ns,
+				 u32 nlbl_sid, u32 nlbl_type,
 				 u32 xfrm_sid,
 				 u32 *peer_sid);
 
-int security_get_classes(char ***classes, int *nclasses);
-int security_get_permissions(char *class, char ***perms, int *nperms);
-int security_get_reject_unknown(void);
-int security_get_allow_unknown(void);
+int security_get_classes(struct selinux_ns *ns,
+			 char ***classes, int *nclasses);
+int security_get_permissions(struct selinux_ns *ns,
+			     char *class, char ***perms, int *nperms);
+int security_get_reject_unknown(struct selinux_ns *ns);
+int security_get_allow_unknown(struct selinux_ns *ns);
 
 #define SECURITY_FS_USE_XATTR		1 /* use xattr */
 #define SECURITY_FS_USE_TRANS		2 /* use transition SIDs, e.g. devpts/tmpfs */
@@ -220,27 +306,31 @@ int security_get_allow_unknown(void);
 #define SECURITY_FS_USE_NATIVE		7 /* use native label support */
 #define SECURITY_FS_USE_MAX		7 /* Highest SECURITY_FS_USE_XXX */
 
-int security_fs_use(struct super_block *sb);
+int security_fs_use(struct selinux_ns *ns, struct super_block *sb);
 
-int security_genfs_sid(const char *fstype, char *name, u16 sclass,
-	u32 *sid);
+int security_genfs_sid(struct selinux_ns *ns,
+		       const char *fstype, char *name, u16 sclass,
+		       u32 *sid);
 
 #ifdef CONFIG_NETLABEL
-int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
+int security_netlbl_secattr_to_sid(struct selinux_ns *ns,
+				   struct netlbl_lsm_secattr *secattr,
 				   u32 *sid);
 
-int security_netlbl_sid_to_secattr(u32 sid,
+int security_netlbl_sid_to_secattr(struct selinux_ns *ns,
+				   u32 sid,
 				   struct netlbl_lsm_secattr *secattr);
 #else
-static inline int security_netlbl_secattr_to_sid(
+static inline int security_netlbl_secattr_to_sid(struct selinux_ns *ns,
 					    struct netlbl_lsm_secattr *secattr,
 					    u32 *sid)
 {
 	return -EIDRM;
 }
 
-static inline int security_netlbl_sid_to_secattr(u32 sid,
-					   struct netlbl_lsm_secattr *secattr)
+static inline int security_netlbl_sid_to_secattr(struct selinux_ns *ns,
+					 u32 sid,
+					 struct netlbl_lsm_secattr *secattr)
 {
 	return -ENOENT;
 }
@@ -251,7 +341,7 @@ const char *security_get_initial_sid_context(u32 sid);
 /*
  * status notifier using mmap interface
  */
-extern struct page *selinux_kernel_status_page(void);
+extern struct page *selinux_kernel_status_page(struct selinux_ns *ns);
 
 #define SELINUX_KERNEL_STATUS_VERSION	1
 struct selinux_kernel_status {
@@ -265,10 +355,12 @@ struct selinux_kernel_status {
 	 */
 } __packed;
 
-extern void selinux_status_update_setenforce(int enforcing);
-extern void selinux_status_update_policyload(int seqno);
+extern void selinux_status_update_setenforce(struct selinux_ns *ns,
+					     int enforcing);
+extern void selinux_status_update_policyload(struct selinux_ns *ns,
+					     int seqno);
 extern void selinux_complete_init(void);
-extern int selinux_disable(void);
+extern int selinux_disable(struct selinux_ns *ns);
 extern void exit_sel_fs(void);
 extern struct path selinux_null;
 extern struct vfsmount *selinuxfs_mount;
@@ -276,5 +368,8 @@ extern void selnl_notify_setenforce(int val);
 extern void selnl_notify_policyload(u32 seqno);
 extern int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm);
 
-#endif /* _SELINUX_SECURITY_H_ */
+extern void avtab_cache_init(void);
+extern void ebitmap_cache_init(void);
+extern void hashtab_cache_init(void);
 
+#endif /* _SELINUX_SECURITY_H_ */
diff --git a/security/selinux/netif.c b/security/selinux/netif.c
index e607b44..11a11e8 100644
--- a/security/selinux/netif.c
+++ b/security/selinux/netif.c
@@ -163,7 +163,7 @@ static int sel_netif_sid_slow(struct net *ns, int ifindex, u32 *sid)
 		ret = -ENOMEM;
 		goto out;
 	}
-	ret = security_netif_sid(dev->name, &new->nsec.sid);
+	ret = security_netif_sid(current_selinux_ns, dev->name, &new->nsec.sid);
 	if (ret != 0)
 		goto out;
 	new->nsec.ns = ns;
diff --git a/security/selinux/netlabel.c b/security/selinux/netlabel.c
index aaba667..b75ceaa 100644
--- a/security/selinux/netlabel.c
+++ b/security/selinux/netlabel.c
@@ -60,7 +60,7 @@ static int selinux_netlbl_sidlookup_cached(struct sk_buff *skb,
 {
 	int rc;
 
-	rc = security_netlbl_secattr_to_sid(secattr, sid);
+	rc = security_netlbl_secattr_to_sid(current_selinux_ns, secattr, sid);
 	if (rc == 0 &&
 	    (secattr->flags & NETLBL_SECATTR_CACHEABLE) &&
 	    (secattr->flags & NETLBL_SECATTR_CACHE))
@@ -91,7 +91,8 @@ static struct netlbl_lsm_secattr *selinux_netlbl_sock_genattr(struct sock *sk)
 	secattr = netlbl_secattr_alloc(GFP_ATOMIC);
 	if (secattr == NULL)
 		return NULL;
-	rc = security_netlbl_sid_to_secattr(sksec->sid, secattr);
+	rc = security_netlbl_sid_to_secattr(current_selinux_ns, sksec->sid,
+					    secattr);
 	if (rc != 0) {
 		netlbl_secattr_free(secattr);
 		return NULL;
@@ -257,7 +258,8 @@ int selinux_netlbl_skbuff_setsid(struct sk_buff *skb,
 	if (secattr == NULL) {
 		secattr = &secattr_storage;
 		netlbl_secattr_init(secattr);
-		rc = security_netlbl_sid_to_secattr(sid, secattr);
+		rc = security_netlbl_sid_to_secattr(current_selinux_ns, sid,
+						    secattr);
 		if (rc != 0)
 			goto skbuff_setsid_return;
 	}
@@ -290,7 +292,8 @@ int selinux_netlbl_inet_conn_request(struct request_sock *req, u16 family)
 		return 0;
 
 	netlbl_secattr_init(&secattr);
-	rc = security_netlbl_sid_to_secattr(req->secid, &secattr);
+	rc = security_netlbl_sid_to_secattr(current_selinux_ns, req->secid,
+					    &secattr);
 	if (rc != 0)
 		goto inet_conn_request_return;
 	rc = netlbl_req_setattr(req, &secattr);
diff --git a/security/selinux/netnode.c b/security/selinux/netnode.c
index da923f8..fdfd0bc 100644
--- a/security/selinux/netnode.c
+++ b/security/selinux/netnode.c
@@ -215,12 +215,12 @@ static int sel_netnode_sid_slow(void *addr, u16 family, u32 *sid)
 		goto out;
 	switch (family) {
 	case PF_INET:
-		ret = security_node_sid(PF_INET,
+		ret = security_node_sid(current_selinux_ns, PF_INET,
 					addr, sizeof(struct in_addr), sid);
 		new->nsec.addr.ipv4 = *(__be32 *)addr;
 		break;
 	case PF_INET6:
-		ret = security_node_sid(PF_INET6,
+		ret = security_node_sid(current_selinux_ns, PF_INET6,
 					addr, sizeof(struct in6_addr), sid);
 		new->nsec.addr.ipv6 = *(struct in6_addr *)addr;
 		break;
diff --git a/security/selinux/netport.c b/security/selinux/netport.c
index 3311cc3..7b7f745 100644
--- a/security/selinux/netport.c
+++ b/security/selinux/netport.c
@@ -161,7 +161,7 @@ static int sel_netport_sid_slow(u8 protocol, u16 pnum, u32 *sid)
 	new = kzalloc(sizeof(*new), GFP_ATOMIC);
 	if (new == NULL)
 		goto out;
-	ret = security_port_sid(protocol, pnum, sid);
+	ret = security_port_sid(current_selinux_ns, protocol, pnum, sid);
 	if (ret != 0)
 		goto out;
 
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 00eed84..07f2f8e 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -41,17 +41,6 @@
 #include "objsec.h"
 #include "conditional.h"
 
-unsigned int selinux_checkreqprot = CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE;
-
-static int __init checkreqprot_setup(char *str)
-{
-	unsigned long checkreqprot;
-	if (!kstrtoul(str, 0, &checkreqprot))
-		selinux_checkreqprot = checkreqprot ? 1 : 0;
-	return 1;
-}
-__setup("checkreqprot=", checkreqprot_setup);
-
 static DEFINE_MUTEX(sel_mutex);
 
 /* global data for booleans */
@@ -153,7 +142,8 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 		if (selinux_enforcing)
 			avc_ss_reset(0);
 		selnl_notify_setenforce(selinux_enforcing);
-		selinux_status_update_setenforce(selinux_enforcing);
+		selinux_status_update_setenforce(current_selinux_ns,
+						 selinux_enforcing);
 		if (!selinux_enforcing)
 			call_lsm_notifier(LSM_POLICY_CHANGE, NULL);
 	}
@@ -179,7 +169,8 @@ static ssize_t sel_read_handle_unknown(struct file *filp, char __user *buf,
 	ssize_t length;
 	ino_t ino = file_inode(filp)->i_ino;
 	int handle_unknown = (ino == SEL_REJECT_UNKNOWN) ?
-		security_get_reject_unknown() : !security_get_allow_unknown();
+		security_get_reject_unknown(current_selinux_ns) :
+		!security_get_allow_unknown(current_selinux_ns);
 
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", handle_unknown);
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
@@ -192,7 +183,7 @@ static const struct file_operations sel_handle_unknown_ops = {
 
 static int sel_open_handle_status(struct inode *inode, struct file *filp)
 {
-	struct page    *status = selinux_kernel_status_page();
+	struct page    *status = selinux_kernel_status_page(current_selinux_ns);
 
 	if (!status)
 		return -ENOMEM;
@@ -268,7 +259,7 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf,
 		goto out;
 
 	if (new_value) {
-		length = selinux_disable();
+		length = selinux_disable(current_selinux_ns);
 		if (length)
 			goto out;
 		audit_log(current->audit_context, GFP_KERNEL, AUDIT_MAC_STATUS,
@@ -322,7 +313,7 @@ static ssize_t sel_read_mls(struct file *filp, char __user *buf,
 	ssize_t length;
 
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d",
-			   security_mls_enabled());
+			   security_mls_enabled(current_selinux_ns));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
@@ -359,13 +350,13 @@ static int sel_open_policy(struct inode *inode, struct file *filp)
 	if (!plm)
 		goto err;
 
-	if (i_size_read(inode) != security_policydb_len()) {
+	if (i_size_read(inode) != security_policydb_len(current_selinux_ns)) {
 		inode_lock(inode);
-		i_size_write(inode, security_policydb_len());
+		i_size_write(inode, security_policydb_len(current_selinux_ns));
 		inode_unlock(inode);
 	}
 
-	rc = security_read_policy(&plm->data, &plm->len);
+	rc = security_read_policy(current_selinux_ns, &plm->data, &plm->len);
 	if (rc)
 		goto err;
 
@@ -500,7 +491,7 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 	if (copy_from_user(data, buf, count) != 0)
 		goto out;
 
-	length = security_load_policy(data, count);
+	length = security_load_policy(current_selinux_ns, data, count);
 	if (length) {
 		pr_warn_ratelimited("SELinux: failed to load policy\n");
 		goto out;
@@ -553,11 +544,12 @@ static ssize_t sel_write_context(struct file *file, char *buf, size_t size)
 	if (length)
 		goto out;
 
-	length = security_context_to_sid(buf, size, &sid, GFP_KERNEL);
+	length = security_context_to_sid(current_selinux_ns, buf, size,
+					 &sid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(sid, &canon, &len);
+	length = security_sid_to_context(current_selinux_ns, sid, &canon, &len);
 	if (length)
 		goto out;
 
@@ -673,19 +665,23 @@ static ssize_t sel_write_validatetrans(struct file *file,
 	if (sscanf(req, "%s %s %hu %s", oldcon, newcon, &tclass, taskcon) != 4)
 		goto out;
 
-	rc = security_context_str_to_sid(oldcon, &osid, GFP_KERNEL);
+	rc = security_context_str_to_sid(current_selinux_ns, oldcon, &osid,
+					 GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_context_str_to_sid(newcon, &nsid, GFP_KERNEL);
+	rc = security_context_str_to_sid(current_selinux_ns, newcon, &nsid,
+					 GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_context_str_to_sid(taskcon, &tsid, GFP_KERNEL);
+	rc = security_context_str_to_sid(current_selinux_ns, taskcon, &tsid,
+					 GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_validate_transition_user(osid, nsid, tsid, tclass);
+	rc = security_validate_transition_user(current_selinux_ns, osid, nsid,
+					       tsid, tclass);
 	if (!rc)
 		rc = count;
 out:
@@ -780,15 +776,17 @@ static ssize_t sel_write_access(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(scon, &ssid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(tcon, &tsid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	security_compute_av_user(ssid, tsid, tclass, &avd);
+	security_compute_av_user(current_selinux_ns, ssid, tsid, tclass, &avd);
 
 	length = scnprintf(buf, SIMPLE_TRANSACTION_LIMIT,
 			  "%x %x %x %x %u %x",
@@ -868,20 +866,23 @@ static ssize_t sel_write_create(struct file *file, char *buf, size_t size)
 		objname = namebuf;
 	}
 
-	length = security_context_str_to_sid(scon, &ssid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(tcon, &tsid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_transition_sid_user(ssid, tsid, tclass,
-					      objname, &newsid);
+	length = security_transition_sid_user(current_selinux_ns, ssid, tsid,
+					      tclass, objname, &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(newsid, &newcon, &len);
+	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
+					 &len);
 	if (length)
 		goto out;
 
@@ -931,19 +932,23 @@ static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(scon, &ssid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(tcon, &tsid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_change_sid(ssid, tsid, tclass, &newsid);
+	length = security_change_sid(current_selinux_ns, ssid, tsid, tclass,
+				     &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(newsid, &newcon, &len);
+	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
+					 &len);
 	if (length)
 		goto out;
 
@@ -989,18 +994,21 @@ static ssize_t sel_write_user(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s", con, user) != 2)
 		goto out;
 
-	length = security_context_str_to_sid(con, &sid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, con, &sid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_get_user_sids(sid, user, &sids, &nsids);
+	length = security_get_user_sids(current_selinux_ns, sid, user, &sids,
+					&nsids);
 	if (length)
 		goto out;
 
 	length = sprintf(buf, "%u", nsids) + 1;
 	ptr = buf + length;
 	for (i = 0; i < nsids; i++) {
-		rc = security_sid_to_context(sids[i], &newcon, &len);
+		rc = security_sid_to_context(current_selinux_ns, sids[i],
+					     &newcon, &len);
 		if (rc) {
 			length = rc;
 			goto out;
@@ -1051,19 +1059,23 @@ static ssize_t sel_write_member(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(scon, &ssid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(tcon, &tsid, GFP_KERNEL);
+	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
+					     GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_member_sid(ssid, tsid, tclass, &newsid);
+	length = security_member_sid(current_selinux_ns, ssid, tsid, tclass,
+				     &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(newsid, &newcon, &len);
+	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
+					 &len);
 	if (length)
 		goto out;
 
@@ -1115,7 +1127,7 @@ static ssize_t sel_read_bool(struct file *filep, char __user *buf,
 	if (!page)
 		goto out;
 
-	cur_enforcing = security_get_bool_value(index);
+	cur_enforcing = security_get_bool_value(current_selinux_ns, index);
 	if (cur_enforcing < 0) {
 		ret = cur_enforcing;
 		goto out;
@@ -1226,7 +1238,8 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 
 	length = 0;
 	if (new_value && bool_pending_values)
-		length = security_set_bools(bool_num, bool_pending_values);
+		length = security_set_bools(current_selinux_ns, bool_num,
+					    bool_pending_values);
 
 	if (!length)
 		length = count;
@@ -1279,7 +1292,7 @@ static int sel_make_bools(void)
 	if (!page)
 		goto out;
 
-	ret = security_get_bools(&num, &names, &values);
+	ret = security_get_bools(current_selinux_ns, &num, &names, &values);
 	if (ret)
 		goto out;
 
@@ -1300,7 +1313,8 @@ static int sel_make_bools(void)
 			goto out;
 
 		isec = (struct inode_security_struct *)inode->i_security;
-		ret = security_genfs_sid("selinuxfs", page, SECCLASS_FILE, &sid);
+		ret = security_genfs_sid(current_selinux_ns, "selinuxfs", page,
+					 SECCLASS_FILE, &sid);
 		if (ret) {
 			pr_warn_ratelimited("SELinux: no sid found, defaulting to security isid for %s\n",
 					   page);
@@ -1524,7 +1538,7 @@ static ssize_t sel_read_initcon(struct file *file, char __user *buf,
 	ssize_t ret;
 
 	sid = file_inode(file)->i_ino&SEL_INO_MASK;
-	ret = security_sid_to_context(sid, &con, &len);
+	ret = security_sid_to_context(current_selinux_ns, sid, &con, &len);
 	if (ret)
 		return ret;
 
@@ -1617,7 +1631,8 @@ static ssize_t sel_read_policycap(struct file *file, char __user *buf,
 	ssize_t length;
 	unsigned long i_ino = file_inode(file)->i_ino;
 
-	value = security_policycap_supported(i_ino & SEL_INO_MASK);
+	value = security_policycap_supported(current_selinux_ns,
+					     i_ino & SEL_INO_MASK);
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", value);
 
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
@@ -1634,7 +1649,8 @@ static int sel_make_perm_files(char *objclass, int classvalue,
 	int i, rc, nperms;
 	char **perms;
 
-	rc = security_get_permissions(objclass, &perms, &nperms);
+	rc = security_get_permissions(current_selinux_ns, objclass, &perms,
+				      &nperms);
 	if (rc)
 		return rc;
 
@@ -1701,7 +1717,7 @@ static int sel_make_classes(void)
 	/* delete any existing entries */
 	sel_remove_entries(class_dir);
 
-	rc = security_get_classes(&classes, &nclasses);
+	rc = security_get_classes(current_selinux_ns, &classes, &nclasses);
 	if (rc)
 		return rc;
 
diff --git a/security/selinux/ss/avtab.c b/security/selinux/ss/avtab.c
index 2c3c7d0..a2c9148 100644
--- a/security/selinux/ss/avtab.c
+++ b/security/selinux/ss/avtab.c
@@ -655,7 +655,8 @@ int avtab_write(struct policydb *p, struct avtab *a, void *fp)
 
 	return rc;
 }
-void avtab_cache_init(void)
+
+void __init avtab_cache_init(void)
 {
 	avtab_node_cachep = kmem_cache_create("avtab_node",
 					      sizeof(struct avtab_node),
@@ -664,9 +665,3 @@ void avtab_cache_init(void)
 						sizeof(struct avtab_extended_perms),
 						0, SLAB_PANIC, NULL);
 }
-
-void avtab_cache_destroy(void)
-{
-	kmem_cache_destroy(avtab_node_cachep);
-	kmem_cache_destroy(avtab_xperms_cachep);
-}
diff --git a/security/selinux/ss/avtab.h b/security/selinux/ss/avtab.h
index 725853c..0d652fa 100644
--- a/security/selinux/ss/avtab.h
+++ b/security/selinux/ss/avtab.h
@@ -114,9 +114,6 @@ struct avtab_node *avtab_search_node(struct avtab *h, struct avtab_key *key);
 
 struct avtab_node *avtab_search_node_next(struct avtab_node *node, int specified);
 
-void avtab_cache_init(void);
-void avtab_cache_destroy(void);
-
 #define MAX_AVTAB_HASH_BITS 16
 #define MAX_AVTAB_HASH_BUCKETS (1 << MAX_AVTAB_HASH_BITS)
 
diff --git a/security/selinux/ss/ebitmap.c b/security/selinux/ss/ebitmap.c
index fc28149..11dae16 100644
--- a/security/selinux/ss/ebitmap.c
+++ b/security/selinux/ss/ebitmap.c
@@ -522,14 +522,9 @@ int ebitmap_write(struct ebitmap *e, void *fp)
 	return 0;
 }
 
-void ebitmap_cache_init(void)
+void __init ebitmap_cache_init(void)
 {
 	ebitmap_node_cachep = kmem_cache_create("ebitmap_node",
 							sizeof(struct ebitmap_node),
 							0, SLAB_PANIC, NULL);
 }
-
-void ebitmap_cache_destroy(void)
-{
-	kmem_cache_destroy(ebitmap_node_cachep);
-}
diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h
index da1325d..68ebf50 100644
--- a/security/selinux/ss/ebitmap.h
+++ b/security/selinux/ss/ebitmap.h
@@ -130,9 +130,6 @@ void ebitmap_destroy(struct ebitmap *e);
 int ebitmap_read(struct ebitmap *e, void *fp);
 int ebitmap_write(struct ebitmap *e, void *fp);
 
-void ebitmap_cache_init(void);
-void ebitmap_cache_destroy(void);
-
 #ifdef CONFIG_NETLABEL
 int ebitmap_netlbl_export(struct ebitmap *ebmap,
 			  struct netlbl_lsm_catmap **catmap);
diff --git a/security/selinux/ss/hashtab.c b/security/selinux/ss/hashtab.c
index bef7577..2405051 100644
--- a/security/selinux/ss/hashtab.c
+++ b/security/selinux/ss/hashtab.c
@@ -168,14 +168,10 @@ void hashtab_stat(struct hashtab *h, struct hashtab_info *info)
 	info->slots_used = slots_used;
 	info->max_chain_len = max_chain_len;
 }
-void hashtab_cache_init(void)
+
+void __init hashtab_cache_init(void)
 {
 		hashtab_node_cachep = kmem_cache_create("hashtab_node",
 			sizeof(struct hashtab_node),
 			0, SLAB_PANIC, NULL);
 }
-
-void hashtab_cache_destroy(void)
-{
-		kmem_cache_destroy(hashtab_node_cachep);
-}
diff --git a/security/selinux/ss/hashtab.h b/security/selinux/ss/hashtab.h
index d6883d3..009fb5e 100644
--- a/security/selinux/ss/hashtab.h
+++ b/security/selinux/ss/hashtab.h
@@ -84,8 +84,4 @@ int hashtab_map(struct hashtab *h,
 /* Fill info with some hash table statistics */
 void hashtab_stat(struct hashtab *h, struct hashtab_info *info);
 
-/* Use kmem_cache for hashtab_node */
-void hashtab_cache_init(void);
-void hashtab_cache_destroy(void);
-
 #endif	/* _SS_HASHTAB_H */
diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index d9dc34f4..b76f495 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -32,20 +32,20 @@
  * Return the length in bytes for the MLS fields of the
  * security context string representation of `context'.
  */
-int mls_compute_context_len(struct context *context)
+int mls_compute_context_len(struct policydb *p, struct context *context)
 {
 	int i, l, len, head, prev;
 	char *nm;
 	struct ebitmap *e;
 	struct ebitmap_node *node;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return 0;
 
 	len = 1; /* for the beginning ":" */
 	for (l = 0; l < 2; l++) {
 		int index_sens = context->range.level[l].sens;
-		len += strlen(sym_name(&policydb, SYM_LEVELS, index_sens - 1));
+		len += strlen(sym_name(p, SYM_LEVELS, index_sens - 1));
 
 		/* categories */
 		head = -2;
@@ -55,17 +55,17 @@ int mls_compute_context_len(struct context *context)
 			if (i - prev > 1) {
 				/* one or more negative bits are skipped */
 				if (head != prev) {
-					nm = sym_name(&policydb, SYM_CATS, prev);
+					nm = sym_name(p, SYM_CATS, prev);
 					len += strlen(nm) + 1;
 				}
-				nm = sym_name(&policydb, SYM_CATS, i);
+				nm = sym_name(p, SYM_CATS, i);
 				len += strlen(nm) + 1;
 				head = i;
 			}
 			prev = i;
 		}
 		if (prev != head) {
-			nm = sym_name(&policydb, SYM_CATS, prev);
+			nm = sym_name(p, SYM_CATS, prev);
 			len += strlen(nm) + 1;
 		}
 		if (l == 0) {
@@ -85,7 +85,8 @@ int mls_compute_context_len(struct context *context)
  * the MLS fields of `context' into the string `*scontext'.
  * Update `*scontext' to point to the end of the MLS fields.
  */
-void mls_sid_to_context(struct context *context,
+void mls_sid_to_context(struct policydb *p,
+			struct context *context,
 			char **scontext)
 {
 	char *scontextp, *nm;
@@ -93,7 +94,7 @@ void mls_sid_to_context(struct context *context,
 	struct ebitmap *e;
 	struct ebitmap_node *node;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return;
 
 	scontextp = *scontext;
@@ -102,7 +103,7 @@ void mls_sid_to_context(struct context *context,
 	scontextp++;
 
 	for (l = 0; l < 2; l++) {
-		strcpy(scontextp, sym_name(&policydb, SYM_LEVELS,
+		strcpy(scontextp, sym_name(p, SYM_LEVELS,
 					   context->range.level[l].sens - 1));
 		scontextp += strlen(scontextp);
 
@@ -118,7 +119,7 @@ void mls_sid_to_context(struct context *context,
 						*scontextp++ = '.';
 					else
 						*scontextp++ = ',';
-					nm = sym_name(&policydb, SYM_CATS, prev);
+					nm = sym_name(p, SYM_CATS, prev);
 					strcpy(scontextp, nm);
 					scontextp += strlen(nm);
 				}
@@ -126,7 +127,7 @@ void mls_sid_to_context(struct context *context,
 					*scontextp++ = ':';
 				else
 					*scontextp++ = ',';
-				nm = sym_name(&policydb, SYM_CATS, i);
+				nm = sym_name(p, SYM_CATS, i);
 				strcpy(scontextp, nm);
 				scontextp += strlen(nm);
 				head = i;
@@ -139,7 +140,7 @@ void mls_sid_to_context(struct context *context,
 				*scontextp++ = '.';
 			else
 				*scontextp++ = ',';
-			nm = sym_name(&policydb, SYM_CATS, prev);
+			nm = sym_name(p, SYM_CATS, prev);
 			strcpy(scontextp, nm);
 			scontextp += strlen(nm);
 		}
@@ -374,12 +375,13 @@ int mls_context_to_sid(struct policydb *pol,
  * the string `str'.  This function will allocate temporary memory with the
  * given constraints of gfp_mask.
  */
-int mls_from_string(char *str, struct context *context, gfp_t gfp_mask)
+int mls_from_string(struct policydb *p, char *str, struct context *context,
+		    gfp_t gfp_mask)
 {
 	char *tmpstr, *freestr;
 	int rc;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return -EINVAL;
 
 	/* we need freestr because mls_context_to_sid will change
@@ -388,7 +390,7 @@ int mls_from_string(char *str, struct context *context, gfp_t gfp_mask)
 	if (!tmpstr) {
 		rc = -ENOMEM;
 	} else {
-		rc = mls_context_to_sid(&policydb, ':', &tmpstr, context,
+		rc = mls_context_to_sid(p, ':', &tmpstr, context,
 					NULL, SECSID_NULL);
 		kfree(freestr);
 	}
@@ -416,10 +418,11 @@ int mls_range_set(struct context *context,
 	return rc;
 }
 
-int mls_setup_user_range(struct context *fromcon, struct user_datum *user,
+int mls_setup_user_range(struct policydb *p,
+			 struct context *fromcon, struct user_datum *user,
 			 struct context *usercon)
 {
-	if (policydb.mls_enabled) {
+	if (p->mls_enabled) {
 		struct mls_level *fromcon_sen = &(fromcon->range.level[0]);
 		struct mls_level *fromcon_clr = &(fromcon->range.level[1]);
 		struct mls_level *user_low = &(user->range.level[0]);
@@ -469,7 +472,7 @@ int mls_convert_context(struct policydb *oldp,
 	struct ebitmap_node *node;
 	int l, i;
 
-	if (!policydb.mls_enabled)
+	if (!oldp->mls_enabled || !newp->mls_enabled)
 		return 0;
 
 	for (l = 0; l < 2; l++) {
@@ -502,7 +505,8 @@ int mls_convert_context(struct policydb *oldp,
 	return 0;
 }
 
-int mls_compute_sid(struct context *scontext,
+int mls_compute_sid(struct policydb *p,
+		    struct context *scontext,
 		    struct context *tcontext,
 		    u16 tclass,
 		    u32 specified,
@@ -514,7 +518,7 @@ int mls_compute_sid(struct context *scontext,
 	struct class_datum *cladatum;
 	int default_range = 0;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return 0;
 
 	switch (specified) {
@@ -523,12 +527,12 @@ int mls_compute_sid(struct context *scontext,
 		rtr.source_type = scontext->type;
 		rtr.target_type = tcontext->type;
 		rtr.target_class = tclass;
-		r = hashtab_search(policydb.range_tr, &rtr);
+		r = hashtab_search(p->range_tr, &rtr);
 		if (r)
 			return mls_range_set(newcontext, r);
 
-		if (tclass && tclass <= policydb.p_classes.nprim) {
-			cladatum = policydb.class_val_to_struct[tclass - 1];
+		if (tclass && tclass <= p->p_classes.nprim) {
+			cladatum = p->class_val_to_struct[tclass - 1];
 			if (cladatum)
 				default_range = cladatum->default_range;
 		}
@@ -550,7 +554,7 @@ int mls_compute_sid(struct context *scontext,
 
 		/* Fallthrough */
 	case AVTAB_CHANGE:
-		if ((tclass == policydb.process_class) || (sock == true))
+		if ((tclass == p->process_class) || (sock == true))
 			/* Use the process MLS attributes. */
 			return mls_context_cpy(newcontext, scontext);
 		else
@@ -576,10 +580,11 @@ int mls_compute_sid(struct context *scontext,
  * NetLabel MLS sensitivity level field.
  *
  */
-void mls_export_netlbl_lvl(struct context *context,
+void mls_export_netlbl_lvl(struct policydb *p,
+			   struct context *context,
 			   struct netlbl_lsm_secattr *secattr)
 {
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return;
 
 	secattr->attr.mls.lvl = context->range.level[0].sens - 1;
@@ -596,10 +601,11 @@ void mls_export_netlbl_lvl(struct context *context,
  * NetLabel MLS sensitivity level into the context.
  *
  */
-void mls_import_netlbl_lvl(struct context *context,
+void mls_import_netlbl_lvl(struct policydb *p,
+			   struct context *context,
 			   struct netlbl_lsm_secattr *secattr)
 {
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return;
 
 	context->range.level[0].sens = secattr->attr.mls.lvl + 1;
@@ -616,12 +622,13 @@ void mls_import_netlbl_lvl(struct context *context,
  * MLS category field.  Returns zero on success, negative values on failure.
  *
  */
-int mls_export_netlbl_cat(struct context *context,
+int mls_export_netlbl_cat(struct policydb *p,
+			  struct context *context,
 			  struct netlbl_lsm_secattr *secattr)
 {
 	int rc;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return 0;
 
 	rc = ebitmap_netlbl_export(&context->range.level[0].cat,
@@ -644,12 +651,13 @@ int mls_export_netlbl_cat(struct context *context,
  * negative values on failure.
  *
  */
-int mls_import_netlbl_cat(struct context *context,
+int mls_import_netlbl_cat(struct policydb *p,
+			  struct context *context,
 			  struct netlbl_lsm_secattr *secattr)
 {
 	int rc;
 
-	if (!policydb.mls_enabled)
+	if (!p->mls_enabled)
 		return 0;
 
 	rc = ebitmap_netlbl_import(&context->range.level[0].cat,
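
Note on the mls.c changes above: every hunk applies the same mechanical conversion. A function that used to read the file-scope `policydb` now takes a `struct policydb *p` argument and reads through it. The following is only a minimal userspace sketch of that conversion. The names `policydb_sketch`, `global_policydb_sketch`, `mls_prefix_len_global` and `mls_prefix_len_explicit` are hypothetical and do not appear in the patch, and the struct carries just the one field needed to show the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct policydb; only mls_enabled is modelled. */
struct policydb_sketch {
	int mls_enabled;
};

/* Before: a single global policy, so only one policy can exist at a time. */
struct policydb_sketch global_policydb_sketch;

int mls_prefix_len_global(void)
{
	if (!global_policydb_sketch.mls_enabled)
		return 0;
	return 1; /* room for the leading ':' of the MLS field */
}

/* After: the caller says which policy to consult. */
int mls_prefix_len_explicit(const struct policydb_sketch *p)
{
	assert(p != NULL);
	if (!p->mls_enabled)
		return 0;
	return 1; /* room for the leading ':' of the MLS field */
}
```

With the explicit form, two policies with different settings can be queried side by side, which is what per-namespace state needs. The global form can only ever answer for one of them.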
diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h
index 0f0a1d6..dbba63e 100644
--- a/security/selinux/ss/mls.h
+++ b/security/selinux/ss/mls.h
@@ -24,8 +24,9 @@
 #include "context.h"
 #include "policydb.h"
 
-int mls_compute_context_len(struct context *context);
-void mls_sid_to_context(struct context *context, char **scontext);
+int mls_compute_context_len(struct policydb *p, struct context *context);
+void mls_sid_to_context(struct policydb *p, struct context *context,
+			char **scontext);
 int mls_context_isvalid(struct policydb *p, struct context *c);
 int mls_range_isvalid(struct policydb *p, struct mls_range *r);
 int mls_level_isvalid(struct policydb *p, struct mls_level *l);
@@ -37,7 +38,8 @@ int mls_context_to_sid(struct policydb *p,
 		       struct sidtab *s,
 		       u32 def_sid);
 
-int mls_from_string(char *str, struct context *context, gfp_t gfp_mask);
+int mls_from_string(struct policydb *p, char *str, struct context *context,
+		    gfp_t gfp_mask);
 
 int mls_range_set(struct context *context, struct mls_range *range);
 
@@ -45,42 +47,52 @@ int mls_convert_context(struct policydb *oldp,
 			struct policydb *newp,
 			struct context *context);
 
-int mls_compute_sid(struct context *scontext,
+int mls_compute_sid(struct policydb *p,
+		    struct context *scontext,
 		    struct context *tcontext,
 		    u16 tclass,
 		    u32 specified,
 		    struct context *newcontext,
 		    bool sock);
 
-int mls_setup_user_range(struct context *fromcon, struct user_datum *user,
+int mls_setup_user_range(struct policydb *p,
+			 struct context *fromcon, struct user_datum *user,
 			 struct context *usercon);
 
 #ifdef CONFIG_NETLABEL
-void mls_export_netlbl_lvl(struct context *context,
+void mls_export_netlbl_lvl(struct policydb *p,
+			   struct context *context,
 			   struct netlbl_lsm_secattr *secattr);
-void mls_import_netlbl_lvl(struct context *context,
+void mls_import_netlbl_lvl(struct policydb *p,
+			   struct context *context,
 			   struct netlbl_lsm_secattr *secattr);
-int mls_export_netlbl_cat(struct context *context,
+int mls_export_netlbl_cat(struct policydb *p,
+			  struct context *context,
 			  struct netlbl_lsm_secattr *secattr);
-int mls_import_netlbl_cat(struct context *context,
+int mls_import_netlbl_cat(struct policydb *p,
+			  struct context *context,
 			  struct netlbl_lsm_secattr *secattr);
 #else
-static inline void mls_export_netlbl_lvl(struct context *context,
+static inline void mls_export_netlbl_lvl(struct policydb *p,
+					 struct context *context,
 					 struct netlbl_lsm_secattr *secattr)
 {
 	return;
 }
-static inline void mls_import_netlbl_lvl(struct context *context,
+static inline void mls_import_netlbl_lvl(struct policydb *p,
+					 struct context *context,
 					 struct netlbl_lsm_secattr *secattr)
 {
 	return;
 }
-static inline int mls_export_netlbl_cat(struct context *context,
+static inline int mls_export_netlbl_cat(struct policydb *p,
+					struct context *context,
 					struct netlbl_lsm_secattr *secattr)
 {
 	return -ENOMEM;
 }
-static inline int mls_import_netlbl_cat(struct context *context,
+static inline int mls_import_netlbl_cat(struct policydb *p,
+					struct context *context,
 					struct netlbl_lsm_secattr *secattr)
 {
 	return -ENOMEM;
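
Note on the mls.h changes above: the real prototypes and the `static inline` stubs for the !CONFIG_NETLABEL case have to gain the new `struct policydb *p` parameter together, otherwise callers would build in one configuration and not the other. A rough, self-contained sketch of that arrangement follows. `SKETCH_NETLABEL`, `policy_sketch`, `secattr_sketch`, `export_cat_sketch` and `caller_sketch` are made-up names for illustration only. The stub returning -ENOMEM mirrors what the header above does.

```c
#include <errno.h>

/* Hypothetical stand-ins for struct policydb and struct netlbl_lsm_secattr. */
struct policy_sketch {
	int mls_enabled;
};

struct secattr_sketch {
	int lvl;
};

#ifdef SKETCH_NETLABEL
/* Feature built in: the real implementation lives in a .c file. */
int export_cat_sketch(struct policy_sketch *p, struct secattr_sketch *secattr);
#else
/* Feature compiled out: same signature, so callers need no #ifdef. */
static inline int export_cat_sketch(struct policy_sketch *p,
				    struct secattr_sketch *secattr)
{
	(void)p;
	(void)secattr;
	return -ENOMEM;
}
#endif

/* A caller written once against the shared signature. */
int caller_sketch(struct policy_sketch *p)
{
	struct secattr_sketch secattr = { 0 };

	return export_cat_sketch(p, &secattr);
}
```

Built without `SKETCH_NETLABEL` defined, `caller_sketch()` reaches the stub and gets -ENOMEM back.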
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 33cfe5d..bc2eacd 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -80,53 +80,46 @@ char *selinux_policycap_names[__POLICYDB_CAPABILITY_MAX] = {
 	"nnp_nosuid_transition"
 };
 
-int selinux_policycap_netpeer;
-int selinux_policycap_openperm;
-int selinux_policycap_extsockclass;
-int selinux_policycap_alwaysnetwork;
-int selinux_policycap_cgroupseclabel;
-int selinux_policycap_nnp_nosuid_transition;
-
-static DEFINE_RWLOCK(policy_rwlock);
+int selinux_ss_create(struct selinux_ss **ss)
+{
+	struct selinux_ss *newss;
 
-static struct sidtab sidtab;
-struct policydb policydb;
-int ss_initialized;
+	newss = kzalloc(sizeof(*newss), GFP_KERNEL);
+	if (!newss)
+		return -ENOMEM;
+	rwlock_init(&newss->policy_rwlock);
+	mutex_init(&newss->status_lock);
+	*ss = newss;
+	return 0;
+}
 
-/*
- * The largest sequence number that has been used when
- * providing an access decision to the access vector cache.
- * The sequence number only changes when a policy change
- * occurs.
- */
-static u32 latest_granting;
+void selinux_ss_free(struct selinux_ss *ss)
+{
+	sidtab_destroy(&ss->sidtab);
+	policydb_destroy(&ss->policydb);
+	kfree(ss->map.mapping);
+	if (ss->status_page)
+		__free_page(ss->status_page);
+	kfree(ss);
+}
 
 /* Forward declaration. */
-static int context_struct_to_string(struct context *context, char **scontext,
+static int context_struct_to_string(struct policydb *policydb,
+				    struct context *context,
+				    char **scontext,
 				    u32 *scontext_len);
 
-static void context_struct_compute_av(struct context *scontext,
-					struct context *tcontext,
-					u16 tclass,
-					struct av_decision *avd,
-					struct extended_perms *xperms);
-
-struct selinux_mapping {
-	u16 value; /* policy value */
-	unsigned num_perms;
-	u32 perms[sizeof(u32) * 8];
-};
-
-static struct selinux_mapping *current_mapping;
-static u16 current_mapping_size;
+static void context_struct_compute_av(struct policydb *policydb,
+				      struct context *scontext,
+				      struct context *tcontext,
+				      u16 tclass,
+				      struct av_decision *avd,
+				      struct extended_perms *xperms);
 
 static int selinux_set_mapping(struct policydb *pol,
 			       struct security_class_mapping *map,
-			       struct selinux_mapping **out_map_p,
-			       u16 *out_map_size)
+			       struct selinux_map *out_map)
 {
-	struct selinux_mapping *out_map = NULL;
-	size_t size = sizeof(struct selinux_mapping);
 	u16 i, j;
 	unsigned k;
 	bool print_unknown_handle = false;
@@ -139,15 +132,15 @@ static int selinux_set_mapping(struct policydb *pol,
 		i++;
 
 	/* Allocate space for the class records, plus one for class zero */
-	out_map = kcalloc(++i, size, GFP_ATOMIC);
-	if (!out_map)
+	out_map->mapping = kcalloc(++i, sizeof(*out_map->mapping), GFP_ATOMIC);
+	if (!out_map->mapping)
 		return -ENOMEM;
 
 	/* Store the raw class and permission values */
 	j = 0;
 	while (map[j].name) {
 		struct security_class_mapping *p_in = map + (j++);
-		struct selinux_mapping *p_out = out_map + j;
+		struct selinux_mapping *p_out = out_map->mapping + j;
 
 		/* An empty class string skips ahead */
 		if (!strcmp(p_in->name, "")) {
@@ -194,11 +187,11 @@ static int selinux_set_mapping(struct policydb *pol,
 		printk(KERN_INFO "SELinux: the above unknown classes and permissions will be %s\n",
 		       pol->allow_unknown ? "allowed" : "denied");
 
-	*out_map_p = out_map;
-	*out_map_size = i;
+	out_map->size = i;
 	return 0;
 err:
-	kfree(out_map);
+	kfree(out_map->mapping);
+	out_map->mapping = NULL;
 	return -EINVAL;
 }
 
@@ -206,10 +199,10 @@ static int selinux_set_mapping(struct policydb *pol,
  * Get real, policy values from mapped values
  */
 
-static u16 unmap_class(u16 tclass)
+static u16 unmap_class(struct selinux_map *map, u16 tclass)
 {
-	if (tclass < current_mapping_size)
-		return current_mapping[tclass].value;
+	if (tclass < map->size)
+		return map->mapping[tclass].value;
 
 	return tclass;
 }
@@ -217,42 +210,44 @@ static u16 unmap_class(u16 tclass)
 /*
  * Get kernel value for class from its policy value
  */
-static u16 map_class(u16 pol_value)
+static u16 map_class(struct selinux_map *map, u16 pol_value)
 {
 	u16 i;
 
-	for (i = 1; i < current_mapping_size; i++) {
-		if (current_mapping[i].value == pol_value)
+	for (i = 1; i < map->size; i++) {
+		if (map->mapping[i].value == pol_value)
 			return i;
 	}
 
 	return SECCLASS_NULL;
 }
 
-static void map_decision(u16 tclass, struct av_decision *avd,
+static void map_decision(struct selinux_map *map,
+			 u16 tclass, struct av_decision *avd,
 			 int allow_unknown)
 {
-	if (tclass < current_mapping_size) {
-		unsigned i, n = current_mapping[tclass].num_perms;
+	if (tclass < map->size) {
+		struct selinux_mapping *mapping = &map->mapping[tclass];
+		unsigned int i, n = mapping->num_perms;
 		u32 result;
 
 		for (i = 0, result = 0; i < n; i++) {
-			if (avd->allowed & current_mapping[tclass].perms[i])
+			if (avd->allowed & mapping->perms[i])
 				result |= 1<<i;
-			if (allow_unknown && !current_mapping[tclass].perms[i])
+			if (allow_unknown && !mapping->perms[i])
 				result |= 1<<i;
 		}
 		avd->allowed = result;
 
 		for (i = 0, result = 0; i < n; i++)
-			if (avd->auditallow & current_mapping[tclass].perms[i])
+			if (avd->auditallow & mapping->perms[i])
 				result |= 1<<i;
 		avd->auditallow = result;
 
 		for (i = 0, result = 0; i < n; i++) {
-			if (avd->auditdeny & current_mapping[tclass].perms[i])
+			if (avd->auditdeny & mapping->perms[i])
 				result |= 1<<i;
-			if (!allow_unknown && !current_mapping[tclass].perms[i])
+			if (!allow_unknown && !mapping->perms[i])
 				result |= 1<<i;
 		}
 		/*
@@ -266,9 +261,11 @@ static void map_decision(u16 tclass, struct av_decision *avd,
 	}
 }
 
-int security_mls_enabled(void)
+int security_mls_enabled(struct selinux_ns *ns)
 {
-	return policydb.mls_enabled;
+	struct policydb *p = &ns->ss->policydb;
+
+	return p->mls_enabled;
 }
 
 /*
@@ -282,7 +279,8 @@ int security_mls_enabled(void)
  * of the process performing the transition.  All other callers of
  * constraint_expr_eval should pass in NULL for xcontext.
  */
-static int constraint_expr_eval(struct context *scontext,
+static int constraint_expr_eval(struct policydb *policydb,
+				struct context *scontext,
 				struct context *tcontext,
 				struct context *xcontext,
 				struct constraint_expr *cexpr)
@@ -326,8 +324,8 @@ static int constraint_expr_eval(struct context *scontext,
 			case CEXPR_ROLE:
 				val1 = scontext->role;
 				val2 = tcontext->role;
-				r1 = policydb.role_val_to_struct[val1 - 1];
-				r2 = policydb.role_val_to_struct[val2 - 1];
+				r1 = policydb->role_val_to_struct[val1 - 1];
+				r2 = policydb->role_val_to_struct[val2 - 1];
 				switch (e->op) {
 				case CEXPR_DOM:
 					s[++sp] = ebitmap_get_bit(&r1->dominates,
@@ -472,7 +470,8 @@ static int dump_masked_av_helper(void *k, void *d, void *args)
 	return 0;
 }
 
-static void security_dump_masked_av(struct context *scontext,
+static void security_dump_masked_av(struct policydb *policydb,
+				    struct context *scontext,
 				    struct context *tcontext,
 				    u16 tclass,
 				    u32 permissions,
@@ -492,8 +491,8 @@ static void security_dump_masked_av(struct context *scontext,
 	if (!permissions)
 		return;
 
-	tclass_name = sym_name(&policydb, SYM_CLASSES, tclass - 1);
-	tclass_dat = policydb.class_val_to_struct[tclass - 1];
+	tclass_name = sym_name(policydb, SYM_CLASSES, tclass - 1);
+	tclass_dat = policydb->class_val_to_struct[tclass - 1];
 	common_dat = tclass_dat->comdatum;
 
 	/* init permission_names */
@@ -507,11 +506,11 @@ static void security_dump_masked_av(struct context *scontext,
 		goto out;
 
 	/* get scontext/tcontext in text form */
-	if (context_struct_to_string(scontext,
+	if (context_struct_to_string(policydb, scontext,
 				     &scontext_name, &length) < 0)
 		goto out;
 
-	if (context_struct_to_string(tcontext,
+	if (context_struct_to_string(policydb, tcontext,
 				     &tcontext_name, &length) < 0)
 		goto out;
 
@@ -550,7 +549,8 @@ static void security_dump_masked_av(struct context *scontext,
  * security_boundary_permission - drops violated permissions
  * on boundary constraint.
  */
-static void type_attribute_bounds_av(struct context *scontext,
+static void type_attribute_bounds_av(struct policydb *policydb,
+				     struct context *scontext,
 				     struct context *tcontext,
 				     u16 tclass,
 				     struct av_decision *avd)
@@ -562,14 +562,14 @@ static void type_attribute_bounds_av(struct context *scontext,
 	struct type_datum *target;
 	u32 masked = 0;
 
-	source = flex_array_get_ptr(policydb.type_val_to_struct_array,
+	source = flex_array_get_ptr(policydb->type_val_to_struct_array,
 				    scontext->type - 1);
 	BUG_ON(!source);
 
 	if (!source->bounds)
 		return;
 
-	target = flex_array_get_ptr(policydb.type_val_to_struct_array,
+	target = flex_array_get_ptr(policydb->type_val_to_struct_array,
 				    tcontext->type - 1);
 	BUG_ON(!target);
 
@@ -584,7 +584,7 @@ static void type_attribute_bounds_av(struct context *scontext,
 		tcontextp = &lo_tcontext;
 	}
 
-	context_struct_compute_av(&lo_scontext,
+	context_struct_compute_av(policydb, &lo_scontext,
 				  tcontextp,
 				  tclass,
 				  &lo_avd,
@@ -599,7 +599,7 @@ static void type_attribute_bounds_av(struct context *scontext,
 	avd->allowed &= ~masked;
 
 	/* audit masked permissions */
-	security_dump_masked_av(scontext, tcontext,
+	security_dump_masked_av(policydb, scontext, tcontext,
 				tclass, masked, "bounds");
 }
 
@@ -632,11 +632,12 @@ void services_compute_xperms_drivers(
  * Compute access vectors and extended permissions based on a context
  * structure pair for the permissions in a particular class.
  */
-static void context_struct_compute_av(struct context *scontext,
-					struct context *tcontext,
-					u16 tclass,
-					struct av_decision *avd,
-					struct extended_perms *xperms)
+static void context_struct_compute_av(struct policydb *policydb,
+				      struct context *scontext,
+				      struct context *tcontext,
+				      u16 tclass,
+				      struct av_decision *avd,
+				      struct extended_perms *xperms)
 {
 	struct constraint_node *constraint;
 	struct role_allow *ra;
@@ -655,13 +656,13 @@ static void context_struct_compute_av(struct context *scontext,
 		xperms->len = 0;
 	}
 
-	if (unlikely(!tclass || tclass > policydb.p_classes.nprim)) {
+	if (unlikely(!tclass || tclass > policydb->p_classes.nprim)) {
 		if (printk_ratelimit())
 			printk(KERN_WARNING "SELinux:  Invalid class %hu\n", tclass);
 		return;
 	}
 
-	tclass_datum = policydb.class_val_to_struct[tclass - 1];
+	tclass_datum = policydb->class_val_to_struct[tclass - 1];
 
 	/*
 	 * If a specific type enforcement rule was defined for
@@ -669,15 +670,18 @@ static void context_struct_compute_av(struct context *scontext,
 	 */
 	avkey.target_class = tclass;
 	avkey.specified = AVTAB_AV | AVTAB_XPERMS;
-	sattr = flex_array_get(policydb.type_attr_map_array, scontext->type - 1);
+	sattr = flex_array_get(policydb->type_attr_map_array,
+			       scontext->type - 1);
 	BUG_ON(!sattr);
-	tattr = flex_array_get(policydb.type_attr_map_array, tcontext->type - 1);
+	tattr = flex_array_get(policydb->type_attr_map_array,
+			       tcontext->type - 1);
 	BUG_ON(!tattr);
 	ebitmap_for_each_positive_bit(sattr, snode, i) {
 		ebitmap_for_each_positive_bit(tattr, tnode, j) {
 			avkey.source_type = i + 1;
 			avkey.target_type = j + 1;
-			for (node = avtab_search_node(&policydb.te_avtab, &avkey);
+			for (node = avtab_search_node(&policydb->te_avtab,
+						      &avkey);
 			     node;
 			     node = avtab_search_node_next(node, avkey.specified)) {
 				if (node->key.specified == AVTAB_ALLOWED)
@@ -691,7 +695,7 @@ static void context_struct_compute_av(struct context *scontext,
 			}
 
 			/* Check conditional av table for additional permissions */
-			cond_compute_av(&policydb.te_cond_avtab, &avkey,
+			cond_compute_av(&policydb->te_cond_avtab, &avkey,
 					avd, xperms);
 
 		}
@@ -704,7 +708,7 @@ static void context_struct_compute_av(struct context *scontext,
 	constraint = tclass_datum->constraints;
 	while (constraint) {
 		if ((constraint->permissions & (avd->allowed)) &&
-		    !constraint_expr_eval(scontext, tcontext, NULL,
+		    !constraint_expr_eval(policydb, scontext, tcontext, NULL,
 					  constraint->expr)) {
 			avd->allowed &= ~(constraint->permissions);
 		}
@@ -716,16 +720,16 @@ static void context_struct_compute_av(struct context *scontext,
 	 * role is changing, then check the (current_role, new_role)
 	 * pair.
 	 */
-	if (tclass == policydb.process_class &&
-	    (avd->allowed & policydb.process_trans_perms) &&
+	if (tclass == policydb->process_class &&
+	    (avd->allowed & policydb->process_trans_perms) &&
 	    scontext->role != tcontext->role) {
-		for (ra = policydb.role_allow; ra; ra = ra->next) {
+		for (ra = policydb->role_allow; ra; ra = ra->next) {
 			if (scontext->role == ra->role &&
 			    tcontext->role == ra->new_role)
 				break;
 		}
 		if (!ra)
-			avd->allowed &= ~policydb.process_trans_perms;
+			avd->allowed &= ~policydb->process_trans_perms;
 	}
 
 	/*
@@ -733,41 +737,46 @@ static void context_struct_compute_av(struct context *scontext,
 	 * constraint, lazy checks have to mask any violated
 	 * permission and notice it to userspace via audit.
 	 */
-	type_attribute_bounds_av(scontext, tcontext,
+	type_attribute_bounds_av(policydb, scontext, tcontext,
 				 tclass, avd);
 }
 
-static int security_validtrans_handle_fail(struct context *ocontext,
+static int security_validtrans_handle_fail(struct selinux_ns *ns,
+					   struct context *ocontext,
 					   struct context *ncontext,
 					   struct context *tcontext,
 					   u16 tclass)
 {
+	struct policydb *p = &ns->ss->policydb;
 	char *o = NULL, *n = NULL, *t = NULL;
 	u32 olen, nlen, tlen;
 
-	if (context_struct_to_string(ocontext, &o, &olen))
+	if (context_struct_to_string(p, ocontext, &o, &olen))
 		goto out;
-	if (context_struct_to_string(ncontext, &n, &nlen))
+	if (context_struct_to_string(p, ncontext, &n, &nlen))
 		goto out;
-	if (context_struct_to_string(tcontext, &t, &tlen))
+	if (context_struct_to_string(p, tcontext, &t, &tlen))
 		goto out;
 	audit_log(current->audit_context, GFP_ATOMIC, AUDIT_SELINUX_ERR,
 		  "op=security_validate_transition seresult=denied"
 		  " oldcontext=%s newcontext=%s taskcontext=%s tclass=%s",
-		  o, n, t, sym_name(&policydb, SYM_CLASSES, tclass-1));
+		  o, n, t, sym_name(p, SYM_CLASSES, tclass-1));
 out:
 	kfree(o);
 	kfree(n);
 	kfree(t);
 
-	if (!selinux_enforcing)
+	if (!ns_enforcing(ns))
 		return 0;
 	return -EPERM;
 }
 
-static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
+static int security_compute_validatetrans(struct selinux_ns *ns,
+					  u32 oldsid, u32 newsid, u32 tasksid,
 					  u16 orig_tclass, bool user)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct context *ocontext;
 	struct context *ncontext;
 	struct context *tcontext;
@@ -776,23 +785,27 @@ static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
 	u16 tclass;
 	int rc = 0;
 
-	if (!ss_initialized)
+
+	if (!ns->initialized)
 		return 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
 	if (!user)
-		tclass = unmap_class(orig_tclass);
+		tclass = unmap_class(&ns->ss->map, orig_tclass);
 	else
 		tclass = orig_tclass;
 
-	if (!tclass || tclass > policydb.p_classes.nprim) {
+	if (!tclass || tclass > policydb->p_classes.nprim) {
 		rc = -EINVAL;
 		goto out;
 	}
-	tclass_datum = policydb.class_val_to_struct[tclass - 1];
+	tclass_datum = policydb->class_val_to_struct[tclass - 1];
 
-	ocontext = sidtab_search(&sidtab, oldsid);
+	ocontext = sidtab_search(sidtab, oldsid);
 	if (!ocontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, oldsid);
@@ -800,7 +813,7 @@ static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
 		goto out;
 	}
 
-	ncontext = sidtab_search(&sidtab, newsid);
+	ncontext = sidtab_search(sidtab, newsid);
 	if (!ncontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, newsid);
@@ -808,7 +821,7 @@ static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
 		goto out;
 	}
 
-	tcontext = sidtab_search(&sidtab, tasksid);
+	tcontext = sidtab_search(sidtab, tasksid);
 	if (!tcontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, tasksid);
@@ -818,12 +831,13 @@ static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
 
 	constraint = tclass_datum->validatetrans;
 	while (constraint) {
-		if (!constraint_expr_eval(ocontext, ncontext, tcontext,
-					  constraint->expr)) {
+		if (!constraint_expr_eval(policydb, ocontext, ncontext,
+					  tcontext, constraint->expr)) {
 			if (user)
 				rc = -EPERM;
 			else
-				rc = security_validtrans_handle_fail(ocontext,
+				rc = security_validtrans_handle_fail(ns,
+								     ocontext,
 								     ncontext,
 								     tcontext,
 								     tclass);
@@ -833,22 +847,24 @@ static int security_compute_validatetrans(u32 oldsid, u32 newsid, u32 tasksid,
 	}
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
-int security_validate_transition_user(u32 oldsid, u32 newsid, u32 tasksid,
-					u16 tclass)
+int security_validate_transition_user(struct selinux_ns *ns,
+				      u32 oldsid, u32 newsid, u32 tasksid,
+				      u16 tclass)
 {
-	return security_compute_validatetrans(oldsid, newsid, tasksid,
-						tclass, true);
+	return security_compute_validatetrans(ns, oldsid, newsid, tasksid,
+					      tclass, true);
 }
 
-int security_validate_transition(u32 oldsid, u32 newsid, u32 tasksid,
+int security_validate_transition(struct selinux_ns *ns,
+				 u32 oldsid, u32 newsid, u32 tasksid,
 				 u16 orig_tclass)
 {
-	return security_compute_validatetrans(oldsid, newsid, tasksid,
-						orig_tclass, false);
+	return security_compute_validatetrans(ns, oldsid, newsid, tasksid,
+					      orig_tclass, false);
 }
 
 /*
@@ -860,17 +876,23 @@ int security_validate_transition(u32 oldsid, u32 newsid, u32 tasksid,
  * @oldsid : current security identifier
  * @newsid : destinated security identifier
  */
-int security_bounded_transition(u32 old_sid, u32 new_sid)
+int security_bounded_transition(struct selinux_ns *ns,
+				u32 old_sid, u32 new_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct context *old_context, *new_context;
 	struct type_datum *type;
 	int index;
 	int rc;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
 	rc = -EINVAL;
-	old_context = sidtab_search(&sidtab, old_sid);
+	old_context = sidtab_search(sidtab, old_sid);
 	if (!old_context) {
 		printk(KERN_ERR "SELinux: %s: unrecognized SID %u\n",
 		       __func__, old_sid);
@@ -878,7 +900,7 @@ int security_bounded_transition(u32 old_sid, u32 new_sid)
 	}
 
 	rc = -EINVAL;
-	new_context = sidtab_search(&sidtab, new_sid);
+	new_context = sidtab_search(sidtab, new_sid);
 	if (!new_context) {
 		printk(KERN_ERR "SELinux: %s: unrecognized SID %u\n",
 		       __func__, new_sid);
@@ -892,7 +914,7 @@ int security_bounded_transition(u32 old_sid, u32 new_sid)
 
 	index = new_context->type;
 	while (true) {
-		type = flex_array_get_ptr(policydb.type_val_to_struct_array,
+		type = flex_array_get_ptr(policydb->type_val_to_struct_array,
 					  index - 1);
 		BUG_ON(!type);
 
@@ -914,9 +936,9 @@ int security_bounded_transition(u32 old_sid, u32 new_sid)
 		char *new_name = NULL;
 		u32 length;
 
-		if (!context_struct_to_string(old_context,
+		if (!context_struct_to_string(policydb, old_context,
 					      &old_name, &length) &&
-		    !context_struct_to_string(new_context,
+		    !context_struct_to_string(policydb, new_context,
 					      &new_name, &length)) {
 			audit_log(current->audit_context,
 				  GFP_ATOMIC, AUDIT_SELINUX_ERR,
@@ -929,17 +951,17 @@ int security_bounded_transition(u32 old_sid, u32 new_sid)
 		kfree(old_name);
 	}
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 
 	return rc;
 }
 
-static void avd_init(struct av_decision *avd)
+static void avd_init(struct selinux_ns *ns, struct av_decision *avd)
 {
 	avd->allowed = 0;
 	avd->auditallow = 0;
 	avd->auditdeny = 0xffffffff;
-	avd->seqno = latest_granting;
+	avd->seqno = ns->ss->latest_granting;
 	avd->flags = 0;
 }
 
@@ -997,12 +1019,15 @@ void services_compute_xperms_decision(struct extended_perms_decision *xpermd,
 	}
 }
 
-void security_compute_xperms_decision(u32 ssid,
-				u32 tsid,
-				u16 orig_tclass,
-				u8 driver,
-				struct extended_perms_decision *xpermd)
+void security_compute_xperms_decision(struct selinux_ns *ns,
+				      u32 ssid,
+				      u32 tsid,
+				      u16 orig_tclass,
+				      u8 driver,
+				      struct extended_perms_decision *xpermd)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	u16 tclass;
 	struct context *scontext, *tcontext;
 	struct avtab_key avkey;
@@ -1017,60 +1042,64 @@ void security_compute_xperms_decision(u32 ssid,
 	memset(xpermd->auditallow->p, 0, sizeof(xpermd->auditallow->p));
 	memset(xpermd->dontaudit->p, 0, sizeof(xpermd->dontaudit->p));
 
-	read_lock(&policy_rwlock);
-	if (!ss_initialized)
+	read_lock(&ns->ss->policy_rwlock);
+	if (!ns->initialized)
 		goto allow;
 
-	scontext = sidtab_search(&sidtab, ssid);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, ssid);
 		goto out;
 	}
 
-	tcontext = sidtab_search(&sidtab, tsid);
+	tcontext = sidtab_search(sidtab, tsid);
 	if (!tcontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, tsid);
 		goto out;
 	}
 
-	tclass = unmap_class(orig_tclass);
+	tclass = unmap_class(&ns->ss->map, orig_tclass);
 	if (unlikely(orig_tclass && !tclass)) {
-		if (policydb.allow_unknown)
+		if (policydb->allow_unknown)
 			goto allow;
 		goto out;
 	}
 
 
-	if (unlikely(!tclass || tclass > policydb.p_classes.nprim)) {
+	if (unlikely(!tclass || tclass > policydb->p_classes.nprim)) {
 		pr_warn_ratelimited("SELinux:  Invalid class %hu\n", tclass);
 		goto out;
 	}
 
 	avkey.target_class = tclass;
 	avkey.specified = AVTAB_XPERMS;
-	sattr = flex_array_get(policydb.type_attr_map_array,
+	sattr = flex_array_get(policydb->type_attr_map_array,
 				scontext->type - 1);
 	BUG_ON(!sattr);
-	tattr = flex_array_get(policydb.type_attr_map_array,
+	tattr = flex_array_get(policydb->type_attr_map_array,
 				tcontext->type - 1);
 	BUG_ON(!tattr);
 	ebitmap_for_each_positive_bit(sattr, snode, i) {
 		ebitmap_for_each_positive_bit(tattr, tnode, j) {
 			avkey.source_type = i + 1;
 			avkey.target_type = j + 1;
-			for (node = avtab_search_node(&policydb.te_avtab, &avkey);
+			for (node = avtab_search_node(&policydb->te_avtab,
+						      &avkey);
 			     node;
 			     node = avtab_search_node_next(node, avkey.specified))
 				services_compute_xperms_decision(xpermd, node);
 
-			cond_compute_xperms(&policydb.te_cond_avtab,
+			cond_compute_xperms(&policydb->te_cond_avtab,
 						&avkey, xpermd);
 		}
 	}
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return;
 allow:
 	memset(xpermd->allowed->p, 0xff, sizeof(xpermd->allowed->p));
@@ -1088,22 +1117,28 @@ void security_compute_xperms_decision(u32 ssid,
  * Compute a set of access vector decisions based on the
  * SID pair (@ssid, @tsid) for the permissions in @tclass.
  */
-void security_compute_av(u32 ssid,
+void security_compute_av(struct selinux_ns *ns,
+			 u32 ssid,
 			 u32 tsid,
 			 u16 orig_tclass,
 			 struct av_decision *avd,
 			 struct extended_perms *xperms)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	u16 tclass;
 	struct context *scontext = NULL, *tcontext = NULL;
 
-	read_lock(&policy_rwlock);
-	avd_init(avd);
+	read_lock(&ns->ss->policy_rwlock);
+	avd_init(ns, avd);
 	xperms->len = 0;
-	if (!ss_initialized)
+	if (!ns->initialized)
 		goto allow;
 
-	scontext = sidtab_search(&sidtab, ssid);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, ssid);
@@ -1111,45 +1146,52 @@ void security_compute_av(u32 ssid,
 	}
 
 	/* permissive domain? */
-	if (ebitmap_get_bit(&policydb.permissive_map, scontext->type))
+	if (ebitmap_get_bit(&policydb->permissive_map, scontext->type))
 		avd->flags |= AVD_FLAGS_PERMISSIVE;
 
-	tcontext = sidtab_search(&sidtab, tsid);
+	tcontext = sidtab_search(sidtab, tsid);
 	if (!tcontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, tsid);
 		goto out;
 	}
 
-	tclass = unmap_class(orig_tclass);
+	tclass = unmap_class(&ns->ss->map, orig_tclass);
 	if (unlikely(orig_tclass && !tclass)) {
-		if (policydb.allow_unknown)
+		if (policydb->allow_unknown)
 			goto allow;
 		goto out;
 	}
-	context_struct_compute_av(scontext, tcontext, tclass, avd, xperms);
-	map_decision(orig_tclass, avd, policydb.allow_unknown);
+	context_struct_compute_av(policydb, scontext, tcontext, tclass, avd,
+				  xperms);
+	map_decision(&ns->ss->map, orig_tclass, avd, policydb->allow_unknown);
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return;
 allow:
 	avd->allowed = 0xffffffff;
 	goto out;
 }
 
-void security_compute_av_user(u32 ssid,
+void security_compute_av_user(struct selinux_ns *ns,
+			      u32 ssid,
 			      u32 tsid,
 			      u16 tclass,
 			      struct av_decision *avd)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct context *scontext = NULL, *tcontext = NULL;
 
-	read_lock(&policy_rwlock);
-	avd_init(avd);
-	if (!ss_initialized)
+	read_lock(&ns->ss->policy_rwlock);
+	avd_init(ns, avd);
+	if (!ns->initialized)
 		goto allow;
 
-	scontext = sidtab_search(&sidtab, ssid);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, ssid);
@@ -1157,10 +1199,10 @@ void security_compute_av_user(u32 ssid,
 	}
 
 	/* permissive domain? */
-	if (ebitmap_get_bit(&policydb.permissive_map, scontext->type))
+	if (ebitmap_get_bit(&policydb->permissive_map, scontext->type))
 		avd->flags |= AVD_FLAGS_PERMISSIVE;
 
-	tcontext = sidtab_search(&sidtab, tsid);
+	tcontext = sidtab_search(sidtab, tsid);
 	if (!tcontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, tsid);
@@ -1168,14 +1210,15 @@ void security_compute_av_user(u32 ssid,
 	}
 
 	if (unlikely(!tclass)) {
-		if (policydb.allow_unknown)
+		if (policydb->allow_unknown)
 			goto allow;
 		goto out;
 	}
 
-	context_struct_compute_av(scontext, tcontext, tclass, avd, NULL);
+	context_struct_compute_av(policydb, scontext, tcontext, tclass, avd,
+				  NULL);
  out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return;
 allow:
 	avd->allowed = 0xffffffff;
@@ -1189,7 +1232,9 @@ void security_compute_av_user(u32 ssid,
  * to point to this string and set `*scontext_len' to
  * the length of the string.
  */
-static int context_struct_to_string(struct context *context, char **scontext, u32 *scontext_len)
+static int context_struct_to_string(struct policydb *p,
+				    struct context *context,
+				    char **scontext, u32 *scontext_len)
 {
 	char *scontextp;
 
@@ -1208,10 +1253,10 @@ static int context_struct_to_string(struct context *context, char **scontext, u3
 	}
 
 	/* Compute the size of the context. */
-	*scontext_len += strlen(sym_name(&policydb, SYM_USERS, context->user - 1)) + 1;
-	*scontext_len += strlen(sym_name(&policydb, SYM_ROLES, context->role - 1)) + 1;
-	*scontext_len += strlen(sym_name(&policydb, SYM_TYPES, context->type - 1)) + 1;
-	*scontext_len += mls_compute_context_len(context);
+	*scontext_len += strlen(sym_name(p, SYM_USERS, context->user - 1)) + 1;
+	*scontext_len += strlen(sym_name(p, SYM_ROLES, context->role - 1)) + 1;
+	*scontext_len += strlen(sym_name(p, SYM_TYPES, context->type - 1)) + 1;
+	*scontext_len += mls_compute_context_len(p, context);
 
 	if (!scontext)
 		return 0;
@@ -1226,11 +1271,11 @@ static int context_struct_to_string(struct context *context, char **scontext, u3
 	 * Copy the user name, role name and type name into the context.
 	 */
 	scontextp += sprintf(scontextp, "%s:%s:%s",
-		sym_name(&policydb, SYM_USERS, context->user - 1),
-		sym_name(&policydb, SYM_ROLES, context->role - 1),
-		sym_name(&policydb, SYM_TYPES, context->type - 1));
+		sym_name(p, SYM_USERS, context->user - 1),
+		sym_name(p, SYM_ROLES, context->role - 1),
+		sym_name(p, SYM_TYPES, context->type - 1));
 
-	mls_sid_to_context(context, &scontextp);
+	mls_sid_to_context(p, context, &scontextp);
 
 	*scontextp = 0;
 
@@ -1246,9 +1291,12 @@ const char *security_get_initial_sid_context(u32 sid)
 	return initial_sid_to_string[sid];
 }
 
-static int security_sid_to_context_core(u32 sid, char **scontext,
+static int security_sid_to_context_core(struct selinux_ns *ns,
+					u32 sid, char **scontext,
 					u32 *scontext_len, int force)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct context *context;
 	int rc = 0;
 
@@ -1256,7 +1304,7 @@ static int security_sid_to_context_core(u32 sid, char **scontext,
 		*scontext = NULL;
 	*scontext_len  = 0;
 
-	if (!ss_initialized) {
+	if (!ns->initialized) {
 		if (sid <= SECINITSID_NUM) {
 			char *scontextp;
 
@@ -1277,20 +1325,23 @@ static int security_sid_to_context_core(u32 sid, char **scontext,
 		rc = -EINVAL;
 		goto out;
 	}
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 	if (force)
-		context = sidtab_search_force(&sidtab, sid);
+		context = sidtab_search_force(sidtab, sid);
 	else
-		context = sidtab_search(&sidtab, sid);
+		context = sidtab_search(sidtab, sid);
 	if (!context) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, sid);
 		rc = -EINVAL;
 		goto out_unlock;
 	}
-	rc = context_struct_to_string(context, scontext, scontext_len);
+	rc = context_struct_to_string(policydb, context, scontext,
+				      scontext_len);
 out_unlock:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 out:
 	return rc;
 
@@ -1306,14 +1357,18 @@ static int security_sid_to_context_core(u32 sid, char **scontext,
  * into a dynamically allocated string of the correct size.  Set @scontext
  * to point to this string and set @scontext_len to the length of the string.
  */
-int security_sid_to_context(u32 sid, char **scontext, u32 *scontext_len)
+int security_sid_to_context(struct selinux_ns *ns,
+			    u32 sid, char **scontext, u32 *scontext_len)
 {
-	return security_sid_to_context_core(sid, scontext, scontext_len, 0);
+	return security_sid_to_context_core(ns, sid, scontext,
+					    scontext_len, 0);
 }
 
-int security_sid_to_context_force(u32 sid, char **scontext, u32 *scontext_len)
+int security_sid_to_context_force(struct selinux_ns *ns, u32 sid,
+				  char **scontext, u32 *scontext_len)
 {
-	return security_sid_to_context_core(sid, scontext, scontext_len, 1);
+	return security_sid_to_context_core(ns, sid, scontext,
+					    scontext_len, 1);
 }
 
 /*
@@ -1401,10 +1456,13 @@ static int string_to_context_struct(struct policydb *pol,
 	return rc;
 }
 
-static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
+static int security_context_to_sid_core(struct selinux_ns *ns,
+					const char *scontext, u32 scontext_len,
 					u32 *sid, u32 def_sid, gfp_t gfp_flags,
 					int force)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	char *scontext2, *str = NULL;
 	struct context context;
 	int rc = 0;
@@ -1413,7 +1471,7 @@ static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
 	if (!scontext_len)
 		return -EINVAL;
 
-	if (!ss_initialized) {
+	if (!ns->initialized) {
 		int i;
 
 		for (i = 1; i < SECINITSID_NUM; i++) {
@@ -1441,9 +1499,10 @@ static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
 		if (!str)
 			goto out;
 	}
-
-	read_lock(&policy_rwlock);
-	rc = string_to_context_struct(&policydb, &sidtab, scontext2,
+	read_lock(&ns->ss->policy_rwlock);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+	rc = string_to_context_struct(policydb, sidtab, scontext2,
 				      scontext_len, &context, def_sid);
 	if (rc == -EINVAL && force) {
 		context.str = str;
@@ -1451,10 +1510,10 @@ static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
 		str = NULL;
 	} else if (rc)
 		goto out_unlock;
-	rc = sidtab_context_to_sid(&sidtab, &context, sid);
+	rc = sidtab_context_to_sid(sidtab, &context, sid);
 	context_destroy(&context);
 out_unlock:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 out:
 	kfree(scontext2);
 	kfree(str);
@@ -1473,16 +1532,19 @@ static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
  * Returns -%EINVAL if the context is invalid, -%ENOMEM if insufficient
  * memory is available, or 0 on success.
  */
-int security_context_to_sid(const char *scontext, u32 scontext_len, u32 *sid,
+int security_context_to_sid(struct selinux_ns *ns,
+			    const char *scontext, u32 scontext_len, u32 *sid,
 			    gfp_t gfp)
 {
-	return security_context_to_sid_core(scontext, scontext_len,
+	return security_context_to_sid_core(ns, scontext, scontext_len,
 					    sid, SECSID_NULL, gfp, 0);
 }
 
-int security_context_str_to_sid(const char *scontext, u32 *sid, gfp_t gfp)
+int security_context_str_to_sid(struct selinux_ns *ns,
+				const char *scontext, u32 *sid, gfp_t gfp)
 {
-	return security_context_to_sid(scontext, strlen(scontext), sid, gfp);
+	return security_context_to_sid(ns, scontext, strlen(scontext),
+				       sid, gfp);
 }
 
 /**
@@ -1503,51 +1565,56 @@ int security_context_str_to_sid(const char *scontext, u32 *sid, gfp_t gfp)
  * Returns -%EINVAL if the context is invalid, -%ENOMEM if insufficient
  * memory is available, or 0 on success.
  */
-int security_context_to_sid_default(const char *scontext, u32 scontext_len,
+int security_context_to_sid_default(struct selinux_ns *ns,
+				    const char *scontext, u32 scontext_len,
 				    u32 *sid, u32 def_sid, gfp_t gfp_flags)
 {
-	return security_context_to_sid_core(scontext, scontext_len,
+	return security_context_to_sid_core(ns, scontext, scontext_len,
 					    sid, def_sid, gfp_flags, 1);
 }
 
-int security_context_to_sid_force(const char *scontext, u32 scontext_len,
+int security_context_to_sid_force(struct selinux_ns *ns,
+				  const char *scontext, u32 scontext_len,
 				  u32 *sid)
 {
-	return security_context_to_sid_core(scontext, scontext_len,
+	return security_context_to_sid_core(ns, scontext, scontext_len,
 					    sid, SECSID_NULL, GFP_KERNEL, 1);
 }
 
 static int compute_sid_handle_invalid_context(
+	struct selinux_ns *ns,
 	struct context *scontext,
 	struct context *tcontext,
 	u16 tclass,
 	struct context *newcontext)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	char *s = NULL, *t = NULL, *n = NULL;
 	u32 slen, tlen, nlen;
 
-	if (context_struct_to_string(scontext, &s, &slen))
+	if (context_struct_to_string(policydb, scontext, &s, &slen))
 		goto out;
-	if (context_struct_to_string(tcontext, &t, &tlen))
+	if (context_struct_to_string(policydb, tcontext, &t, &tlen))
 		goto out;
-	if (context_struct_to_string(newcontext, &n, &nlen))
+	if (context_struct_to_string(policydb, newcontext, &n, &nlen))
 		goto out;
 	audit_log(current->audit_context, GFP_ATOMIC, AUDIT_SELINUX_ERR,
 		  "op=security_compute_sid invalid_context=%s"
 		  " scontext=%s"
 		  " tcontext=%s"
 		  " tclass=%s",
-		  n, s, t, sym_name(&policydb, SYM_CLASSES, tclass-1));
+		  n, s, t, sym_name(policydb, SYM_CLASSES, tclass-1));
 out:
 	kfree(s);
 	kfree(t);
 	kfree(n);
-	if (!selinux_enforcing)
+	if (!ns_enforcing(ns))
 		return 0;
 	return -EACCES;
 }
 
-static void filename_compute_type(struct policydb *p, struct context *newcontext,
+static void filename_compute_type(struct policydb *policydb,
+				  struct context *newcontext,
 				  u32 stype, u32 ttype, u16 tclass,
 				  const char *objname)
 {
@@ -1559,7 +1626,7 @@ static void filename_compute_type(struct policydb *p, struct context *newcontext
 	 * like /dev or /var/run.  This bitmap will quickly skip rule searches
 	 * if the ttype does not contain any rules.
 	 */
-	if (!ebitmap_get_bit(&p->filename_trans_ttypes, ttype))
+	if (!ebitmap_get_bit(&policydb->filename_trans_ttypes, ttype))
 		return;
 
 	ft.stype = stype;
@@ -1567,12 +1634,13 @@ static void filename_compute_type(struct policydb *p, struct context *newcontext
 	ft.tclass = tclass;
 	ft.name = objname;
 
-	otype = hashtab_search(p->filename_trans, &ft);
+	otype = hashtab_search(policydb->filename_trans, &ft);
 	if (otype)
 		newcontext->type = otype->otype;
 }
 
-static int security_compute_sid(u32 ssid,
+static int security_compute_sid(struct selinux_ns *ns,
+				u32 ssid,
 				u32 tsid,
 				u16 orig_tclass,
 				u32 specified,
@@ -1580,6 +1648,8 @@ static int security_compute_sid(u32 ssid,
 				u32 *out_sid,
 				bool kern)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct class_datum *cladatum = NULL;
 	struct context *scontext = NULL, *tcontext = NULL, newcontext;
 	struct role_trans *roletr = NULL;
@@ -1590,7 +1660,7 @@ static int security_compute_sid(u32 ssid,
 	int rc = 0;
 	bool sock;
 
-	if (!ss_initialized) {
+	if (!ns->initialized) {
 		switch (orig_tclass) {
 		case SECCLASS_PROCESS: /* kernel value */
 			*out_sid = ssid;
@@ -1604,24 +1674,28 @@ static int security_compute_sid(u32 ssid,
 
 	context_init(&newcontext);
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	if (kern) {
-		tclass = unmap_class(orig_tclass);
+		tclass = unmap_class(&ns->ss->map, orig_tclass);
 		sock = security_is_socket_class(orig_tclass);
 	} else {
 		tclass = orig_tclass;
-		sock = security_is_socket_class(map_class(tclass));
+		sock = security_is_socket_class(map_class(&ns->ss->map,
+							  tclass));
 	}
 
-	scontext = sidtab_search(&sidtab, ssid);
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	scontext = sidtab_search(sidtab, ssid);
 	if (!scontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, ssid);
 		rc = -EINVAL;
 		goto out_unlock;
 	}
-	tcontext = sidtab_search(&sidtab, tsid);
+	tcontext = sidtab_search(sidtab, tsid);
 	if (!tcontext) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, tsid);
@@ -1629,8 +1703,8 @@ static int security_compute_sid(u32 ssid,
 		goto out_unlock;
 	}
 
-	if (tclass && tclass <= policydb.p_classes.nprim)
-		cladatum = policydb.class_val_to_struct[tclass - 1];
+	if (tclass && tclass <= policydb->p_classes.nprim)
+		cladatum = policydb->class_val_to_struct[tclass - 1];
 
 	/* Set the user identity. */
 	switch (specified) {
@@ -1656,7 +1730,7 @@ static int security_compute_sid(u32 ssid,
 	} else if (cladatum && cladatum->default_role == DEFAULT_TARGET) {
 		newcontext.role = tcontext->role;
 	} else {
-		if ((tclass == policydb.process_class) || (sock == true))
+		if ((tclass == policydb->process_class) || (sock == true))
 			newcontext.role = scontext->role;
 		else
 			newcontext.role = OBJECT_R_VAL;
@@ -1668,7 +1742,7 @@ static int security_compute_sid(u32 ssid,
 	} else if (cladatum && cladatum->default_type == DEFAULT_TARGET) {
 		newcontext.type = tcontext->type;
 	} else {
-		if ((tclass == policydb.process_class) || (sock == true)) {
+		if ((tclass == policydb->process_class) || (sock == true)) {
 			/* Use the type of process. */
 			newcontext.type = scontext->type;
 		} else {
@@ -1682,11 +1756,11 @@ static int security_compute_sid(u32 ssid,
 	avkey.target_type = tcontext->type;
 	avkey.target_class = tclass;
 	avkey.specified = specified;
-	avdatum = avtab_search(&policydb.te_avtab, &avkey);
+	avdatum = avtab_search(&policydb->te_avtab, &avkey);
 
 	/* If no permanent rule, also check for enabled conditional rules */
 	if (!avdatum) {
-		node = avtab_search_node(&policydb.te_cond_avtab, &avkey);
+		node = avtab_search_node(&policydb->te_cond_avtab, &avkey);
 		for (; node; node = avtab_search_node_next(node, specified)) {
 			if (node->key.specified & AVTAB_ENABLED) {
 				avdatum = &node->datum;
@@ -1702,13 +1776,14 @@ static int security_compute_sid(u32 ssid,
 
 	/* if we have a objname this is a file trans check so check those rules */
 	if (objname)
-		filename_compute_type(&policydb, &newcontext, scontext->type,
+		filename_compute_type(policydb, &newcontext, scontext->type,
 				      tcontext->type, tclass, objname);
 
 	/* Check for class-specific changes. */
 	if (specified & AVTAB_TRANSITION) {
 		/* Look for a role transition rule. */
-		for (roletr = policydb.role_tr; roletr; roletr = roletr->next) {
+		for (roletr = policydb->role_tr; roletr;
+		     roletr = roletr->next) {
 			if ((roletr->role == scontext->role) &&
 			    (roletr->type == tcontext->type) &&
 			    (roletr->tclass == tclass)) {
@@ -1721,14 +1796,14 @@ static int security_compute_sid(u32 ssid,
 
 	/* Set the MLS attributes.
 	   This is done last because it may allocate memory. */
-	rc = mls_compute_sid(scontext, tcontext, tclass, specified,
+	rc = mls_compute_sid(policydb, scontext, tcontext, tclass, specified,
 			     &newcontext, sock);
 	if (rc)
 		goto out_unlock;
 
 	/* Check the validity of the context. */
-	if (!policydb_context_isvalid(&policydb, &newcontext)) {
-		rc = compute_sid_handle_invalid_context(scontext,
+	if (!policydb_context_isvalid(policydb, &newcontext)) {
+		rc = compute_sid_handle_invalid_context(ns, scontext,
 							tcontext,
 							tclass,
 							&newcontext);
@@ -1736,9 +1811,9 @@ static int security_compute_sid(u32 ssid,
 			goto out_unlock;
 	}
 	/* Obtain the sid for the context. */
-	rc = sidtab_context_to_sid(&sidtab, &newcontext, out_sid);
+	rc = sidtab_context_to_sid(sidtab, &newcontext, out_sid);
 out_unlock:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	context_destroy(&newcontext);
 out:
 	return rc;
@@ -1757,17 +1832,21 @@ static int security_compute_sid(u32 ssid,
  * if insufficient memory is available, or %0 if the new SID was
  * computed successfully.
  */
-int security_transition_sid(u32 ssid, u32 tsid, u16 tclass,
+int security_transition_sid(struct selinux_ns *ns,
+			    u32 ssid, u32 tsid, u16 tclass,
 			    const struct qstr *qstr, u32 *out_sid)
 {
-	return security_compute_sid(ssid, tsid, tclass, AVTAB_TRANSITION,
+	return security_compute_sid(ns, ssid, tsid, tclass,
+				    AVTAB_TRANSITION,
 				    qstr ? qstr->name : NULL, out_sid, true);
 }
 
-int security_transition_sid_user(u32 ssid, u32 tsid, u16 tclass,
+int security_transition_sid_user(struct selinux_ns *ns,
+				 u32 ssid, u32 tsid, u16 tclass,
 				 const char *objname, u32 *out_sid)
 {
-	return security_compute_sid(ssid, tsid, tclass, AVTAB_TRANSITION,
+	return security_compute_sid(ns, ssid, tsid, tclass,
+				    AVTAB_TRANSITION,
 				    objname, out_sid, false);
 }
 
@@ -1784,12 +1863,14 @@ int security_transition_sid_user(u32 ssid, u32 tsid, u16 tclass,
  * if insufficient memory is available, or %0 if the SID was
  * computed successfully.
  */
-int security_member_sid(u32 ssid,
+int security_member_sid(struct selinux_ns *ns,
+			u32 ssid,
 			u32 tsid,
 			u16 tclass,
 			u32 *out_sid)
 {
-	return security_compute_sid(ssid, tsid, tclass, AVTAB_MEMBER, NULL,
+	return security_compute_sid(ns, ssid, tsid, tclass,
+				    AVTAB_MEMBER, NULL,
 				    out_sid, false);
 }
 
@@ -1806,12 +1887,14 @@ int security_member_sid(u32 ssid,
  * if insufficient memory is available, or %0 if the SID was
  * computed successfully.
  */
-int security_change_sid(u32 ssid,
+int security_change_sid(struct selinux_ns *ns,
+			u32 ssid,
 			u32 tsid,
 			u16 tclass,
 			u32 *out_sid)
 {
-	return security_compute_sid(ssid, tsid, tclass, AVTAB_CHANGE, NULL,
+	return security_compute_sid(ns, ssid, tsid, tclass,
+				    AVTAB_CHANGE, NULL,
 				    out_sid, false);
 }
 
@@ -1828,15 +1911,18 @@ static int clone_sid(u32 sid,
 		return 0;
 }
 
-static inline int convert_context_handle_invalid_context(struct context *context)
+static inline int convert_context_handle_invalid_context(
+	struct selinux_ns *ns,
+	struct context *context)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	char *s;
 	u32 len;
 
-	if (selinux_enforcing)
+	if (ns_enforcing(ns))
 		return -EINVAL;
 
-	if (!context_struct_to_string(context, &s, &len)) {
+	if (!context_struct_to_string(policydb, context, &s, &len)) {
 		printk(KERN_WARNING "SELinux:  Context %s would be invalid if enforcing\n", s);
 		kfree(s);
 	}
@@ -1844,6 +1930,7 @@ static inline int convert_context_handle_invalid_context(struct context *context
 }
 
 struct convert_context_args {
+	struct selinux_ns *ns;
 	struct policydb *oldp;
 	struct policydb *newp;
 };
@@ -1970,7 +2057,8 @@ static int convert_context(u32 key,
 
 	/* Check the validity of the new context. */
 	if (!policydb_context_isvalid(args->newp, c)) {
-		rc = convert_context_handle_invalid_context(&oldc);
+		rc = convert_context_handle_invalid_context(args->ns,
+							    &oldc);
 		if (rc)
 			goto bad;
 	}
@@ -1982,7 +2070,7 @@ static int convert_context(u32 key,
 	return rc;
 bad:
 	/* Map old representation to string and save it. */
-	rc = context_struct_to_string(&oldc, &s, &len);
+	rc = context_struct_to_string(args->oldp, &oldc, &s, &len);
 	if (rc)
 		return rc;
 	context_destroy(&oldc);
@@ -1995,39 +2083,29 @@ static int convert_context(u32 key,
 	goto out;
 }
 
-static void security_load_policycaps(void)
+static void security_load_policycaps(struct selinux_ns *ns)
 {
+	struct policydb *p = &ns->ss->policydb;
 	unsigned int i;
 	struct ebitmap_node *node;
 
-	selinux_policycap_netpeer = ebitmap_get_bit(&policydb.policycaps,
-						  POLICYDB_CAPABILITY_NETPEER);
-	selinux_policycap_openperm = ebitmap_get_bit(&policydb.policycaps,
-						  POLICYDB_CAPABILITY_OPENPERM);
-	selinux_policycap_extsockclass = ebitmap_get_bit(&policydb.policycaps,
-					  POLICYDB_CAPABILITY_EXTSOCKCLASS);
-	selinux_policycap_alwaysnetwork = ebitmap_get_bit(&policydb.policycaps,
-						  POLICYDB_CAPABILITY_ALWAYSNETWORK);
-	selinux_policycap_cgroupseclabel =
-		ebitmap_get_bit(&policydb.policycaps,
-				POLICYDB_CAPABILITY_CGROUPSECLABEL);
-	selinux_policycap_nnp_nosuid_transition =
-		ebitmap_get_bit(&policydb.policycaps,
-				POLICYDB_CAPABILITY_NNP_NOSUID_TRANSITION);
+	for (i = 0; i < ARRAY_SIZE(ns->policycap); i++)
+		ns->policycap[i] = ebitmap_get_bit(&p->policycaps, i);
 
 	for (i = 0; i < ARRAY_SIZE(selinux_policycap_names); i++)
 		pr_info("SELinux:  policy capability %s=%d\n",
 			selinux_policycap_names[i],
-			ebitmap_get_bit(&policydb.policycaps, i));
+			ebitmap_get_bit(&p->policycaps, i));
 
-	ebitmap_for_each_positive_bit(&policydb.policycaps, node, i) {
+	ebitmap_for_each_positive_bit(&p->policycaps, node, i) {
 		if (i >= ARRAY_SIZE(selinux_policycap_names))
 			pr_info("SELinux:  unknown policy capability %u\n",
 				i);
 	}
 }
 
-static int security_preserve_bools(struct policydb *p);
+static int security_preserve_bools(struct selinux_ns *ns,
+				   struct policydb *newpolicydb);
 
 /**
  * security_load_policy - Load a security policy configuration.
@@ -2039,14 +2117,16 @@ static int security_preserve_bools(struct policydb *p);
  * This function will flush the access vector cache after
  * loading the new policy.
  */
-int security_load_policy(void *data, size_t len)
+int security_load_policy(struct selinux_ns *ns, void *data, size_t len)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct policydb *oldpolicydb, *newpolicydb;
 	struct sidtab oldsidtab, newsidtab;
-	struct selinux_mapping *oldmap, *map = NULL;
+	struct selinux_mapping *oldmapping;
+	struct selinux_map newmap;
 	struct convert_context_args args;
 	u32 seqno;
-	u16 map_size;
 	int rc = 0;
 	struct policy_file file = { data, len }, *fp = &file;
 
@@ -2057,53 +2137,42 @@ int security_load_policy(void *data, size_t len)
 	}
 	newpolicydb = oldpolicydb + 1;
 
-	if (!ss_initialized) {
-		avtab_cache_init();
-		ebitmap_cache_init();
-		hashtab_cache_init();
-		rc = policydb_read(&policydb, fp);
-		if (rc) {
-			avtab_cache_destroy();
-			ebitmap_cache_destroy();
-			hashtab_cache_destroy();
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	if (!ns->initialized) {
+		rc = policydb_read(policydb, fp);
+		if (rc)
 			goto out;
-		}
 
-		policydb.len = len;
-		rc = selinux_set_mapping(&policydb, secclass_map,
-					 &current_mapping,
-					 &current_mapping_size);
+		policydb->len = len;
+		rc = selinux_set_mapping(policydb, secclass_map,
+					 &ns->ss->map);
 		if (rc) {
-			policydb_destroy(&policydb);
-			avtab_cache_destroy();
-			ebitmap_cache_destroy();
-			hashtab_cache_destroy();
+			policydb_destroy(policydb);
 			goto out;
 		}
 
-		rc = policydb_load_isids(&policydb, &sidtab);
+		rc = policydb_load_isids(policydb, sidtab);
 		if (rc) {
-			policydb_destroy(&policydb);
-			avtab_cache_destroy();
-			ebitmap_cache_destroy();
-			hashtab_cache_destroy();
+			policydb_destroy(policydb);
 			goto out;
 		}
 
-		security_load_policycaps();
-		ss_initialized = 1;
-		seqno = ++latest_granting;
+		security_load_policycaps(ns);
+		ns->initialized = 1;
+		seqno = ++ns->ss->latest_granting;
 		selinux_complete_init();
 		avc_ss_reset(seqno);
 		selnl_notify_policyload(seqno);
-		selinux_status_update_policyload(seqno);
+		selinux_status_update_policyload(ns, seqno);
 		selinux_netlbl_cache_invalidate();
 		selinux_xfrm_notify_policyload();
 		goto out;
 	}
 
 #if 0
-	sidtab_hash_eval(&sidtab, "sids");
+	sidtab_hash_eval(sidtab, "sids");
 #endif
 
 	rc = policydb_read(newpolicydb, fp);
@@ -2112,9 +2181,9 @@ int security_load_policy(void *data, size_t len)
 
 	newpolicydb->len = len;
 	/* If switching between different policy types, log MLS status */
-	if (policydb.mls_enabled && !newpolicydb->mls_enabled)
+	if (policydb->mls_enabled && !newpolicydb->mls_enabled)
 		printk(KERN_INFO "SELinux: Disabling MLS support...\n");
-	else if (!policydb.mls_enabled && newpolicydb->mls_enabled)
+	else if (!policydb->mls_enabled && newpolicydb->mls_enabled)
 		printk(KERN_INFO "SELinux: Enabling MLS support...\n");
 
 	rc = policydb_load_isids(newpolicydb, &newsidtab);
@@ -2124,20 +2193,20 @@ int security_load_policy(void *data, size_t len)
 		goto out;
 	}
 
-	rc = selinux_set_mapping(newpolicydb, secclass_map, &map, &map_size);
+	rc = selinux_set_mapping(newpolicydb, secclass_map, &newmap);
 	if (rc)
 		goto err;
 
-	rc = security_preserve_bools(newpolicydb);
+	rc = security_preserve_bools(ns, newpolicydb);
 	if (rc) {
 		printk(KERN_ERR "SELinux:  unable to preserve booleans\n");
 		goto err;
 	}
 
 	/* Clone the SID table. */
-	sidtab_shutdown(&sidtab);
+	sidtab_shutdown(sidtab);
 
-	rc = sidtab_map(&sidtab, clone_sid, &newsidtab);
+	rc = sidtab_map(sidtab, clone_sid, &newsidtab);
 	if (rc)
 		goto err;
 
@@ -2145,7 +2214,8 @@ int security_load_policy(void *data, size_t len)
 	 * Convert the internal representations of contexts
 	 * in the new SID table.
 	 */
-	args.oldp = &policydb;
+	args.ns = ns;
+	args.oldp = policydb;
 	args.newp = newpolicydb;
 	rc = sidtab_map(&newsidtab, convert_context, &args);
 	if (rc) {
@@ -2156,28 +2226,28 @@ int security_load_policy(void *data, size_t len)
 	}
 
 	/* Save the old policydb and SID table to free later. */
-	memcpy(oldpolicydb, &policydb, sizeof(policydb));
-	sidtab_set(&oldsidtab, &sidtab);
+	memcpy(oldpolicydb, policydb, sizeof(*policydb));
+	sidtab_set(&oldsidtab, sidtab);
 
 	/* Install the new policydb and SID table. */
-	write_lock_irq(&policy_rwlock);
-	memcpy(&policydb, newpolicydb, sizeof(policydb));
-	sidtab_set(&sidtab, &newsidtab);
-	security_load_policycaps();
-	oldmap = current_mapping;
-	current_mapping = map;
-	current_mapping_size = map_size;
-	seqno = ++latest_granting;
-	write_unlock_irq(&policy_rwlock);
+	write_lock_irq(&ns->ss->policy_rwlock);
+	memcpy(policydb, newpolicydb, sizeof(*policydb));
+	sidtab_set(sidtab, &newsidtab);
+	security_load_policycaps(ns);
+	oldmapping = ns->ss->map.mapping;
+	ns->ss->map.mapping = newmap.mapping;
+	ns->ss->map.size = newmap.size;
+	seqno = ++ns->ss->latest_granting;
+	write_unlock_irq(&ns->ss->policy_rwlock);
 
 	/* Free the old policydb and SID table. */
 	policydb_destroy(oldpolicydb);
 	sidtab_destroy(&oldsidtab);
-	kfree(oldmap);
+	kfree(oldmapping);
 
 	avc_ss_reset(seqno);
 	selnl_notify_policyload(seqno);
-	selinux_status_update_policyload(seqno);
+	selinux_status_update_policyload(ns, seqno);
 	selinux_netlbl_cache_invalidate();
 	selinux_xfrm_notify_policyload();
 
@@ -2185,7 +2255,7 @@ int security_load_policy(void *data, size_t len)
 	goto out;
 
 err:
-	kfree(map);
+	kfree(newmap.mapping);
 	sidtab_destroy(&newsidtab);
 	policydb_destroy(newpolicydb);
 
@@ -2194,13 +2264,14 @@ int security_load_policy(void *data, size_t len)
 	return rc;
 }
 
-size_t security_policydb_len(void)
+size_t security_policydb_len(struct selinux_ns *ns)
 {
+	struct policydb *p = &ns->ss->policydb;
 	size_t len;
 
-	read_lock(&policy_rwlock);
-	len = policydb.len;
-	read_unlock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+	len = p->len;
+	read_unlock(&ns->ss->policy_rwlock);
 
 	return len;
 }
@@ -2211,14 +2282,20 @@ size_t security_policydb_len(void)
  * @port: port number
  * @out_sid: security identifier
  */
-int security_port_sid(u8 protocol, u16 port, u32 *out_sid)
+int security_port_sid(struct selinux_ns *ns,
+		      u8 protocol, u16 port, u32 *out_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct ocontext *c;
 	int rc = 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
-	c = policydb.ocontexts[OCON_PORT];
+	c = policydb->ocontexts[OCON_PORT];
 	while (c) {
 		if (c->u.port.protocol == protocol &&
 		    c->u.port.low_port <= port &&
@@ -2229,7 +2306,7 @@ int security_port_sid(u8 protocol, u16 port, u32 *out_sid)
 
 	if (c) {
 		if (!c->sid[0]) {
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						   &c->context[0],
 						   &c->sid[0]);
 			if (rc)
@@ -2241,7 +2318,7 @@ int security_port_sid(u8 protocol, u16 port, u32 *out_sid)
 	}
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -2251,14 +2328,20 @@ int security_port_sid(u8 protocol, u16 port, u32 *out_sid)
  * @pkey_num: pkey number
  * @out_sid: security identifier
  */
-int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid)
+int security_ib_pkey_sid(struct selinux_ns *ns,
+			 u64 subnet_prefix, u16 pkey_num, u32 *out_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct ocontext *c;
 	int rc = 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
-	c = policydb.ocontexts[OCON_IBPKEY];
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	c = policydb->ocontexts[OCON_IBPKEY];
 	while (c) {
 		if (c->u.ibpkey.low_pkey <= pkey_num &&
 		    c->u.ibpkey.high_pkey >= pkey_num &&
@@ -2270,7 +2353,7 @@ int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid)
 
 	if (c) {
 		if (!c->sid[0]) {
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						   &c->context[0],
 						   &c->sid[0]);
 			if (rc)
@@ -2281,7 +2364,7 @@ int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid)
 		*out_sid = SECINITSID_UNLABELED;
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -2291,14 +2374,20 @@ int security_ib_pkey_sid(u64 subnet_prefix, u16 pkey_num, u32 *out_sid)
  * @port: port number
  * @out_sid: security identifier
  */
-int security_ib_endport_sid(const char *dev_name, u8 port_num, u32 *out_sid)
+int security_ib_endport_sid(struct selinux_ns *ns,
+			    const char *dev_name, u8 port_num, u32 *out_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct ocontext *c;
 	int rc = 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
-	c = policydb.ocontexts[OCON_IBENDPORT];
+	c = policydb->ocontexts[OCON_IBENDPORT];
 	while (c) {
 		if (c->u.ibendport.port == port_num &&
 		    !strncmp(c->u.ibendport.dev_name,
@@ -2311,7 +2400,7 @@ int security_ib_endport_sid(const char *dev_name, u8 port_num, u32 *out_sid)
 
 	if (c) {
 		if (!c->sid[0]) {
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						   &c->context[0],
 						   &c->sid[0]);
 			if (rc)
@@ -2322,7 +2411,7 @@ int security_ib_endport_sid(const char *dev_name, u8 port_num, u32 *out_sid)
 		*out_sid = SECINITSID_UNLABELED;
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -2331,14 +2420,20 @@ int security_ib_endport_sid(const char *dev_name, u8 port_num, u32 *out_sid)
  * @name: interface name
  * @if_sid: interface SID
  */
-int security_netif_sid(char *name, u32 *if_sid)
+int security_netif_sid(struct selinux_ns *ns,
+		       char *name, u32 *if_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	int rc = 0;
 	struct ocontext *c;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
-	c = policydb.ocontexts[OCON_NETIF];
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	c = policydb->ocontexts[OCON_NETIF];
 	while (c) {
 		if (strcmp(name, c->u.name) == 0)
 			break;
@@ -2347,12 +2442,12 @@ int security_netif_sid(char *name, u32 *if_sid)
 
 	if (c) {
 		if (!c->sid[0] || !c->sid[1]) {
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						  &c->context[0],
 						  &c->sid[0]);
 			if (rc)
 				goto out;
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						   &c->context[1],
 						   &c->sid[1]);
 			if (rc)
@@ -2363,7 +2458,7 @@ int security_netif_sid(char *name, u32 *if_sid)
 		*if_sid = SECINITSID_NETIF;
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -2387,15 +2482,21 @@ static int match_ipv6_addrmask(u32 *input, u32 *addr, u32 *mask)
  * @addrlen: address length in bytes
  * @out_sid: security identifier
  */
-int security_node_sid(u16 domain,
+int security_node_sid(struct selinux_ns *ns,
+		      u16 domain,
 		      void *addrp,
 		      u32 addrlen,
 		      u32 *out_sid)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	int rc;
 	struct ocontext *c;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
 	switch (domain) {
 	case AF_INET: {
@@ -2407,7 +2508,7 @@ int security_node_sid(u16 domain,
 
 		addr = *((u32 *)addrp);
 
-		c = policydb.ocontexts[OCON_NODE];
+		c = policydb->ocontexts[OCON_NODE];
 		while (c) {
 			if (c->u.node.addr == (addr & c->u.node.mask))
 				break;
@@ -2420,7 +2521,7 @@ int security_node_sid(u16 domain,
 		rc = -EINVAL;
 		if (addrlen != sizeof(u64) * 2)
 			goto out;
-		c = policydb.ocontexts[OCON_NODE6];
+		c = policydb->ocontexts[OCON_NODE6];
 		while (c) {
 			if (match_ipv6_addrmask(addrp, c->u.node6.addr,
 						c->u.node6.mask))
@@ -2437,7 +2538,7 @@ int security_node_sid(u16 domain,
 
 	if (c) {
 		if (!c->sid[0]) {
-			rc = sidtab_context_to_sid(&sidtab,
+			rc = sidtab_context_to_sid(sidtab,
 						   &c->context[0],
 						   &c->sid[0]);
 			if (rc)
@@ -2450,7 +2551,7 @@ int security_node_sid(u16 domain,
 
 	rc = 0;
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -2470,11 +2571,14 @@ int security_node_sid(u16 domain,
  * number of elements in the array.
  */
 
-int security_get_user_sids(u32 fromsid,
+int security_get_user_sids(struct selinux_ns *ns,
+			   u32 fromsid,
 			   char *username,
 			   u32 **sids,
 			   u32 *nel)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	struct context *fromcon, usercon;
 	u32 *mysids = NULL, *mysids2, sid;
 	u32 mynel = 0, maxnel = SIDS_NEL;
@@ -2486,20 +2590,23 @@ int security_get_user_sids(u32 fromsid,
 	*sids = NULL;
 	*nel = 0;
 
-	if (!ss_initialized)
+	if (!ns->initialized)
 		goto out;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
 
 	context_init(&usercon);
 
 	rc = -EINVAL;
-	fromcon = sidtab_search(&sidtab, fromsid);
+	fromcon = sidtab_search(sidtab, fromsid);
 	if (!fromcon)
 		goto out_unlock;
 
 	rc = -EINVAL;
-	user = hashtab_search(policydb.p_users.table, username);
+	user = hashtab_search(policydb->p_users.table, username);
 	if (!user)
 		goto out_unlock;
 
@@ -2511,15 +2618,16 @@ int security_get_user_sids(u32 fromsid,
 		goto out_unlock;
 
 	ebitmap_for_each_positive_bit(&user->roles, rnode, i) {
-		role = policydb.role_val_to_struct[i];
+		role = policydb->role_val_to_struct[i];
 		usercon.role = i + 1;
 		ebitmap_for_each_positive_bit(&role->types, tnode, j) {
 			usercon.type = j + 1;
 
-			if (mls_setup_user_range(fromcon, user, &usercon))
+			if (mls_setup_user_range(policydb, fromcon, user,
+						 &usercon))
 				continue;
 
-			rc = sidtab_context_to_sid(&sidtab, &usercon, &sid);
+			rc = sidtab_context_to_sid(sidtab, &usercon, &sid);
 			if (rc)
 				goto out_unlock;
 			if (mynel < maxnel) {
@@ -2539,7 +2647,7 @@ int security_get_user_sids(u32 fromsid,
 	}
 	rc = 0;
 out_unlock:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	if (rc || !mynel) {
 		kfree(mysids);
 		goto out;
@@ -2582,11 +2690,14 @@ int security_get_user_sids(u32 fromsid,
  *
  * The caller must acquire the policy_rwlock before calling this function.
  */
-static inline int __security_genfs_sid(const char *fstype,
+static inline int __security_genfs_sid(struct selinux_ns *ns,
+				       const char *fstype,
 				       char *path,
 				       u16 orig_sclass,
 				       u32 *sid)
 {
+	struct policydb *policydb = &ns->ss->policydb;
+	struct sidtab *sidtab = &ns->ss->sidtab;
 	int len;
 	u16 sclass;
 	struct genfs *genfs;
@@ -2596,10 +2707,10 @@ static inline int __security_genfs_sid(const char *fstype,
 	while (path[0] == '/' && path[1] == '/')
 		path++;
 
-	sclass = unmap_class(orig_sclass);
+	sclass = unmap_class(&ns->ss->map, orig_sclass);
 	*sid = SECINITSID_UNLABELED;
 
-	for (genfs = policydb.genfs; genfs; genfs = genfs->next) {
+	for (genfs = policydb->genfs; genfs; genfs = genfs->next) {
 		cmp = strcmp(fstype, genfs->fstype);
 		if (cmp <= 0)
 			break;
@@ -2621,7 +2732,7 @@ static inline int __security_genfs_sid(const char *fstype,
 		goto out;
 
 	if (!c->sid[0]) {
-		rc = sidtab_context_to_sid(&sidtab, &c->context[0], &c->sid[0]);
+		rc = sidtab_context_to_sid(sidtab, &c->context[0], &c->sid[0]);
 		if (rc)
 			goto out;
 	}
@@ -2642,16 +2753,17 @@ static inline int __security_genfs_sid(const char *fstype,
  * Acquire policy_rwlock before calling __security_genfs_sid() and release
  * it afterward.
  */
-int security_genfs_sid(const char *fstype,
+int security_genfs_sid(struct selinux_ns *ns,
+		       const char *fstype,
 		       char *path,
 		       u16 orig_sclass,
 		       u32 *sid)
 {
 	int retval;
 
-	read_lock(&policy_rwlock);
-	retval = __security_genfs_sid(fstype, path, orig_sclass, sid);
-	read_unlock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+	retval = __security_genfs_sid(ns, fstype, path, orig_sclass, sid);
+	read_unlock(&ns->ss->policy_rwlock);
 	return retval;
 }
 
@@ -2659,16 +2771,21 @@ int security_genfs_sid(const char *fstype,
  * security_fs_use - Determine how to handle labeling for a filesystem.
  * @sb: superblock in question
  */
-int security_fs_use(struct super_block *sb)
+int security_fs_use(struct selinux_ns *ns, struct super_block *sb)
 {
+	struct policydb *policydb;
+	struct sidtab *sidtab;
 	int rc = 0;
 	struct ocontext *c;
 	struct superblock_security_struct *sbsec = sb->s_security;
 	const char *fstype = sb->s_type->name;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
-	c = policydb.ocontexts[OCON_FSUSE];
+	policydb = &ns->ss->policydb;
+	sidtab = &ns->ss->sidtab;
+
+	c = policydb->ocontexts[OCON_FSUSE];
 	while (c) {
 		if (strcmp(fstype, c->u.name) == 0)
 			break;
@@ -2678,14 +2795,14 @@ int security_fs_use(struct super_block *sb)
 	if (c) {
 		sbsec->behavior = c->v.behavior;
 		if (!c->sid[0]) {
-			rc = sidtab_context_to_sid(&sidtab, &c->context[0],
+			rc = sidtab_context_to_sid(sidtab, &c->context[0],
 						   &c->sid[0]);
 			if (rc)
 				goto out;
 		}
 		sbsec->sid = c->sid[0];
 	} else {
-		rc = __security_genfs_sid(fstype, "/", SECCLASS_DIR,
+		rc = __security_genfs_sid(ns, fstype, "/", SECCLASS_DIR,
 					  &sbsec->sid);
 		if (rc) {
 			sbsec->behavior = SECURITY_FS_USE_NONE;
@@ -2696,20 +2813,25 @@ int security_fs_use(struct super_block *sb)
 	}
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
-int security_get_bools(int *len, char ***names, int **values)
+int security_get_bools(struct selinux_ns *ns,
+		       int *len, char ***names, int **values)
 {
+	struct policydb *policydb;
 	int i, rc;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
+
 	*names = NULL;
 	*values = NULL;
 
 	rc = 0;
-	*len = policydb.p_bools.nprim;
+	*len = policydb->p_bools.nprim;
 	if (!*len)
 		goto out;
 
@@ -2724,16 +2846,17 @@ int security_get_bools(int *len, char ***names, int **values)
 		goto err;
 
 	for (i = 0; i < *len; i++) {
-		(*values)[i] = policydb.bool_val_to_struct[i]->state;
+		(*values)[i] = policydb->bool_val_to_struct[i]->state;
 
 		rc = -ENOMEM;
-		(*names)[i] = kstrdup(sym_name(&policydb, SYM_BOOLS, i), GFP_ATOMIC);
+		(*names)[i] = kstrdup(sym_name(policydb, SYM_BOOLS, i),
+				      GFP_ATOMIC);
 		if (!(*names)[i])
 			goto err;
 	}
 	rc = 0;
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 err:
 	if (*names) {
@@ -2745,90 +2868,98 @@ int security_get_bools(int *len, char ***names, int **values)
 }
 
 
-int security_set_bools(int len, int *values)
+int security_set_bools(struct selinux_ns *ns, int len, int *values)
 {
+	struct policydb *policydb;
 	int i, rc;
 	int lenp, seqno = 0;
 	struct cond_node *cur;
 
-	write_lock_irq(&policy_rwlock);
+	write_lock_irq(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
 
 	rc = -EFAULT;
-	lenp = policydb.p_bools.nprim;
+	lenp = policydb->p_bools.nprim;
 	if (len != lenp)
 		goto out;
 
 	for (i = 0; i < len; i++) {
-		if (!!values[i] != policydb.bool_val_to_struct[i]->state) {
+		if (!!values[i] != policydb->bool_val_to_struct[i]->state) {
 			audit_log(current->audit_context, GFP_ATOMIC,
 				AUDIT_MAC_CONFIG_CHANGE,
 				"bool=%s val=%d old_val=%d auid=%u ses=%u",
-				sym_name(&policydb, SYM_BOOLS, i),
+				sym_name(policydb, SYM_BOOLS, i),
 				!!values[i],
-				policydb.bool_val_to_struct[i]->state,
+				policydb->bool_val_to_struct[i]->state,
 				from_kuid(&init_user_ns, audit_get_loginuid(current)),
 				audit_get_sessionid(current));
 		}
 		if (values[i])
-			policydb.bool_val_to_struct[i]->state = 1;
+			policydb->bool_val_to_struct[i]->state = 1;
 		else
-			policydb.bool_val_to_struct[i]->state = 0;
+			policydb->bool_val_to_struct[i]->state = 0;
 	}
 
-	for (cur = policydb.cond_list; cur; cur = cur->next) {
-		rc = evaluate_cond_node(&policydb, cur);
+	for (cur = policydb->cond_list; cur; cur = cur->next) {
+		rc = evaluate_cond_node(policydb, cur);
 		if (rc)
 			goto out;
 	}
 
-	seqno = ++latest_granting;
+	seqno = ++ns->ss->latest_granting;
 	rc = 0;
 out:
-	write_unlock_irq(&policy_rwlock);
+	write_unlock_irq(&ns->ss->policy_rwlock);
 	if (!rc) {
 		avc_ss_reset(seqno);
 		selnl_notify_policyload(seqno);
-		selinux_status_update_policyload(seqno);
+		selinux_status_update_policyload(ns, seqno);
 		selinux_xfrm_notify_policyload();
 	}
 	return rc;
 }
 
-int security_get_bool_value(int index)
+int security_get_bool_value(struct selinux_ns *ns,
+			    int index)
 {
+	struct policydb *policydb;
 	int rc;
 	int len;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+
+	policydb = &ns->ss->policydb;
 
 	rc = -EFAULT;
-	len = policydb.p_bools.nprim;
+	len = policydb->p_bools.nprim;
 	if (index >= len)
 		goto out;
 
-	rc = policydb.bool_val_to_struct[index]->state;
+	rc = policydb->bool_val_to_struct[index]->state;
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
-static int security_preserve_bools(struct policydb *p)
+static int security_preserve_bools(struct selinux_ns *ns,
+				   struct policydb *policydb)
 {
 	int rc, nbools = 0, *bvalues = NULL, i;
 	char **bnames = NULL;
 	struct cond_bool_datum *booldatum;
 	struct cond_node *cur;
 
-	rc = security_get_bools(&nbools, &bnames, &bvalues);
+	rc = security_get_bools(ns, &nbools, &bnames, &bvalues);
 	if (rc)
 		goto out;
 	for (i = 0; i < nbools; i++) {
-		booldatum = hashtab_search(p->p_bools.table, bnames[i]);
+		booldatum = hashtab_search(policydb->p_bools.table, bnames[i]);
 		if (booldatum)
 			booldatum->state = bvalues[i];
 	}
-	for (cur = p->cond_list; cur; cur = cur->next) {
-		rc = evaluate_cond_node(p, cur);
+	for (cur = policydb->cond_list; cur; cur = cur->next) {
+		rc = evaluate_cond_node(policydb, cur);
 		if (rc)
 			goto out;
 	}
@@ -2847,8 +2978,11 @@ static int security_preserve_bools(struct policydb *p)
  * security_sid_mls_copy() - computes a new sid based on the given
  * sid and the mls portion of mls_sid.
  */
-int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
+int security_sid_mls_copy(struct selinux_ns *ns,
+			  u32 sid, u32 mls_sid, u32 *new_sid)
 {
+	struct policydb *policydb = &ns->ss->policydb;
+	struct sidtab *sidtab = &ns->ss->sidtab;
 	struct context *context1;
 	struct context *context2;
 	struct context newcon;
@@ -2857,17 +2991,17 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
 	int rc;
 
 	rc = 0;
-	if (!ss_initialized || !policydb.mls_enabled) {
+	if (!ns->initialized || !policydb->mls_enabled) {
 		*new_sid = sid;
 		goto out;
 	}
 
 	context_init(&newcon);
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -EINVAL;
-	context1 = sidtab_search(&sidtab, sid);
+	context1 = sidtab_search(sidtab, sid);
 	if (!context1) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, sid);
@@ -2875,7 +3009,7 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
 	}
 
 	rc = -EINVAL;
-	context2 = sidtab_search(&sidtab, mls_sid);
+	context2 = sidtab_search(sidtab, mls_sid);
 	if (!context2) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 			__func__, mls_sid);
@@ -2890,10 +3024,11 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
 		goto out_unlock;
 
 	/* Check the validity of the new context. */
-	if (!policydb_context_isvalid(&policydb, &newcon)) {
-		rc = convert_context_handle_invalid_context(&newcon);
+	if (!policydb_context_isvalid(policydb, &newcon)) {
+		rc = convert_context_handle_invalid_context(ns, &newcon);
 		if (rc) {
-			if (!context_struct_to_string(&newcon, &s, &len)) {
+			if (!context_struct_to_string(policydb, &newcon, &s,
+						      &len)) {
 				audit_log(current->audit_context,
 					  GFP_ATOMIC, AUDIT_SELINUX_ERR,
 					  "op=security_sid_mls_copy "
@@ -2904,9 +3039,9 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
 		}
 	}
 
-	rc = sidtab_context_to_sid(&sidtab, &newcon, new_sid);
+	rc = sidtab_context_to_sid(sidtab, &newcon, new_sid);
 out_unlock:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	context_destroy(&newcon);
 out:
 	return rc;
@@ -2932,10 +3067,13 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid)
  *   multiple, inconsistent labels |    -<errno>     |    SECSID_NULL
  *
  */
-int security_net_peersid_resolve(u32 nlbl_sid, u32 nlbl_type,
+int security_net_peersid_resolve(struct selinux_ns *ns,
+				 u32 nlbl_sid, u32 nlbl_type,
 				 u32 xfrm_sid,
 				 u32 *peer_sid)
 {
+	struct policydb *policydb = &ns->ss->policydb;
+	struct sidtab *sidtab = &ns->ss->sidtab;
 	int rc;
 	struct context *nlbl_ctx;
 	struct context *xfrm_ctx;
@@ -2957,23 +3095,25 @@ int security_net_peersid_resolve(u32 nlbl_sid, u32 nlbl_type,
 		return 0;
 	}
 
-	/* we don't need to check ss_initialized here since the only way both
+	/*
+	 * We don't need to check ns->initialized here since the only way both
 	 * nlbl_sid and xfrm_sid are not equal to SECSID_NULL would be if the
-	 * security server was initialized and ss_initialized was true */
-	if (!policydb.mls_enabled)
+	 * security server was initialized and ns->initialized was true.
+	 */
+	if (!policydb->mls_enabled)
 		return 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -EINVAL;
-	nlbl_ctx = sidtab_search(&sidtab, nlbl_sid);
+	nlbl_ctx = sidtab_search(sidtab, nlbl_sid);
 	if (!nlbl_ctx) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, nlbl_sid);
 		goto out;
 	}
 	rc = -EINVAL;
-	xfrm_ctx = sidtab_search(&sidtab, xfrm_sid);
+	xfrm_ctx = sidtab_search(sidtab, xfrm_sid);
 	if (!xfrm_ctx) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized SID %d\n",
 		       __func__, xfrm_sid);
@@ -2990,7 +3130,7 @@ int security_net_peersid_resolve(u32 nlbl_sid, u32 nlbl_type,
 	 * expressive */
 	*peer_sid = xfrm_sid;
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -3007,19 +3147,21 @@ static int get_classes_callback(void *k, void *d, void *args)
 	return 0;
 }
 
-int security_get_classes(char ***classes, int *nclasses)
+int security_get_classes(struct selinux_ns *ns,
+			 char ***classes, int *nclasses)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	int rc;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -ENOMEM;
-	*nclasses = policydb.p_classes.nprim;
+	*nclasses = policydb->p_classes.nprim;
 	*classes = kcalloc(*nclasses, sizeof(**classes), GFP_ATOMIC);
 	if (!*classes)
 		goto out;
 
-	rc = hashtab_map(policydb.p_classes.table, get_classes_callback,
+	rc = hashtab_map(policydb->p_classes.table, get_classes_callback,
 			*classes);
 	if (rc) {
 		int i;
@@ -3029,7 +3171,7 @@ int security_get_classes(char ***classes, int *nclasses)
 	}
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -3046,15 +3188,17 @@ static int get_permissions_callback(void *k, void *d, void *args)
 	return 0;
 }
 
-int security_get_permissions(char *class, char ***perms, int *nperms)
+int security_get_permissions(struct selinux_ns *ns,
+			     char *class, char ***perms, int *nperms)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	int rc, i;
 	struct class_datum *match;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -EINVAL;
-	match = hashtab_search(policydb.p_classes.table, class);
+	match = hashtab_search(policydb->p_classes.table, class);
 	if (!match) {
 		printk(KERN_ERR "SELinux: %s:  unrecognized class %s\n",
 			__func__, class);
@@ -3080,25 +3224,25 @@ int security_get_permissions(char *class, char ***perms, int *nperms)
 		goto err;
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 
 err:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	for (i = 0; i < *nperms; i++)
 		kfree((*perms)[i]);
 	kfree(*perms);
 	return rc;
 }
 
-int security_get_reject_unknown(void)
+int security_get_reject_unknown(struct selinux_ns *ns)
 {
-	return policydb.reject_unknown;
+	return ns->ss->policydb.reject_unknown;
 }
 
-int security_get_allow_unknown(void)
+int security_get_allow_unknown(struct selinux_ns *ns)
 {
-	return policydb.allow_unknown;
+	return ns->ss->policydb.allow_unknown;
 }
 
 /**
@@ -3111,13 +3255,15 @@ int security_get_allow_unknown(void)
  * supported, false (0) if it isn't supported.
  *
  */
-int security_policycap_supported(unsigned int req_cap)
+int security_policycap_supported(struct selinux_ns *ns,
+				 unsigned int req_cap)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	int rc;
 
-	read_lock(&policy_rwlock);
-	rc = ebitmap_get_bit(&policydb.policycaps, req_cap);
-	read_unlock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+	rc = ebitmap_get_bit(&policydb->policycaps, req_cap);
+	read_unlock(&ns->ss->policy_rwlock);
 
 	return rc;
 }
@@ -3139,6 +3285,8 @@ void selinux_audit_rule_free(void *vrule)
 
 int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 {
+	struct selinux_ns *ns = current_selinux_ns;
+	struct policydb *policydb = &ns->ss->policydb;
 	struct selinux_audit_rule *tmprule;
 	struct role_datum *roledatum;
 	struct type_datum *typedatum;
@@ -3148,7 +3296,7 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 
 	*rule = NULL;
 
-	if (!ss_initialized)
+	if (!ns->initialized)
 		return -EOPNOTSUPP;
 
 	switch (field) {
@@ -3181,15 +3329,15 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 
 	context_init(&tmprule->au_ctxt);
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
-	tmprule->au_seqno = latest_granting;
+	tmprule->au_seqno = ns->ss->latest_granting;
 
 	switch (field) {
 	case AUDIT_SUBJ_USER:
 	case AUDIT_OBJ_USER:
 		rc = -EINVAL;
-		userdatum = hashtab_search(policydb.p_users.table, rulestr);
+		userdatum = hashtab_search(policydb->p_users.table, rulestr);
 		if (!userdatum)
 			goto out;
 		tmprule->au_ctxt.user = userdatum->value;
@@ -3197,7 +3345,7 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 	case AUDIT_SUBJ_ROLE:
 	case AUDIT_OBJ_ROLE:
 		rc = -EINVAL;
-		roledatum = hashtab_search(policydb.p_roles.table, rulestr);
+		roledatum = hashtab_search(policydb->p_roles.table, rulestr);
 		if (!roledatum)
 			goto out;
 		tmprule->au_ctxt.role = roledatum->value;
@@ -3205,7 +3353,7 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 	case AUDIT_SUBJ_TYPE:
 	case AUDIT_OBJ_TYPE:
 		rc = -EINVAL;
-		typedatum = hashtab_search(policydb.p_types.table, rulestr);
+		typedatum = hashtab_search(policydb->p_types.table, rulestr);
 		if (!typedatum)
 			goto out;
 		tmprule->au_ctxt.type = typedatum->value;
@@ -3214,14 +3362,15 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
 	case AUDIT_SUBJ_CLR:
 	case AUDIT_OBJ_LEV_LOW:
 	case AUDIT_OBJ_LEV_HIGH:
-		rc = mls_from_string(rulestr, &tmprule->au_ctxt, GFP_ATOMIC);
+		rc = mls_from_string(policydb, rulestr, &tmprule->au_ctxt,
+				     GFP_ATOMIC);
 		if (rc)
 			goto out;
 		break;
 	}
 	rc = 0;
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 
 	if (rc) {
 		selinux_audit_rule_free(tmprule);
@@ -3261,6 +3410,7 @@ int selinux_audit_rule_known(struct audit_krule *rule)
 int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule,
 			     struct audit_context *actx)
 {
+	struct selinux_ns *ns = current_selinux_ns;
 	struct context *ctxt;
 	struct mls_level *level;
 	struct selinux_audit_rule *rule = vrule;
@@ -3271,14 +3421,14 @@ int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule,
 		return -ENOENT;
 	}
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
-	if (rule->au_seqno < latest_granting) {
+	if (rule->au_seqno < ns->ss->latest_granting) {
 		match = -ESTALE;
 		goto out;
 	}
 
-	ctxt = sidtab_search(&sidtab, sid);
+	ctxt = sidtab_search(&ns->ss->sidtab, sid);
 	if (unlikely(!ctxt)) {
 		WARN_ONCE(1, "selinux_audit_rule_match: unrecognized SID %d\n",
 			  sid);
@@ -3362,7 +3512,7 @@ int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule,
 	}
 
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return match;
 }
 
@@ -3436,19 +3586,22 @@ static void security_netlbl_cache_add(struct netlbl_lsm_secattr *secattr,
  * failure.
  *
  */
-int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
+int security_netlbl_secattr_to_sid(struct selinux_ns *ns,
+				   struct netlbl_lsm_secattr *secattr,
 				   u32 *sid)
 {
+	struct policydb *policydb = &ns->ss->policydb;
+	struct sidtab *sidtab = &ns->ss->sidtab;
 	int rc;
 	struct context *ctx;
 	struct context ctx_new;
 
-	if (!ss_initialized) {
+	if (!ns->initialized) {
 		*sid = SECSID_NULL;
 		return 0;
 	}
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	if (secattr->flags & NETLBL_SECATTR_CACHE)
 		*sid = *(u32 *)secattr->cache->data;
@@ -3456,7 +3609,7 @@ int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
 		*sid = secattr->attr.secid;
 	else if (secattr->flags & NETLBL_SECATTR_MLS_LVL) {
 		rc = -EIDRM;
-		ctx = sidtab_search(&sidtab, SECINITSID_NETMSG);
+		ctx = sidtab_search(sidtab, SECINITSID_NETMSG);
 		if (ctx == NULL)
 			goto out;
 
@@ -3464,17 +3617,17 @@ int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
 		ctx_new.user = ctx->user;
 		ctx_new.role = ctx->role;
 		ctx_new.type = ctx->type;
-		mls_import_netlbl_lvl(&ctx_new, secattr);
+		mls_import_netlbl_lvl(policydb, &ctx_new, secattr);
 		if (secattr->flags & NETLBL_SECATTR_MLS_CAT) {
-			rc = mls_import_netlbl_cat(&ctx_new, secattr);
+			rc = mls_import_netlbl_cat(policydb, &ctx_new, secattr);
 			if (rc)
 				goto out;
 		}
 		rc = -EIDRM;
-		if (!mls_context_isvalid(&policydb, &ctx_new))
+		if (!mls_context_isvalid(policydb, &ctx_new))
 			goto out_free;
 
-		rc = sidtab_context_to_sid(&sidtab, &ctx_new, sid);
+		rc = sidtab_context_to_sid(sidtab, &ctx_new, sid);
 		if (rc)
 			goto out_free;
 
@@ -3484,12 +3637,12 @@ int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
 	} else
 		*sid = SECSID_NULL;
 
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return 0;
 out_free:
 	ebitmap_destroy(&ctx_new.range.level[0].cat);
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 
@@ -3503,33 +3656,35 @@ int security_netlbl_secattr_to_sid(struct netlbl_lsm_secattr *secattr,
  * Returns zero on success, negative values on failure.
  *
  */
-int security_netlbl_sid_to_secattr(u32 sid, struct netlbl_lsm_secattr *secattr)
+int security_netlbl_sid_to_secattr(struct selinux_ns *ns,
+				   u32 sid, struct netlbl_lsm_secattr *secattr)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	int rc;
 	struct context *ctx;
 
-	if (!ss_initialized)
+	if (!ns->initialized)
 		return 0;
 
-	read_lock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -ENOENT;
-	ctx = sidtab_search(&sidtab, sid);
+	ctx = sidtab_search(&ns->ss->sidtab, sid);
 	if (ctx == NULL)
 		goto out;
 
 	rc = -ENOMEM;
-	secattr->domain = kstrdup(sym_name(&policydb, SYM_TYPES, ctx->type - 1),
+	secattr->domain = kstrdup(sym_name(policydb, SYM_TYPES, ctx->type - 1),
 				  GFP_ATOMIC);
 	if (secattr->domain == NULL)
 		goto out;
 
 	secattr->attr.secid = sid;
 	secattr->flags |= NETLBL_SECATTR_DOMAIN_CPY | NETLBL_SECATTR_SECID;
-	mls_export_netlbl_lvl(ctx, secattr);
-	rc = mls_export_netlbl_cat(ctx, secattr);
+	mls_export_netlbl_lvl(policydb, ctx, secattr);
+	rc = mls_export_netlbl_cat(policydb, ctx, secattr);
 out:
-	read_unlock(&policy_rwlock);
+	read_unlock(&ns->ss->policy_rwlock);
 	return rc;
 }
 #endif /* CONFIG_NETLABEL */
@@ -3540,15 +3695,17 @@ int security_netlbl_sid_to_secattr(u32 sid, struct netlbl_lsm_secattr *secattr)
  * @len: length of data in bytes
  *
  */
-int security_read_policy(void **data, size_t *len)
+int security_read_policy(struct selinux_ns *ns,
+			 void **data, size_t *len)
 {
+	struct policydb *policydb = &ns->ss->policydb;
 	int rc;
 	struct policy_file fp;
 
-	if (!ss_initialized)
+	if (!ns->initialized)
 		return -EINVAL;
 
-	*len = security_policydb_len();
+	*len = security_policydb_len(ns);
 
 	*data = vmalloc_user(*len);
 	if (!*data)
@@ -3557,9 +3714,9 @@ int security_read_policy(void **data, size_t *len)
 	fp.data = *data;
 	fp.len = *len;
 
-	read_lock(&policy_rwlock);
-	rc = policydb_write(&policydb, &fp);
-	read_unlock(&policy_rwlock);
+	read_lock(&ns->ss->policy_rwlock);
+	rc = policydb_write(policydb, &fp);
+	read_unlock(&ns->ss->policy_rwlock);
 
 	if (rc)
 		return rc;
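
An illustrative aside, not part of the patch: the lookup helpers converted in the services.c changes above (security_port_sid(), security_ib_pkey_sid(), security_netif_sid() and friends) all share one shape. They walk a policy list taken from the namespace, lazily resolve the matching entry's context to a SID through that namespace's sidtab, and cache the SID in the entry. The userspace sketch below models that shape only. Every demo_* name, the integer "context", the fixed-size table and DEMO_SID_DEFAULT are assumptions made for illustration; locking is reduced to comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SIDS    8
#define DEMO_SID_DEFAULT 1u	/* hypothetical fallback SID */

struct demo_port_entry {
	uint8_t protocol;
	uint16_t low_port, high_port;
	uint32_t context;	/* stand-in for struct context */
	uint32_t sid;		/* 0 until lazily resolved */
	struct demo_port_entry *next;
};

/* Toy SID table: contexts[i] is assigned SID first_sid + i. */
struct demo_sidtab {
	uint32_t contexts[DEMO_MAX_SIDS];
	uint32_t count;
	uint32_t first_sid;	/* must be nonzero: 0 means "unresolved" */
};

struct demo_ss {
	struct demo_port_entry *ports;
	struct demo_sidtab sidtab;
};

struct demo_ns {
	struct demo_ss *ss;
};

/* Return the SID for ctx, allocating a new one on first use. */
static int demo_context_to_sid(struct demo_sidtab *s, uint32_t ctx,
			       uint32_t *sid)
{
	uint32_t i;

	for (i = 0; i < s->count; i++) {
		if (s->contexts[i] == ctx) {
			*sid = s->first_sid + i;
			return 0;
		}
	}
	if (s->count == DEMO_MAX_SIDS)
		return -1;
	s->contexts[s->count] = ctx;
	*sid = s->first_sid + s->count;
	s->count++;
	return 0;
}

static int demo_port_sid(struct demo_ns *ns, uint8_t protocol,
			 uint16_t port, uint32_t *out_sid)
{
	struct demo_port_entry *c;
	int rc = 0;

	assert(ns && ns->ss && out_sid);

	/* kernel: read_lock(&ns->ss->policy_rwlock) */
	for (c = ns->ss->ports; c; c = c->next) {
		if (c->protocol == protocol &&
		    c->low_port <= port && c->high_port >= port)
			break;
	}

	if (c) {
		if (!c->sid) {
			/* Resolve against this namespace's table only. */
			rc = demo_context_to_sid(&ns->ss->sidtab,
						 c->context, &c->sid);
			if (rc)
				goto out;
		}
		*out_sid = c->sid;
	} else {
		*out_sid = DEMO_SID_DEFAULT;
	}
out:
	/* kernel: read_unlock(&ns->ss->policy_rwlock) */
	return rc;
}
```

In this toy model two namespaces holding the same context value hand back different SIDs, which is one way to see why the sidtab has to be reached through ns->ss instead of a file-scope global.
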
diff --git a/security/selinux/ss/services.h b/security/selinux/ss/services.h
index 3d9fa95..1dfa624 100644
--- a/security/selinux/ss/services.h
+++ b/security/selinux/ss/services.h
@@ -9,7 +9,27 @@
 #include "policydb.h"
 #include "sidtab.h"
 
-extern struct policydb policydb;
+/* Mapping for a single class */
+struct selinux_mapping {
+	u16 value; /* policy value for class */
+	unsigned int num_perms; /* number of permissions in class */
+	u32 perms[sizeof(u32) * 8]; /* policy values for permissions */
+};
+
+/* Map for all of the classes, with array size */
+struct selinux_map {
+	struct selinux_mapping *mapping; /* indexed by class */
+	u16 size; /* array size of mapping */
+};
+struct selinux_ss {
+	struct sidtab sidtab;
+	struct policydb policydb;
+	rwlock_t policy_rwlock;
+	u32 latest_granting;
+	struct selinux_map map;
+	struct page *status_page;
+	struct mutex status_lock;
+};
 
 void services_compute_xperms_drivers(struct extended_perms *xperms,
 				struct avtab_node *node);
@@ -18,4 +38,3 @@ void services_compute_xperms_decision(struct extended_perms_decision *xpermd,
 					struct avtab_node *node);
 
 #endif	/* _SS_SERVICES_H_ */
-
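
An illustrative aside, not part of the patch: the services.h change above gathers what used to be file-scope globals (policydb, sidtab, policy_rwlock, latest_granting, the class map and the status page) into struct selinux_ss, which callers reach through ns->ss. A minimal userspace sketch of that refactoring pattern follows. The demo_* types are hypothetical stand-ins with only a few fields, and the locking that the kernel code performs is indicated in comments only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct policydb. */
struct demo_policydb {
	size_t len;		/* size of the loaded policy image */
	int allow_unknown;
};

/* Per-namespace security server state (cf. struct selinux_ss). */
struct demo_ss {
	struct demo_policydb policydb;
	uint32_t latest_granting;
};

/* Namespace handle passed to every entry point (cf. struct selinux_ns). */
struct demo_ns {
	int initialized;
	struct demo_ss *ss;
};

static void demo_ns_init(struct demo_ns *ns, struct demo_ss *ss,
			 size_t policy_len)
{
	assert(ns && ss);
	ss->policydb.len = policy_len;
	ss->policydb.allow_unknown = 0;
	ss->latest_granting = 0;
	ns->ss = ss;
	ns->initialized = 1;
}

/* Same shape as security_policydb_len(ns) in the patch. */
static size_t demo_policydb_len(const struct demo_ns *ns)
{
	/* kernel: read_lock(&ns->ss->policy_rwlock) around this read */
	return ns->ss->policydb.len;
}

/* Same shape as the seqno bump in security_set_bools(ns, ...). */
static uint32_t demo_bump_granting(struct demo_ns *ns)
{
	/* kernel: done under write_lock_irq(&ns->ss->policy_rwlock) */
	return ++ns->ss->latest_granting;
}
```

With two demo_ns instances set up independently, a bump in one leaves the other's counter untouched. That is the property the patch is after when it replaces the global latest_granting with ns->ss->latest_granting.
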
diff --git a/security/selinux/ss/status.c b/security/selinux/ss/status.c
index d982365..537badc 100644
--- a/security/selinux/ss/status.c
+++ b/security/selinux/ss/status.c
@@ -35,8 +35,6 @@
  * In most cases, application shall confirm the kernel status is not
  * changed without any system call invocations.
  */
-static struct page *selinux_status_page;
-static DEFINE_MUTEX(selinux_status_lock);
 
 /*
  * selinux_kernel_status_page
@@ -44,21 +42,21 @@ static DEFINE_MUTEX(selinux_status_lock);
  * It returns a reference to selinux_status_page. If the status page is
  * not allocated yet, it also tries to allocate it at the first time.
  */
-struct page *selinux_kernel_status_page(void)
+struct page *selinux_kernel_status_page(struct selinux_ns *ns)
 {
 	struct selinux_kernel_status   *status;
 	struct page		       *result = NULL;
 
-	mutex_lock(&selinux_status_lock);
-	if (!selinux_status_page) {
-		selinux_status_page = alloc_page(GFP_KERNEL|__GFP_ZERO);
+	mutex_lock(&ns->ss->status_lock);
+	if (!ns->ss->status_page) {
+		ns->ss->status_page = alloc_page(GFP_KERNEL|__GFP_ZERO);
 
-		if (selinux_status_page) {
-			status = page_address(selinux_status_page);
+		if (ns->ss->status_page) {
+			status = page_address(ns->ss->status_page);
 
 			status->version = SELINUX_KERNEL_STATUS_VERSION;
 			status->sequence = 0;
-			status->enforcing = selinux_enforcing;
+			status->enforcing = ns_enforcing(ns);
 			/*
 			 * NOTE: the next policyload event shall set
 			 * a positive value on the status->policyload,
@@ -66,11 +64,11 @@ struct page *selinux_kernel_status_page(void)
 			 * So, application can know it was updated.
 			 */
 			status->policyload = 0;
-			status->deny_unknown = !security_get_allow_unknown();
+			status->deny_unknown = !security_get_allow_unknown(ns);
 		}
 	}
-	result = selinux_status_page;
-	mutex_unlock(&selinux_status_lock);
+	result = ns->ss->status_page;
+	mutex_unlock(&ns->ss->status_lock);
 
 	return result;
 }
@@ -80,13 +78,13 @@ struct page *selinux_kernel_status_page(void)
  *
  * It updates status of the current enforcing/permissive mode.
  */
-void selinux_status_update_setenforce(int enforcing)
+void selinux_status_update_setenforce(struct selinux_ns *ns, int enforcing)
 {
 	struct selinux_kernel_status   *status;
 
-	mutex_lock(&selinux_status_lock);
-	if (selinux_status_page) {
-		status = page_address(selinux_status_page);
+	mutex_lock(&ns->ss->status_lock);
+	if (ns->ss->status_page) {
+		status = page_address(ns->ss->status_page);
 
 		status->sequence++;
 		smp_wmb();
@@ -96,7 +94,7 @@ void selinux_status_update_setenforce(int enforcing)
 		smp_wmb();
 		status->sequence++;
 	}
-	mutex_unlock(&selinux_status_lock);
+	mutex_unlock(&ns->ss->status_lock);
 }
 
 /*
@@ -105,22 +103,23 @@ void selinux_status_update_setenforce(int enforcing)
  * It updates status of the times of policy reloaded, and current
  * setting of deny_unknown.
  */
-void selinux_status_update_policyload(int seqno)
+void selinux_status_update_policyload(struct selinux_ns *ns,
+				      int seqno)
 {
 	struct selinux_kernel_status   *status;
 
-	mutex_lock(&selinux_status_lock);
-	if (selinux_status_page) {
-		status = page_address(selinux_status_page);
+	mutex_lock(&ns->ss->status_lock);
+	if (ns->ss->status_page) {
+		status = page_address(ns->ss->status_page);
 
 		status->sequence++;
 		smp_wmb();
 
 		status->policyload = seqno;
-		status->deny_unknown = !security_get_allow_unknown();
+		status->deny_unknown = !security_get_allow_unknown(ns);
 
 		smp_wmb();
 		status->sequence++;
 	}
-	mutex_unlock(&selinux_status_lock);
+	mutex_unlock(&ns->ss->status_lock);
 }
diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c
index 56e354f..410f19a 100644
--- a/security/selinux/xfrm.c
+++ b/security/selinux/xfrm.c
@@ -101,7 +101,8 @@ static int selinux_xfrm_alloc_user(struct xfrm_sec_ctx **ctxp,
 	ctx->ctx_len = str_len;
 	memcpy(ctx->ctx_str, &uctx[1], str_len);
 	ctx->ctx_str[str_len] = '\0';
-	rc = security_context_to_sid(ctx->ctx_str, str_len, &ctx->ctx_sid, gfp);
+	rc = security_context_to_sid(current_selinux_ns, ctx->ctx_str, str_len,
+				     &ctx->ctx_sid, gfp);
 	if (rc)
 		goto err;
 
@@ -352,7 +353,8 @@ int selinux_xfrm_state_alloc_acquire(struct xfrm_state *x,
 	if (secid == 0)
 		return -EINVAL;
 
-	rc = security_sid_to_context(secid, &ctx_str, &str_len);
+	rc = security_sid_to_context(current_selinux_ns, secid, &ctx_str,
+				     &str_len);
 	if (rc)
 		return rc;
 
-- 
2.9.5

* [RFC 02/10] selinux: support multiple selinuxfs instances
@ 2017-10-02 15:58 Stephen Smalley
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Support multiple selinuxfs instances, one per SELinux namespace.  Move
the global selinuxfs state into a per-instance structure
(selinux_fs_info), stored in the superblock's s_fs_info, and include a
reference to the corresponding SELinux namespace in this structure.  Pass
this namespace to all security server operations, thereby ensuring that
each selinuxfs instance presents a view of, and acts as an interface to,
one particular namespace.
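
As a rough illustration of the pattern described above, here is a minimal
userspace sketch, not the kernel code.  All demo_* names are hypothetical
stand-ins (demo_ns for struct selinux_ns, demo_fs_info for struct
selinux_fs_info, demo_sb for struct super_block), and the refcount is a
plain int rather than whatever the real get/put helpers use.  It only
shows that a handler reaches its namespace through the instance rather
than through a global.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct selinux_ns. */
struct demo_ns {
	int refcount;
	int enforcing;
};

/* Hypothetical stand-in for struct selinux_fs_info. */
struct demo_fs_info {
	struct demo_ns *ns;	/* namespace this instance is bound to */
};

/* Hypothetical stand-in for struct super_block. */
struct demo_sb {
	void *s_fs_info;
};

static struct demo_ns *demo_get_ns(struct demo_ns *ns)
{
	ns->refcount++;
	return ns;
}

static void demo_put_ns(struct demo_ns *ns)
{
	ns->refcount--;
}

/* Bind a new instance to @ns and take a reference on it. */
static int demo_fs_info_create(struct demo_sb *sb, struct demo_ns *ns)
{
	struct demo_fs_info *fsi = calloc(1, sizeof(*fsi));

	if (!fsi)
		return -1;
	fsi->ns = demo_get_ns(ns);
	sb->s_fs_info = fsi;
	return 0;
}

/* Drop the namespace reference and free the per-instance state. */
static void demo_fs_info_free(struct demo_sb *sb)
{
	struct demo_fs_info *fsi = sb->s_fs_info;

	if (fsi)
		demo_put_ns(fsi->ns);
	free(fsi);
	sb->s_fs_info = NULL;
}

/* Model of a read handler: the namespace comes from the instance. */
static int demo_read_enforce(const struct demo_sb *sb)
{
	const struct demo_fs_info *fsi = sb->s_fs_info;

	return fsi->ns->enforcing;
}

/* Two instances bound to different namespaces report different state. */
static void demo_selfcheck(void)
{
	struct demo_ns host = { .refcount = 1, .enforcing = 1 };
	struct demo_ns child = { .refcount = 1, .enforcing = 0 };
	struct demo_sb host_sb = { NULL }, child_sb = { NULL };

	assert(demo_fs_info_create(&host_sb, &host) == 0);
	assert(demo_fs_info_create(&child_sb, &child) == 0);

	assert(demo_read_enforce(&host_sb) == 1);
	assert(demo_read_enforce(&child_sb) == 0);

	demo_fs_info_free(&host_sb);
	demo_fs_info_free(&child_sb);
	assert(host.refcount == 1 && child.refcount == 1);
}
```

In the patch itself the equivalent lookup is the two-line prologue added
to each handler (fsi from i_sb->s_fs_info, then ns from fsi->ns).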

The expected usage would be to unshare the SELinux namespace and
the mount namespace, and then mount a new selinuxfs instance.  The
new instance would then provide an interface for viewing and manipulating
the state of the new SELinux namespace and would not affect the parent
namespace in any manner.
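
The mount-side half of that sequence might look roughly like the sketch
below.  It is an assumption-laden illustration, not part of this patch:
the helper name mount_private_selinuxfs() is hypothetical, the step that
unshares the SELinux namespace is deliberately left out because this
patch does not define a userspace interface for it, and whether the
mount yields a fresh instance depends on the rest of the series.  The
calls require CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <linux/sched.h>	/* CLONE_NEWNS */
#include <sys/mount.h>		/* mount(), MS_REC, MS_PRIVATE */
#include <sys/syscall.h>	/* SYS_unshare */
#include <unistd.h>		/* syscall() */

/*
 * Hypothetical helper: enter a new mount namespace and mount selinuxfs
 * at @target.  Returns 0 on success or a negative errno value.
 */
static int mount_private_selinuxfs(const char *target)
{
	assert(target != NULL);

	/*
	 * Step 1 (not shown): unshare the SELinux namespace.  No
	 * interface for this exists at this point in the series.
	 */

	/*
	 * Step 2: unshare the mount namespace.  The raw syscall is
	 * equivalent to unshare(CLONE_NEWNS) and avoids depending on
	 * _GNU_SOURCE being defined before the first include.
	 */
	if (syscall(SYS_unshare, CLONE_NEWNS) == -1)
		return -errno;

	/* Keep mount events from propagating back to the parent. */
	if (mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL) == -1)
		return -errno;

	/* Step 3: mount selinuxfs inside the new mount namespace. */
	if (mount("none", target, "selinuxfs", 0, NULL) == -1)
		return -errno;

	return 0;
}
```

A caller would pass the usual mount point, for example
mount_private_selinuxfs("/sys/fs/selinux"), after unmounting the
inherited instance there if one is present.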

This change by itself should have no effect on SELinux behavior or
APIs (userspace or LSM).  It merely wraps the selinuxfs global state,
links it to a particular selinux namespace (currently always the initial
namespace) and uses that namespace for all selinuxfs operations.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/include/security.h |   3 +-
 security/selinux/selinuxfs.c        | 480 ++++++++++++++++++++++--------------
 security/selinux/ss/services.c      |  13 +
 security/selinux/ss/status.c        |   4 +-
 4 files changed, 315 insertions(+), 185 deletions(-)

diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index b70d1dd..429e6f7 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -355,8 +355,7 @@ struct selinux_kernel_status {
 	 */
 } __packed;
 
-extern void selinux_status_update_setenforce(struct selinux_ns *ns,
-					     int enforcing);
+extern void selinux_status_update_setenforce(struct selinux_ns *ns);
 extern void selinux_status_update_policyload(struct selinux_ns *ns,
 					     int seqno);
 extern void selinux_complete_init(void);
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 07f2f8e..e29d60e 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/mutex.h>
 #include <linux/init.h>
 #include <linux/string.h>
@@ -41,23 +42,6 @@
 #include "objsec.h"
 #include "conditional.h"
 
-static DEFINE_MUTEX(sel_mutex);
-
-/* global data for booleans */
-static struct dentry *bool_dir;
-static int bool_num;
-static char **bool_pending_names;
-static int *bool_pending_values;
-
-/* global data for classes */
-static struct dentry *class_dir;
-static unsigned long last_class_ino;
-
-static char policy_opened;
-
-/* global data for policy capabilities */
-static struct dentry *policycap_dir;
-
 enum sel_inos {
 	SEL_ROOT_INO = 2,
 	SEL_LOAD,	/* load policy */
@@ -82,7 +66,52 @@ enum sel_inos {
 	SEL_INO_NEXT,	/* The next inode number to use */
 };
 
-static unsigned long sel_last_ino = SEL_INO_NEXT - 1;
+struct selinux_fs_info {
+	struct dentry *bool_dir;
+	unsigned int bool_num;
+	char **bool_pending_names;
+	unsigned int *bool_pending_values;
+	struct dentry *class_dir;
+	unsigned long last_class_ino;
+	bool policy_opened;
+	struct dentry *policycap_dir;
+	struct mutex mutex;
+	unsigned long last_ino;
+	struct selinux_ns *ns;
+	struct super_block *sb;
+};
+
+static int selinux_fs_info_create(struct super_block *sb)
+{
+	struct selinux_fs_info *fsi;
+
+	fsi = kzalloc(sizeof(*fsi), GFP_KERNEL);
+	if (!fsi)
+		return -ENOMEM;
+
+	mutex_init(&fsi->mutex);
+	fsi->last_ino = SEL_INO_NEXT - 1;
+	fsi->ns = get_selinux_ns(current_selinux_ns);
+	fsi->sb = sb;
+	sb->s_fs_info = fsi;
+	return 0;
+}
+
+static void selinux_fs_info_free(struct super_block *sb)
+{
+	struct selinux_fs_info *fsi = sb->s_fs_info;
+	int i;
+
+	if (fsi) {
+		put_selinux_ns(fsi->ns);
+		for (i = 0; i < fsi->bool_num; i++)
+			kfree(fsi->bool_pending_names[i]);
+		kfree(fsi->bool_pending_names);
+		kfree(fsi->bool_pending_values);
+	}
+	kfree(sb->s_fs_info);
+	sb->s_fs_info = NULL;
+}
 
 #define SEL_INITCON_INO_OFFSET		0x01000000
 #define SEL_BOOL_INO_OFFSET		0x02000000
@@ -94,10 +123,12 @@ static unsigned long sel_last_ino = SEL_INO_NEXT - 1;
 static ssize_t sel_read_enforce(struct file *filp, char __user *buf,
 				size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 
-	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", selinux_enforcing);
+	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", ns_enforcing(ns));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
@@ -106,6 +137,8 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page = NULL;
 	ssize_t length;
 	int new_value;
@@ -127,7 +160,7 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 
 	new_value = !!new_value;
 
-	if (new_value != selinux_enforcing) {
+	if (new_value != ns_enforcing(ns)) {
 		length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 				      SECCLASS_SECURITY, SECURITY__SETENFORCE,
 				      NULL);
@@ -135,16 +168,15 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 			goto out;
 		audit_log(current->audit_context, GFP_KERNEL, AUDIT_MAC_STATUS,
 			"enforcing=%d old_enforcing=%d auid=%u ses=%u",
-			new_value, selinux_enforcing,
+			new_value, ns_enforcing(ns),
 			from_kuid(&init_user_ns, audit_get_loginuid(current)),
 			audit_get_sessionid(current));
-		selinux_enforcing = new_value;
-		if (selinux_enforcing)
+		ns_enforcing(ns) = new_value;
+		if (ns_enforcing(ns))
 			avc_ss_reset(0);
-		selnl_notify_setenforce(selinux_enforcing);
-		selinux_status_update_setenforce(current_selinux_ns,
-						 selinux_enforcing);
-		if (!selinux_enforcing)
+		selnl_notify_setenforce(ns_enforcing(ns));
+		selinux_status_update_setenforce(ns);
+		if (!ns_enforcing(ns))
 			call_lsm_notifier(LSM_POLICY_CHANGE, NULL);
 	}
 	length = count;
@@ -165,12 +197,14 @@ static const struct file_operations sel_enforce_ops = {
 static ssize_t sel_read_handle_unknown(struct file *filp, char __user *buf,
 					size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 	ino_t ino = file_inode(filp)->i_ino;
 	int handle_unknown = (ino == SEL_REJECT_UNKNOWN) ?
-		security_get_reject_unknown(current_selinux_ns) :
-		!security_get_allow_unknown(current_selinux_ns);
+		security_get_reject_unknown(ns) :
+		!security_get_allow_unknown(ns);
 
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", handle_unknown);
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
@@ -183,7 +217,9 @@ static const struct file_operations sel_handle_unknown_ops = {
 
 static int sel_open_handle_status(struct inode *inode, struct file *filp)
 {
-	struct page    *status = selinux_kernel_status_page(current_selinux_ns);
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
+	struct page    *status = selinux_kernel_status_page(ns);
 
 	if (!status)
 		return -ENOMEM;
@@ -239,6 +275,8 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page;
 	ssize_t length;
 	int new_value;
@@ -259,7 +297,7 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf,
 		goto out;
 
 	if (new_value) {
-		length = selinux_disable(current_selinux_ns);
+		length = selinux_disable(ns);
 		if (length)
 			goto out;
 		audit_log(current->audit_context, GFP_KERNEL, AUDIT_MAC_STATUS,
@@ -298,9 +336,9 @@ static const struct file_operations sel_policyvers_ops = {
 };
 
 /* declaration for sel_write_load */
-static int sel_make_bools(void);
-static int sel_make_classes(void);
-static int sel_make_policycap(void);
+static int sel_make_bools(struct selinux_fs_info *fsi);
+static int sel_make_classes(struct selinux_fs_info *fsi);
+static int sel_make_policycap(struct selinux_fs_info *fsi);
 
 /* declaration for sel_make_class_dirs */
 static struct dentry *sel_make_dir(struct dentry *dir, const char *name,
@@ -309,11 +347,13 @@ static struct dentry *sel_make_dir(struct dentry *dir, const char *name,
 static ssize_t sel_read_mls(struct file *filp, char __user *buf,
 				size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d",
-			   security_mls_enabled(current_selinux_ns));
+			   security_mls_enabled(ns));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
@@ -329,12 +369,14 @@ struct policy_load_memory {
 
 static int sel_open_policy(struct inode *inode, struct file *filp)
 {
+	struct selinux_fs_info *fsi = inode->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	struct policy_load_memory *plm = NULL;
 	int rc;
 
 	BUG_ON(filp->private_data);
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	rc = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
@@ -342,7 +384,7 @@ static int sel_open_policy(struct inode *inode, struct file *filp)
 		goto err;
 
 	rc = -EBUSY;
-	if (policy_opened)
+	if (fsi->policy_opened)
 		goto err;
 
 	rc = -ENOMEM;
@@ -350,25 +392,25 @@ static int sel_open_policy(struct inode *inode, struct file *filp)
 	if (!plm)
 		goto err;
 
-	if (i_size_read(inode) != security_policydb_len(current_selinux_ns)) {
+	if (i_size_read(inode) != security_policydb_len(ns)) {
 		inode_lock(inode);
-		i_size_write(inode, security_policydb_len(current_selinux_ns));
+		i_size_write(inode, security_policydb_len(ns));
 		inode_unlock(inode);
 	}
 
-	rc = security_read_policy(current_selinux_ns, &plm->data, &plm->len);
+	rc = security_read_policy(ns, &plm->data, &plm->len);
 	if (rc)
 		goto err;
 
-	policy_opened = 1;
+	fsi->policy_opened = 1;
 
 	filp->private_data = plm;
 
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 
 	return 0;
 err:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 
 	if (plm)
 		vfree(plm->data);
@@ -378,11 +420,12 @@ static int sel_open_policy(struct inode *inode, struct file *filp)
 
 static int sel_release_policy(struct inode *inode, struct file *filp)
 {
+	struct selinux_fs_info *fsi = inode->i_sb->s_fs_info;
 	struct policy_load_memory *plm = filp->private_data;
 
 	BUG_ON(!plm);
 
-	policy_opened = 0;
+	fsi->policy_opened = 0;
 
 	vfree(plm->data);
 	kfree(plm);
@@ -393,10 +436,11 @@ static int sel_release_policy(struct inode *inode, struct file *filp)
 static ssize_t sel_read_policy(struct file *filp, char __user *buf,
 			       size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
 	struct policy_load_memory *plm = filp->private_data;
 	int ret;
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	ret = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
@@ -405,7 +449,7 @@ static ssize_t sel_read_policy(struct file *filp, char __user *buf,
 
 	ret = simple_read_from_buffer(buf, count, ppos, plm->data, plm->len);
 out:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 	return ret;
 }
 
@@ -459,14 +503,41 @@ static const struct file_operations sel_policy_ops = {
 	.llseek		= generic_file_llseek,
 };
 
+static int sel_make_policy_nodes(struct selinux_fs_info *fsi)
+{
+	int ret;
+
+	ret = sel_make_bools(fsi);
+	if (ret) {
+		pr_err("SELinux: failed to load policy booleans\n");
+		return ret;
+	}
+
+	ret = sel_make_classes(fsi);
+	if (ret) {
+		pr_err("SELinux: failed to load policy classes\n");
+		return ret;
+	}
+
+	ret = sel_make_policycap(fsi);
+	if (ret) {
+		pr_err("SELinux: failed to load policy capabilities\n");
+		return ret;
+	}
+
+	return 0;
+}
+
 static ssize_t sel_write_load(struct file *file, const char __user *buf,
 			      size_t count, loff_t *ppos)
 
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	ssize_t length;
 	void *data = NULL;
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__LOAD_POLICY, NULL);
@@ -491,29 +562,15 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 	if (copy_from_user(data, buf, count) != 0)
 		goto out;
 
-	length = security_load_policy(current_selinux_ns, data, count);
+	length = security_load_policy(ns, data, count);
 	if (length) {
 		pr_warn_ratelimited("SELinux: failed to load policy\n");
 		goto out;
 	}
 
-	length = sel_make_bools();
-	if (length) {
-		pr_err("SELinux: failed to load policy booleans\n");
-		goto out1;
-	}
-
-	length = sel_make_classes();
-	if (length) {
-		pr_err("SELinux: failed to load policy classes\n");
-		goto out1;
-	}
-
-	length = sel_make_policycap();
-	if (length) {
-		pr_err("SELinux: failed to load policy capabilities\n");
+	length = sel_make_policy_nodes(fsi);
+	if (length)
 		goto out1;
-	}
 
 	length = count;
 
@@ -523,7 +580,7 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 		from_kuid(&init_user_ns, audit_get_loginuid(current)),
 		audit_get_sessionid(current));
 out:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 	vfree(data);
 	return length;
 }
@@ -535,6 +592,8 @@ static const struct file_operations sel_load_ops = {
 
 static ssize_t sel_write_context(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *canon = NULL;
 	u32 sid, len;
 	ssize_t length;
@@ -544,12 +603,11 @@ static ssize_t sel_write_context(struct file *file, char *buf, size_t size)
 	if (length)
 		goto out;
 
-	length = security_context_to_sid(current_selinux_ns, buf, size,
-					 &sid, GFP_KERNEL);
+	length = security_context_to_sid(ns, buf, size, &sid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(current_selinux_ns, sid, &canon, &len);
+	length = security_sid_to_context(ns, sid, &canon, &len);
 	if (length)
 		goto out;
 
@@ -570,16 +628,20 @@ static ssize_t sel_write_context(struct file *file, char *buf, size_t size)
 static ssize_t sel_read_checkreqprot(struct file *filp, char __user *buf,
 				     size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 
-	length = scnprintf(tmpbuf, TMPBUFLEN, "%u", selinux_checkreqprot);
+	length = scnprintf(tmpbuf, TMPBUFLEN, "%u", ns->checkreqprot);
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
 static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 				      size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page;
 	ssize_t length;
 	unsigned int new_value;
@@ -605,7 +667,7 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	if (sscanf(page, "%u", &new_value) != 1)
 		goto out;
 
-	selinux_checkreqprot = new_value ? 1 : 0;
+	ns->checkreqprot = new_value ? 1 : 0;
 	length = count;
 out:
 	kfree(page);
@@ -621,6 +683,8 @@ static ssize_t sel_write_validatetrans(struct file *file,
 					const char __user *buf,
 					size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *oldcon = NULL, *newcon = NULL, *taskcon = NULL;
 	char *req = NULL;
 	u32 osid, nsid, tsid;
@@ -665,23 +729,19 @@ static ssize_t sel_write_validatetrans(struct file *file,
 	if (sscanf(req, "%s %s %hu %s", oldcon, newcon, &tclass, taskcon) != 4)
 		goto out;
 
-	rc = security_context_str_to_sid(current_selinux_ns, oldcon, &osid,
-					 GFP_KERNEL);
+	rc = security_context_str_to_sid(ns, oldcon, &osid, GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_context_str_to_sid(current_selinux_ns, newcon, &nsid,
-					 GFP_KERNEL);
+	rc = security_context_str_to_sid(ns, newcon, &nsid, GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_context_str_to_sid(current_selinux_ns, taskcon, &tsid,
-					 GFP_KERNEL);
+	rc = security_context_str_to_sid(ns, taskcon, &tsid, GFP_KERNEL);
 	if (rc)
 		goto out;
 
-	rc = security_validate_transition_user(current_selinux_ns, osid, nsid,
-					       tsid, tclass);
+	rc = security_validate_transition_user(ns, osid, nsid, tsid, tclass);
 	if (!rc)
 		rc = count;
 out:
@@ -751,6 +811,8 @@ static const struct file_operations transaction_ops = {
 
 static ssize_t sel_write_access(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *scon = NULL, *tcon = NULL;
 	u32 ssid, tsid;
 	u16 tclass;
@@ -776,17 +838,15 @@ static ssize_t sel_write_access(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, scon, &ssid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, tcon, &tsid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	security_compute_av_user(current_selinux_ns, ssid, tsid, tclass, &avd);
+	security_compute_av_user(ns, ssid, tsid, tclass, &avd);
 
 	length = scnprintf(buf, SIMPLE_TRANSACTION_LIMIT,
 			  "%x %x %x %x %u %x",
@@ -801,6 +861,8 @@ static ssize_t sel_write_access(struct file *file, char *buf, size_t size)
 
 static ssize_t sel_write_create(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *scon = NULL, *tcon = NULL;
 	char *namebuf = NULL, *objname = NULL;
 	u32 ssid, tsid, newsid;
@@ -866,23 +928,20 @@ static ssize_t sel_write_create(struct file *file, char *buf, size_t size)
 		objname = namebuf;
 	}
 
-	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, scon, &ssid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, tcon, &tsid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_transition_sid_user(current_selinux_ns, ssid, tsid,
-					      tclass, objname, &newsid);
+	length = security_transition_sid_user(ns, ssid, tsid, tclass, objname,
+					      &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
-					 &len);
+	length = security_sid_to_context(ns, newsid, &newcon, &len);
 	if (length)
 		goto out;
 
@@ -905,6 +964,8 @@ static ssize_t sel_write_create(struct file *file, char *buf, size_t size)
 
 static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *scon = NULL, *tcon = NULL;
 	u32 ssid, tsid, newsid;
 	u16 tclass;
@@ -932,23 +993,19 @@ static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, scon, &ssid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, tcon, &tsid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_change_sid(current_selinux_ns, ssid, tsid, tclass,
-				     &newsid);
+	length = security_change_sid(ns, ssid, tsid, tclass, &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
-					 &len);
+	length = security_sid_to_context(ns, newsid, &newcon, &len);
 	if (length)
 		goto out;
 
@@ -967,6 +1024,8 @@ static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size)
 
 static ssize_t sel_write_user(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *con = NULL, *user = NULL, *ptr;
 	u32 sid, *sids = NULL;
 	ssize_t length;
@@ -994,21 +1053,18 @@ static ssize_t sel_write_user(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s", con, user) != 2)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, con, &sid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, con, &sid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_get_user_sids(current_selinux_ns, sid, user, &sids,
-					&nsids);
+	length = security_get_user_sids(ns, sid, user, &sids, &nsids);
 	if (length)
 		goto out;
 
 	length = sprintf(buf, "%u", nsids) + 1;
 	ptr = buf + length;
 	for (i = 0; i < nsids; i++) {
-		rc = security_sid_to_context(current_selinux_ns, sids[i],
-					     &newcon, &len);
+		rc = security_sid_to_context(ns, sids[i], &newcon, &len);
 		if (rc) {
 			length = rc;
 			goto out;
@@ -1032,6 +1088,8 @@ static ssize_t sel_write_user(struct file *file, char *buf, size_t size)
 
 static ssize_t sel_write_member(struct file *file, char *buf, size_t size)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *scon = NULL, *tcon = NULL;
 	u32 ssid, tsid, newsid;
 	u16 tclass;
@@ -1059,23 +1117,19 @@ static ssize_t sel_write_member(struct file *file, char *buf, size_t size)
 	if (sscanf(buf, "%s %s %hu", scon, tcon, &tclass) != 3)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, scon, &ssid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, scon, &ssid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_context_str_to_sid(current_selinux_ns, tcon, &tsid,
-					     GFP_KERNEL);
+	length = security_context_str_to_sid(ns, tcon, &tsid, GFP_KERNEL);
 	if (length)
 		goto out;
 
-	length = security_member_sid(current_selinux_ns, ssid, tsid, tclass,
-				     &newsid);
+	length = security_member_sid(ns, ssid, tsid, tclass, &newsid);
 	if (length)
 		goto out;
 
-	length = security_sid_to_context(current_selinux_ns, newsid, &newcon,
-					 &len);
+	length = security_sid_to_context(ns, newsid, &newcon, &len);
 	if (length)
 		goto out;
 
@@ -1109,6 +1163,8 @@ static struct inode *sel_make_inode(struct super_block *sb, int mode)
 static ssize_t sel_read_bool(struct file *filep, char __user *buf,
 			     size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filep)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page = NULL;
 	ssize_t length;
 	ssize_t ret;
@@ -1116,10 +1172,11 @@ static ssize_t sel_read_bool(struct file *filep, char __user *buf,
 	unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
 	const char *name = filep->f_path.dentry->d_name.name;
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	ret = -EINVAL;
-	if (index >= bool_num || strcmp(name, bool_pending_names[index]))
+	if (index >= fsi->bool_num || strcmp(name,
+					     fsi->bool_pending_names[index]))
 		goto out;
 
 	ret = -ENOMEM;
@@ -1127,16 +1184,16 @@ static ssize_t sel_read_bool(struct file *filep, char __user *buf,
 	if (!page)
 		goto out;
 
-	cur_enforcing = security_get_bool_value(current_selinux_ns, index);
+	cur_enforcing = security_get_bool_value(ns, index);
 	if (cur_enforcing < 0) {
 		ret = cur_enforcing;
 		goto out;
 	}
 	length = scnprintf(page, PAGE_SIZE, "%d %d", cur_enforcing,
-			  bool_pending_values[index]);
+			  fsi->bool_pending_values[index]);
 	ret = simple_read_from_buffer(buf, count, ppos, page, length);
 out:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 	free_page((unsigned long)page);
 	return ret;
 }
@@ -1144,13 +1201,14 @@ static ssize_t sel_read_bool(struct file *filep, char __user *buf,
 static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filep)->i_sb->s_fs_info;
 	char *page = NULL;
 	ssize_t length;
 	int new_value;
 	unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
 	const char *name = filep->f_path.dentry->d_name.name;
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETBOOL,
@@ -1159,7 +1217,8 @@ static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 		goto out;
 
 	length = -EINVAL;
-	if (index >= bool_num || strcmp(name, bool_pending_names[index]))
+	if (index >= fsi->bool_num || strcmp(name,
+					     fsi->bool_pending_names[index]))
 		goto out;
 
 	length = -ENOMEM;
@@ -1185,11 +1244,11 @@ static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 	if (new_value)
 		new_value = 1;
 
-	bool_pending_values[index] = new_value;
+	fsi->bool_pending_values[index] = new_value;
 	length = count;
 
 out:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 	kfree(page);
 	return length;
 }
@@ -1204,11 +1263,13 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 				      const char __user *buf,
 				      size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filep)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page = NULL;
 	ssize_t length;
 	int new_value;
 
-	mutex_lock(&sel_mutex);
+	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETBOOL,
@@ -1237,15 +1298,15 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 		goto out;
 
 	length = 0;
-	if (new_value && bool_pending_values)
-		length = security_set_bools(current_selinux_ns, bool_num,
-					    bool_pending_values);
+	if (new_value && fsi->bool_pending_values)
+		length = security_set_bools(ns, fsi->bool_num,
+					    fsi->bool_pending_values);
 
 	if (!length)
 		length = count;
 
 out:
-	mutex_unlock(&sel_mutex);
+	mutex_unlock(&fsi->mutex);
 	kfree(page);
 	return length;
 }
@@ -1263,12 +1324,12 @@ static void sel_remove_entries(struct dentry *de)
 
 #define BOOL_DIR_NAME "booleans"
 
-static int sel_make_bools(void)
+static int sel_make_bools(struct selinux_fs_info *fsi)
 {
 	int i, ret;
 	ssize_t len;
 	struct dentry *dentry = NULL;
-	struct dentry *dir = bool_dir;
+	struct dentry *dir = fsi->bool_dir;
 	struct inode *inode = NULL;
 	struct inode_security_struct *isec;
 	char **names = NULL, *page;
@@ -1277,13 +1338,13 @@ static int sel_make_bools(void)
 	u32 sid;
 
 	/* remove any existing files */
-	for (i = 0; i < bool_num; i++)
-		kfree(bool_pending_names[i]);
-	kfree(bool_pending_names);
-	kfree(bool_pending_values);
-	bool_num = 0;
-	bool_pending_names = NULL;
-	bool_pending_values = NULL;
+	for (i = 0; i < fsi->bool_num; i++)
+		kfree(fsi->bool_pending_names[i]);
+	kfree(fsi->bool_pending_names);
+	kfree(fsi->bool_pending_values);
+	fsi->bool_num = 0;
+	fsi->bool_pending_names = NULL;
+	fsi->bool_pending_values = NULL;
 
 	sel_remove_entries(dir);
 
@@ -1292,7 +1353,7 @@ static int sel_make_bools(void)
 	if (!page)
 		goto out;
 
-	ret = security_get_bools(current_selinux_ns, &num, &names, &values);
+	ret = security_get_bools(fsi->ns, &num, &names, &values);
 	if (ret)
 		goto out;
 
@@ -1313,7 +1374,7 @@ static int sel_make_bools(void)
 			goto out;
 
 		isec = (struct inode_security_struct *)inode->i_security;
-		ret = security_genfs_sid(current_selinux_ns, "selinuxfs", page,
+		ret = security_genfs_sid(fsi->ns, "selinuxfs", page,
 					 SECCLASS_FILE, &sid);
 		if (ret) {
 			pr_warn_ratelimited("SELinux: no sid found, defaulting to security isid for %s\n",
@@ -1327,9 +1388,9 @@ static int sel_make_bools(void)
 		inode->i_ino = i|SEL_BOOL_INO_OFFSET;
 		d_add(dentry, inode);
 	}
-	bool_num = num;
-	bool_pending_names = names;
-	bool_pending_values = values;
+	fsi->bool_num = num;
+	fsi->bool_pending_names = names;
+	fsi->bool_pending_values = values;
 
 	free_page((unsigned long)page);
 	return 0;
@@ -1347,10 +1408,6 @@ static int sel_make_bools(void)
 	return ret;
 }
 
-#define NULL_FILE_NAME "null"
-
-struct path selinux_null;
-
 static ssize_t sel_read_avc_cache_threshold(struct file *filp, char __user *buf,
 					    size_t count, loff_t *ppos)
 {
@@ -1500,6 +1557,8 @@ static const struct file_operations sel_avc_cache_stats_ops = {
 
 static int sel_make_avc_files(struct dentry *dir)
 {
+	struct super_block *sb = dir->d_sb;
+	struct selinux_fs_info *fsi = sb->s_fs_info;
 	int i;
 	static const struct tree_descr files[] = {
 		{ "cache_threshold",
@@ -1523,7 +1582,7 @@ static int sel_make_avc_files(struct dentry *dir)
 			return -ENOMEM;
 
 		inode->i_fop = files[i].ops;
-		inode->i_ino = ++sel_last_ino;
+		inode->i_ino = ++fsi->last_ino;
 		d_add(dentry, inode);
 	}
 
@@ -1533,12 +1592,14 @@ static int sel_make_avc_files(struct dentry *dir)
 static ssize_t sel_read_initcon(struct file *file, char __user *buf,
 				size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *con;
 	u32 sid, len;
 	ssize_t ret;
 
 	sid = file_inode(file)->i_ino&SEL_INO_MASK;
-	ret = security_sid_to_context(current_selinux_ns, sid, &con, &len);
+	ret = security_sid_to_context(ns, sid, &con, &len);
 	if (ret)
 		return ret;
 
@@ -1626,13 +1687,14 @@ static const struct file_operations sel_perm_ops = {
 static ssize_t sel_read_policycap(struct file *file, char __user *buf,
 				  size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	int value;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 	unsigned long i_ino = file_inode(file)->i_ino;
 
-	value = security_policycap_supported(current_selinux_ns,
-					     i_ino & SEL_INO_MASK);
+	value = security_policycap_supported(ns, i_ino & SEL_INO_MASK);
 	length = scnprintf(tmpbuf, TMPBUFLEN, "%d", value);
 
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
@@ -1646,11 +1708,12 @@ static const struct file_operations sel_policycap_ops = {
 static int sel_make_perm_files(char *objclass, int classvalue,
 				struct dentry *dir)
 {
+	struct selinux_fs_info *fsi = dir->d_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	int i, rc, nperms;
 	char **perms;
 
-	rc = security_get_permissions(current_selinux_ns, objclass, &perms,
-				      &nperms);
+	rc = security_get_permissions(ns, objclass, &perms, &nperms);
 	if (rc)
 		return rc;
 
@@ -1684,6 +1747,8 @@ static int sel_make_perm_files(char *objclass, int classvalue,
 static int sel_make_class_dir_entries(char *classname, int index,
 					struct dentry *dir)
 {
+	struct super_block *sb = dir->d_sb;
+	struct selinux_fs_info *fsi = sb->s_fs_info;
 	struct dentry *dentry = NULL;
 	struct inode *inode = NULL;
 	int rc;
@@ -1700,7 +1765,7 @@ static int sel_make_class_dir_entries(char *classname, int index,
 	inode->i_ino = sel_class_to_ino(index);
 	d_add(dentry, inode);
 
-	dentry = sel_make_dir(dir, "perms", &last_class_ino);
+	dentry = sel_make_dir(dir, "perms", &fsi->last_class_ino);
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
@@ -1709,26 +1774,27 @@ static int sel_make_class_dir_entries(char *classname, int index,
 	return rc;
 }
 
-static int sel_make_classes(void)
+static int sel_make_classes(struct selinux_fs_info *fsi)
 {
+
 	int rc, nclasses, i;
 	char **classes;
 
 	/* delete any existing entries */
-	sel_remove_entries(class_dir);
+	sel_remove_entries(fsi->class_dir);
 
-	rc = security_get_classes(current_selinux_ns, &classes, &nclasses);
+	rc = security_get_classes(fsi->ns, &classes, &nclasses);
 	if (rc)
 		return rc;
 
 	/* +2 since classes are 1-indexed */
-	last_class_ino = sel_class_to_ino(nclasses + 2);
+	fsi->last_class_ino = sel_class_to_ino(nclasses + 2);
 
 	for (i = 0; i < nclasses; i++) {
 		struct dentry *class_name_dir;
 
-		class_name_dir = sel_make_dir(class_dir, classes[i],
-				&last_class_ino);
+		class_name_dir = sel_make_dir(fsi->class_dir, classes[i],
+					      &fsi->last_class_ino);
 		if (IS_ERR(class_name_dir)) {
 			rc = PTR_ERR(class_name_dir);
 			goto out;
@@ -1748,25 +1814,25 @@ static int sel_make_classes(void)
 	return rc;
 }
 
-static int sel_make_policycap(void)
+static int sel_make_policycap(struct selinux_fs_info *fsi)
 {
 	unsigned int iter;
 	struct dentry *dentry = NULL;
 	struct inode *inode = NULL;
 
-	sel_remove_entries(policycap_dir);
+	sel_remove_entries(fsi->policycap_dir);
 
 	for (iter = 0; iter <= POLICYDB_CAPABILITY_MAX; iter++) {
 		if (iter < ARRAY_SIZE(selinux_policycap_names))
-			dentry = d_alloc_name(policycap_dir,
+			dentry = d_alloc_name(fsi->policycap_dir,
 					      selinux_policycap_names[iter]);
 		else
-			dentry = d_alloc_name(policycap_dir, "unknown");
+			dentry = d_alloc_name(fsi->policycap_dir, "unknown");
 
 		if (dentry == NULL)
 			return -ENOMEM;
 
-		inode = sel_make_inode(policycap_dir->d_sb, S_IFREG | S_IRUGO);
+		inode = sel_make_inode(fsi->sb, S_IFREG | 0444);
 		if (inode == NULL)
 			return -ENOMEM;
 
@@ -1805,8 +1871,11 @@ static struct dentry *sel_make_dir(struct dentry *dir, const char *name,
 	return dentry;
 }
 
+#define NULL_FILE_NAME "null"
+
 static int sel_fill_super(struct super_block *sb, void *data, int silent)
 {
+	struct selinux_fs_info *fsi;
 	int ret;
 	struct dentry *dentry;
 	struct inode *inode;
@@ -1834,14 +1903,20 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 					S_IWUGO},
 		/* last one */ {""}
 	};
+
+	ret = selinux_fs_info_create(sb);
+	if (ret)
+		goto err;
+
 	ret = simple_fill_super(sb, SELINUX_MAGIC, selinux_files);
 	if (ret)
 		goto err;
 
-	bool_dir = sel_make_dir(sb->s_root, BOOL_DIR_NAME, &sel_last_ino);
-	if (IS_ERR(bool_dir)) {
-		ret = PTR_ERR(bool_dir);
-		bool_dir = NULL;
+	fsi = sb->s_fs_info;
+	fsi->bool_dir = sel_make_dir(sb->s_root, BOOL_DIR_NAME, &fsi->last_ino);
+	if (IS_ERR(fsi->bool_dir)) {
+		ret = PTR_ERR(fsi->bool_dir);
+		fsi->bool_dir = NULL;
 		goto err;
 	}
 
@@ -1855,7 +1930,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 	if (!inode)
 		goto err;
 
-	inode->i_ino = ++sel_last_ino;
+	inode->i_ino = ++fsi->last_ino;
 	isec = (struct inode_security_struct *)inode->i_security;
 	isec->sid = SECINITSID_DEVNULL;
 	isec->sclass = SECCLASS_CHR_FILE;
@@ -1863,9 +1938,8 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 
 	init_special_inode(inode, S_IFCHR | S_IRUGO | S_IWUGO, MKDEV(MEM_MAJOR, 3));
 	d_add(dentry, inode);
-	selinux_null.dentry = dentry;
 
-	dentry = sel_make_dir(sb->s_root, "avc", &sel_last_ino);
+	dentry = sel_make_dir(sb->s_root, "avc", &fsi->last_ino);
 	if (IS_ERR(dentry)) {
 		ret = PTR_ERR(dentry);
 		goto err;
@@ -1875,7 +1949,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 	if (ret)
 		goto err;
 
-	dentry = sel_make_dir(sb->s_root, "initial_contexts", &sel_last_ino);
+	dentry = sel_make_dir(sb->s_root, "initial_contexts", &fsi->last_ino);
 	if (IS_ERR(dentry)) {
 		ret = PTR_ERR(dentry);
 		goto err;
@@ -1885,42 +1959,79 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 	if (ret)
 		goto err;
 
-	class_dir = sel_make_dir(sb->s_root, "class", &sel_last_ino);
-	if (IS_ERR(class_dir)) {
-		ret = PTR_ERR(class_dir);
-		class_dir = NULL;
+	fsi->class_dir = sel_make_dir(sb->s_root, "class", &fsi->last_ino);
+	if (IS_ERR(fsi->class_dir)) {
+		ret = PTR_ERR(fsi->class_dir);
+		fsi->class_dir = NULL;
 		goto err;
 	}
 
-	policycap_dir = sel_make_dir(sb->s_root, "policy_capabilities", &sel_last_ino);
-	if (IS_ERR(policycap_dir)) {
-		ret = PTR_ERR(policycap_dir);
-		policycap_dir = NULL;
+	fsi->policycap_dir = sel_make_dir(sb->s_root, "policy_capabilities",
+					  &fsi->last_ino);
+	if (IS_ERR(fsi->policycap_dir)) {
+		ret = PTR_ERR(fsi->policycap_dir);
+		fsi->policycap_dir = NULL;
 		goto err;
 	}
+
+	ret = sel_make_policy_nodes(fsi);
+	if (ret)
+		goto err;
 	return 0;
 err:
 	printk(KERN_ERR "SELinux: %s:  failed while creating inodes\n",
 		__func__);
+
 	return ret;
 }
 
+static int selinuxfs_compare(struct super_block *sb, void *p)
+{
+	struct selinux_fs_info *fsi = sb->s_fs_info;
+
+	return (current_selinux_ns == fsi->ns);
+}
+
 static struct dentry *sel_mount(struct file_system_type *fs_type,
 		      int flags, const char *dev_name, void *data)
 {
-	return mount_single(fs_type, flags, data, sel_fill_super);
+	int (*fill_super)(struct super_block *, void *, int) = sel_fill_super;
+	struct super_block *s;
+	int error;
+
+	s = sget(fs_type, selinuxfs_compare, set_anon_super, flags, NULL);
+	if (IS_ERR(s))
+		return ERR_CAST(s);
+	if (!s->s_root) {
+		error = fill_super(s, data, flags & MS_SILENT ? 1 : 0);
+		if (error) {
+			deactivate_locked_super(s);
+			return ERR_PTR(error);
+		}
+		s->s_flags |= MS_ACTIVE;
+	}
+	return dget(s->s_root);
+}
+
+static void sel_kill_sb(struct super_block *sb)
+{
+	selinux_fs_info_free(sb);
+	kill_litter_super(sb);
 }
 
 static struct file_system_type sel_fs_type = {
 	.name		= "selinuxfs",
 	.mount		= sel_mount,
-	.kill_sb	= kill_litter_super,
+	.kill_sb	= sel_kill_sb,
 };
 
 struct vfsmount *selinuxfs_mount;
+struct path selinux_null;
 
 static int __init init_sel_fs(void)
 {
+	struct qstr null_name = QSTR_INIT(NULL_FILE_NAME,
+					  sizeof(NULL_FILE_NAME)-1);
 	int err;
 
 	if (!selinux_enabled)
@@ -1942,6 +2053,13 @@ static int __init init_sel_fs(void)
 		err = PTR_ERR(selinuxfs_mount);
 		selinuxfs_mount = NULL;
 	}
+	selinux_null.dentry = d_hash_and_lookup(selinux_null.mnt->mnt_root,
+						&null_name);
+	if (IS_ERR(selinux_null.dentry)) {
+		printk(KERN_ERR "selinuxfs:  could not lookup null!\n");
+		err = PTR_ERR(selinux_null.dentry);
+		selinux_null.dentry = NULL;
+	}
 
 	return err;
 }
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index bc2eacd..1e202b0 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2823,6 +2823,13 @@ int security_get_bools(struct selinux_ns *ns,
 	struct policydb *policydb;
 	int i, rc;
 
+	if (!ns->initialized) {
+		*len = 0;
+		*names = NULL;
+		*values = NULL;
+		return 0;
+	}
+
 	read_lock(&ns->ss->policy_rwlock);
 
 	policydb = &ns->ss->policydb;
@@ -3153,6 +3160,12 @@ int security_get_classes(struct selinux_ns *ns,
 	struct policydb *policydb = &ns->ss->policydb;
 	int rc;
 
+	if (!ns->initialized) {
+		*nclasses = 0;
+		*classes = NULL;
+		return 0;
+	}
+
 	read_lock(&ns->ss->policy_rwlock);
 
 	rc = -ENOMEM;
diff --git a/security/selinux/ss/status.c b/security/selinux/ss/status.c
index 537badc..545686e 100644
--- a/security/selinux/ss/status.c
+++ b/security/selinux/ss/status.c
@@ -78,7 +78,7 @@ struct page *selinux_kernel_status_page(struct selinux_ns *ns)
  *
  * It updates status of the current enforcing/permissive mode.
  */
-void selinux_status_update_setenforce(struct selinux_ns *ns, int enforcing)
+void selinux_status_update_setenforce(struct selinux_ns *ns)
 {
 	struct selinux_kernel_status   *status;
 
@@ -89,7 +89,7 @@ void selinux_status_update_setenforce(struct selinux_ns *ns, int enforcing)
 		status->sequence++;
 		smp_wmb();
 
-		status->enforcing = enforcing;
+		status->enforcing = ns_enforcing(ns);
 
 		smp_wmb();
 		status->sequence++;
-- 
2.9.5


* [RFC 03/10] selinux: move the AVC into the selinux namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
  2017-10-02 15:58 ` [RFC 01/10] selinux: introduce a selinux namespace Stephen Smalley
  2017-10-02 15:58 ` [RFC 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-09  3:10   ` James Morris
  2017-10-02 15:58 ` [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace Stephen Smalley
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Move the access vector cache (AVC) into the selinux namespace
structure and pass it explicitly to all AVC functions.  The
AVC private state is encapsulated in a selinux_avc structure
that is allocated and freed by the AVC during selinux namespace
creation and destruction.

This is necessary to support multiple selinux namespaces since
the AVC caches state (e.g. SIDs, policy sequence number) that
is maintained and provided by the security server on a per-namespace
basis.

This change by itself should have no effect on SELinux behavior or
APIs (userspace or LSM).
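
To illustrate why the cache has to live in the namespace, here is a minimal
userspace sketch, not the kernel code itself. All toy_* names, the
direct-mapped slot layout, and the constants are hypothetical and chosen only
for illustration; the real AVC uses RCU-protected hash chains with per-slot
spinlocks. The sketch shows that once each namespace owns its own cache
object, the same (ssid, tsid, tclass) key can hold a different decision in
each namespace.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_CACHE_SLOTS       512   /* illustrative; must be a power of two */
#define TOY_DEF_THRESHOLD     512   /* illustrative default threshold */

/* One cached decision; a real AVC node carries far more state. */
struct toy_avc_entry {
	unsigned int ssid, tsid;
	unsigned short tclass;
	unsigned int allowed;
	int valid;
};

/* Private AVC state, one instance per namespace (cf. struct selinux_avc). */
struct toy_avc {
	unsigned int cache_threshold;
	struct toy_avc_entry slots[TOY_CACHE_SLOTS];
};

/* Stand-in for the namespace structure that owns the AVC. */
struct toy_ns {
	struct toy_avc *avc;
};

/* Allocate and initialize the cache when the namespace is created. */
static int toy_avc_create(struct toy_avc **avc)
{
	struct toy_avc *newavc = calloc(1, sizeof(*newavc));

	if (!newavc)
		return -1;
	newavc->cache_threshold = TOY_DEF_THRESHOLD;
	*avc = newavc;
	return 0;
}

/* Release the cache when the namespace is destroyed. */
static void toy_avc_free(struct toy_avc *avc)
{
	free(avc);
}

static unsigned int toy_avc_hash(unsigned int ssid, unsigned int tsid,
				 unsigned short tclass)
{
	return (ssid ^ (tsid << 2) ^ ((unsigned int)tclass << 4)) &
		(TOY_CACHE_SLOTS - 1);
}

/* Every operation takes the cache explicitly; there is no global state. */
static void toy_avc_insert(struct toy_avc *avc, unsigned int ssid,
			   unsigned int tsid, unsigned short tclass,
			   unsigned int allowed)
{
	struct toy_avc_entry *e = &avc->slots[toy_avc_hash(ssid, tsid, tclass)];

	e->ssid = ssid;
	e->tsid = tsid;
	e->tclass = tclass;
	e->allowed = allowed;
	e->valid = 1;
}

/* Returns 1 and fills *allowed on a hit, 0 on a miss. */
static int toy_avc_lookup(const struct toy_avc *avc, unsigned int ssid,
			  unsigned int tsid, unsigned short tclass,
			  unsigned int *allowed)
{
	const struct toy_avc_entry *e =
		&avc->slots[toy_avc_hash(ssid, tsid, tclass)];

	if (!e->valid || e->ssid != ssid || e->tsid != tsid ||
	    e->tclass != tclass)
		return 0;
	*allowed = e->allowed;
	return 1;
}

/* Two namespaces cache different decisions for the same numeric key. */
static int toy_demo(void)
{
	struct toy_ns host = { 0 }, child = { 0 };
	unsigned int allowed = 0;

	if (toy_avc_create(&host.avc))
		return -1;
	if (toy_avc_create(&child.avc)) {
		toy_avc_free(host.avc);
		return -1;
	}

	toy_avc_insert(host.avc, 5, 7, 2, 0x3);
	toy_avc_insert(child.avc, 5, 7, 2, 0x1);

	assert(toy_avc_lookup(host.avc, 5, 7, 2, &allowed) && allowed == 0x3);
	assert(toy_avc_lookup(child.avc, 5, 7, 2, &allowed) && allowed == 0x1);

	toy_avc_free(child.avc);
	toy_avc_free(host.avc);
	return 0;
}
```

Calling toy_demo() returns 0 when both namespaces keep their own answer for
SID pair (5, 7) and class 2. With a single global cache, the second insert
would have overwritten the first, which is the hazard this patch removes for
the real AVC.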

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/avc.c              | 293 +++++++++++++++------------
 security/selinux/hooks.c            | 382 ++++++++++++++++++++++++------------
 security/selinux/include/avc.h      |  32 +--
 security/selinux/include/avc_ss.h   |   3 +-
 security/selinux/include/security.h |   5 +
 security/selinux/netlabel.c         |   3 +-
 security/selinux/selinuxfs.c        |  60 ++++--
 security/selinux/ss/services.c      |   9 +-
 security/selinux/xfrm.c             |  17 +-
 9 files changed, 515 insertions(+), 289 deletions(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index a5a4d05a..bce8fa1 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -82,14 +82,56 @@ struct avc_callback_node {
 	struct avc_callback_node *next;
 };
 
-/* Exported via selinufs */
-unsigned int avc_cache_threshold = AVC_DEF_CACHE_THRESHOLD;
-
 #ifdef CONFIG_SECURITY_SELINUX_AVC_STATS
 DEFINE_PER_CPU(struct avc_cache_stats, avc_cache_stats) = { 0 };
 #endif
 
-static struct avc_cache avc_cache;
+struct selinux_avc {
+	unsigned int avc_cache_threshold;
+	struct avc_cache avc_cache;
+};
+
+int selinux_avc_create(struct selinux_avc **avc)
+{
+	struct selinux_avc *newavc;
+	int i;
+
+	newavc = kzalloc(sizeof(*newavc), GFP_KERNEL);
+	if (!newavc)
+		return -ENOMEM;
+
+	newavc->avc_cache_threshold = AVC_DEF_CACHE_THRESHOLD;
+
+	for (i = 0; i < AVC_CACHE_SLOTS; i++) {
+		INIT_HLIST_HEAD(&newavc->avc_cache.slots[i]);
+		spin_lock_init(&newavc->avc_cache.slots_lock[i]);
+	}
+	atomic_set(&newavc->avc_cache.active_nodes, 0);
+	atomic_set(&newavc->avc_cache.lru_hint, 0);
+
+	*avc = newavc;
+	return 0;
+}
+
+static void avc_flush(struct selinux_avc *avc);
+
+void selinux_avc_free(struct selinux_avc *avc)
+{
+	avc_flush(avc);
+	kfree(avc);
+}
+
+unsigned int avc_get_cache_threshold(struct selinux_avc *avc)
+{
+	return avc->avc_cache_threshold;
+}
+
+void avc_set_cache_threshold(struct selinux_avc *avc,
+			     unsigned int cache_threshold)
+{
+	avc->avc_cache_threshold = cache_threshold;
+}
+
 static struct avc_callback_node *avc_callbacks;
 static struct kmem_cache *avc_node_cachep;
 static struct kmem_cache *avc_xperms_data_cachep;
@@ -143,14 +185,14 @@ static void avc_dump_av(struct audit_buffer *ab, u16 tclass, u32 av)
  * @tsid: target security identifier
  * @tclass: target security class
  */
-static void avc_dump_query(struct audit_buffer *ab, u32 ssid, u32 tsid, u16 tclass)
+static void avc_dump_query(struct audit_buffer *ab, struct selinux_ns *ns,
+			   u32 ssid, u32 tsid, u16 tclass)
 {
 	int rc;
 	char *scontext;
 	u32 scontext_len;
 
-	rc = security_sid_to_context(current_selinux_ns, ssid,
-				     &scontext, &scontext_len);
+	rc = security_sid_to_context(ns, ssid, &scontext, &scontext_len);
 	if (rc)
 		audit_log_format(ab, "ssid=%d", ssid);
 	else {
@@ -158,8 +200,7 @@ static void avc_dump_query(struct audit_buffer *ab, u32 ssid, u32 tsid, u16 tcla
 		kfree(scontext);
 	}
 
-	rc = security_sid_to_context(current_selinux_ns, tsid,
-				     &scontext, &scontext_len);
+	rc = security_sid_to_context(ns, tsid, &scontext, &scontext_len);
 	if (rc)
 		audit_log_format(ab, " tsid=%d", tsid);
 	else {
@@ -178,15 +219,6 @@ static void avc_dump_query(struct audit_buffer *ab, u32 ssid, u32 tsid, u16 tcla
  */
 void __init avc_init(void)
 {
-	int i;
-
-	for (i = 0; i < AVC_CACHE_SLOTS; i++) {
-		INIT_HLIST_HEAD(&avc_cache.slots[i]);
-		spin_lock_init(&avc_cache.slots_lock[i]);
-	}
-	atomic_set(&avc_cache.active_nodes, 0);
-	atomic_set(&avc_cache.lru_hint, 0);
-
 	avc_node_cachep = kmem_cache_create("avc_node", sizeof(struct avc_node),
 					0, SLAB_PANIC, NULL);
 	avc_xperms_cachep = kmem_cache_create("avc_xperms_node",
@@ -201,7 +233,7 @@ void __init avc_init(void)
 					0, SLAB_PANIC, NULL);
 }
 
-int avc_get_hash_stats(char *page)
+int avc_get_hash_stats(struct selinux_avc *avc, char *page)
 {
 	int i, chain_len, max_chain_len, slots_used;
 	struct avc_node *node;
@@ -212,7 +244,7 @@ int avc_get_hash_stats(char *page)
 	slots_used = 0;
 	max_chain_len = 0;
 	for (i = 0; i < AVC_CACHE_SLOTS; i++) {
-		head = &avc_cache.slots[i];
+		head = &avc->avc_cache.slots[i];
 		if (!hlist_empty(head)) {
 			slots_used++;
 			chain_len = 0;
@@ -227,7 +259,7 @@ int avc_get_hash_stats(char *page)
 
 	return scnprintf(page, PAGE_SIZE, "entries: %d\nbuckets used: %d/%d\n"
 			 "longest chain: %d\n",
-			 atomic_read(&avc_cache.active_nodes),
+			 atomic_read(&avc->avc_cache.active_nodes),
 			 slots_used, AVC_CACHE_SLOTS, max_chain_len);
 }
 
@@ -464,11 +496,12 @@ static inline u32 avc_xperms_audit_required(u32 requested,
 	return audited;
 }
 
-static inline int avc_xperms_audit(u32 ssid, u32 tsid, u16 tclass,
-				u32 requested, struct av_decision *avd,
-				struct extended_perms_decision *xpd,
-				u8 perm, int result,
-				struct common_audit_data *ad)
+static inline int avc_xperms_audit(struct selinux_ns *ns,
+				   u32 ssid, u32 tsid, u16 tclass,
+				   u32 requested, struct av_decision *avd,
+				   struct extended_perms_decision *xpd,
+				   u8 perm, int result,
+				   struct common_audit_data *ad)
 {
 	u32 audited, denied;
 
@@ -476,7 +509,7 @@ static inline int avc_xperms_audit(u32 ssid, u32 tsid, u16 tclass,
 			requested, avd, xpd, perm, result, &denied);
 	if (likely(!audited))
 		return 0;
-	return slow_avc_audit(ssid, tsid, tclass, requested,
+	return slow_avc_audit(ns, ssid, tsid, tclass, requested,
 			audited, denied, result, ad, 0);
 }
 
@@ -488,29 +521,30 @@ static void avc_node_free(struct rcu_head *rhead)
 	avc_cache_stats_incr(frees);
 }
 
-static void avc_node_delete(struct avc_node *node)
+static void avc_node_delete(struct selinux_avc *avc, struct avc_node *node)
 {
 	hlist_del_rcu(&node->list);
 	call_rcu(&node->rhead, avc_node_free);
-	atomic_dec(&avc_cache.active_nodes);
+	atomic_dec(&avc->avc_cache.active_nodes);
 }
 
-static void avc_node_kill(struct avc_node *node)
+static void avc_node_kill(struct selinux_avc *avc, struct avc_node *node)
 {
 	avc_xperms_free(node->ae.xp_node);
 	kmem_cache_free(avc_node_cachep, node);
 	avc_cache_stats_incr(frees);
-	atomic_dec(&avc_cache.active_nodes);
+	atomic_dec(&avc->avc_cache.active_nodes);
 }
 
-static void avc_node_replace(struct avc_node *new, struct avc_node *old)
+static void avc_node_replace(struct selinux_avc *avc,
+			     struct avc_node *new, struct avc_node *old)
 {
 	hlist_replace_rcu(&old->list, &new->list);
 	call_rcu(&old->rhead, avc_node_free);
-	atomic_dec(&avc_cache.active_nodes);
+	atomic_dec(&avc->avc_cache.active_nodes);
 }
 
-static inline int avc_reclaim_node(void)
+static inline int avc_reclaim_node(struct selinux_avc *avc)
 {
 	struct avc_node *node;
 	int hvalue, try, ecx;
@@ -519,16 +553,17 @@ static inline int avc_reclaim_node(void)
 	spinlock_t *lock;
 
 	for (try = 0, ecx = 0; try < AVC_CACHE_SLOTS; try++) {
-		hvalue = atomic_inc_return(&avc_cache.lru_hint) & (AVC_CACHE_SLOTS - 1);
-		head = &avc_cache.slots[hvalue];
-		lock = &avc_cache.slots_lock[hvalue];
+		hvalue = atomic_inc_return(&avc->avc_cache.lru_hint) &
+			(AVC_CACHE_SLOTS - 1);
+		head = &avc->avc_cache.slots[hvalue];
+		lock = &avc->avc_cache.slots_lock[hvalue];
 
 		if (!spin_trylock_irqsave(lock, flags))
 			continue;
 
 		rcu_read_lock();
 		hlist_for_each_entry(node, head, list) {
-			avc_node_delete(node);
+			avc_node_delete(avc, node);
 			avc_cache_stats_incr(reclaims);
 			ecx++;
 			if (ecx >= AVC_CACHE_RECLAIM) {
@@ -544,7 +579,7 @@ static inline int avc_reclaim_node(void)
 	return ecx;
 }
 
-static struct avc_node *avc_alloc_node(void)
+static struct avc_node *avc_alloc_node(struct selinux_avc *avc)
 {
 	struct avc_node *node;
 
@@ -555,8 +590,9 @@ static struct avc_node *avc_alloc_node(void)
 	INIT_HLIST_NODE(&node->list);
 	avc_cache_stats_incr(allocations);
 
-	if (atomic_inc_return(&avc_cache.active_nodes) > avc_cache_threshold)
-		avc_reclaim_node();
+	if (atomic_inc_return(&avc->avc_cache.active_nodes) >
+	    avc->avc_cache_threshold)
+		avc_reclaim_node(avc);
 
 out:
 	return node;
@@ -570,14 +606,15 @@ static void avc_node_populate(struct avc_node *node, u32 ssid, u32 tsid, u16 tcl
 	memcpy(&node->ae.avd, avd, sizeof(node->ae.avd));
 }
 
-static inline struct avc_node *avc_search_node(u32 ssid, u32 tsid, u16 tclass)
+static inline struct avc_node *avc_search_node(struct selinux_avc *avc,
+					       u32 ssid, u32 tsid, u16 tclass)
 {
 	struct avc_node *node, *ret = NULL;
 	int hvalue;
 	struct hlist_head *head;
 
 	hvalue = avc_hash(ssid, tsid, tclass);
-	head = &avc_cache.slots[hvalue];
+	head = &avc->avc_cache.slots[hvalue];
 	hlist_for_each_entry_rcu(node, head, list) {
 		if (ssid == node->ae.ssid &&
 		    tclass == node->ae.tclass &&
@@ -602,12 +639,13 @@ static inline struct avc_node *avc_search_node(u32 ssid, u32 tsid, u16 tclass)
  * then this function returns the avc_node.
  * Otherwise, this function returns NULL.
  */
-static struct avc_node *avc_lookup(u32 ssid, u32 tsid, u16 tclass)
+static struct avc_node *avc_lookup(struct selinux_avc *avc,
+				   u32 ssid, u32 tsid, u16 tclass)
 {
 	struct avc_node *node;
 
 	avc_cache_stats_incr(lookups);
-	node = avc_search_node(ssid, tsid, tclass);
+	node = avc_search_node(avc, ssid, tsid, tclass);
 
 	if (node)
 		return node;
@@ -616,7 +654,8 @@ static struct avc_node *avc_lookup(u32 ssid, u32 tsid, u16 tclass)
 	return NULL;
 }
 
-static int avc_latest_notif_update(int seqno, int is_insert)
+static int avc_latest_notif_update(struct selinux_avc *avc,
+				   int seqno, int is_insert)
 {
 	int ret = 0;
 	static DEFINE_SPINLOCK(notif_lock);
@@ -624,14 +663,14 @@ static int avc_latest_notif_update(int seqno, int is_insert)
 
 	spin_lock_irqsave(&notif_lock, flag);
 	if (is_insert) {
-		if (seqno < avc_cache.latest_notif) {
+		if (seqno < avc->avc_cache.latest_notif) {
 			printk(KERN_WARNING "SELinux: avc:  seqno %d < latest_notif %d\n",
-			       seqno, avc_cache.latest_notif);
+			       seqno, avc->avc_cache.latest_notif);
 			ret = -EAGAIN;
 		}
 	} else {
-		if (seqno > avc_cache.latest_notif)
-			avc_cache.latest_notif = seqno;
+		if (seqno > avc->avc_cache.latest_notif)
+			avc->avc_cache.latest_notif = seqno;
 	}
 	spin_unlock_irqrestore(&notif_lock, flag);
 
@@ -656,18 +695,19 @@ static int avc_latest_notif_update(int seqno, int is_insert)
  * the access vectors into a cache entry, returns
  * avc_node inserted. Otherwise, this function returns NULL.
  */
-static struct avc_node *avc_insert(u32 ssid, u32 tsid, u16 tclass,
-				struct av_decision *avd,
-				struct avc_xperms_node *xp_node)
+static struct avc_node *avc_insert(struct selinux_avc *avc,
+				   u32 ssid, u32 tsid, u16 tclass,
+				   struct av_decision *avd,
+				   struct avc_xperms_node *xp_node)
 {
 	struct avc_node *pos, *node = NULL;
 	int hvalue;
 	unsigned long flag;
 
-	if (avc_latest_notif_update(avd->seqno, 1))
+	if (avc_latest_notif_update(avc, avd->seqno, 1))
 		goto out;
 
-	node = avc_alloc_node();
+	node = avc_alloc_node(avc);
 	if (node) {
 		struct hlist_head *head;
 		spinlock_t *lock;
@@ -680,15 +720,15 @@ static struct avc_node *avc_insert(u32 ssid, u32 tsid, u16 tclass,
 			kmem_cache_free(avc_node_cachep, node);
 			return NULL;
 		}
-		head = &avc_cache.slots[hvalue];
-		lock = &avc_cache.slots_lock[hvalue];
+		head = &avc->avc_cache.slots[hvalue];
+		lock = &avc->avc_cache.slots_lock[hvalue];
 
 		spin_lock_irqsave(lock, flag);
 		hlist_for_each_entry(pos, head, list) {
 			if (pos->ae.ssid == ssid &&
 			    pos->ae.tsid == tsid &&
 			    pos->ae.tclass == tclass) {
-				avc_node_replace(node, pos);
+				avc_node_replace(avc, node, pos);
 				goto found;
 			}
 		}
@@ -726,9 +766,10 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
 {
 	struct common_audit_data *ad = a;
 	audit_log_format(ab, " ");
-	avc_dump_query(ab, ad->selinux_audit_data->ssid,
-			   ad->selinux_audit_data->tsid,
-			   ad->selinux_audit_data->tclass);
+	avc_dump_query(ab, ad->selinux_audit_data->ns,
+		       ad->selinux_audit_data->ssid,
+		       ad->selinux_audit_data->tsid,
+		       ad->selinux_audit_data->tclass);
 	if (ad->selinux_audit_data->denied) {
 		audit_log_format(ab, " permissive=%u",
 				 ad->selinux_audit_data->result ? 0 : 1);
@@ -736,10 +777,11 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
 }
 
 /* This is the slow part of avc audit with big stack footprint */
-noinline int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
-		u32 requested, u32 audited, u32 denied, int result,
-		struct common_audit_data *a,
-		unsigned flags)
+noinline int slow_avc_audit(struct selinux_ns *ns,
+			    u32 ssid, u32 tsid, u16 tclass,
+			    u32 requested, u32 audited, u32 denied, int result,
+			    struct common_audit_data *a,
+			    unsigned int flags)
 {
 	struct common_audit_data stack_data;
 	struct selinux_audit_data sad;
@@ -767,6 +809,7 @@ noinline int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
 	sad.audited = audited;
 	sad.denied = denied;
 	sad.result = result;
+	sad.ns = ns;
 
 	a->selinux_audit_data = &sad;
 
@@ -815,10 +858,11 @@ int __init avc_add_callback(int (*callback)(u32 event), u32 events)
  * otherwise, this function updates the AVC entry. The original AVC-entry object
  * will release later by RCU.
  */
-static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
-			u32 tsid, u16 tclass, u32 seqno,
-			struct extended_perms_decision *xpd,
-			u32 flags)
+static int avc_update_node(struct selinux_avc *avc,
+			   u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
+			   u32 tsid, u16 tclass, u32 seqno,
+			   struct extended_perms_decision *xpd,
+			   u32 flags)
 {
 	int hvalue, rc = 0;
 	unsigned long flag;
@@ -826,7 +870,7 @@ static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
 	struct hlist_head *head;
 	spinlock_t *lock;
 
-	node = avc_alloc_node();
+	node = avc_alloc_node(avc);
 	if (!node) {
 		rc = -ENOMEM;
 		goto out;
@@ -835,8 +879,8 @@ static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
 	/* Lock the target slot */
 	hvalue = avc_hash(ssid, tsid, tclass);
 
-	head = &avc_cache.slots[hvalue];
-	lock = &avc_cache.slots_lock[hvalue];
+	head = &avc->avc_cache.slots[hvalue];
+	lock = &avc->avc_cache.slots_lock[hvalue];
 
 	spin_lock_irqsave(lock, flag);
 
@@ -852,7 +896,7 @@ static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
 
 	if (!orig) {
 		rc = -ENOENT;
-		avc_node_kill(node);
+		avc_node_kill(avc, node);
 		goto out_unlock;
 	}
 
@@ -896,7 +940,7 @@ static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
 		avc_add_xperms_decision(node, xpd);
 		break;
 	}
-	avc_node_replace(node, orig);
+	avc_node_replace(avc, node, orig);
 out_unlock:
 	spin_unlock_irqrestore(lock, flag);
 out:
@@ -906,7 +950,7 @@ static int avc_update_node(u32 event, u32 perms, u8 driver, u8 xperm, u32 ssid,
 /**
  * avc_flush - Flush the cache
  */
-static void avc_flush(void)
+static void avc_flush(struct selinux_avc *avc)
 {
 	struct hlist_head *head;
 	struct avc_node *node;
@@ -915,8 +959,8 @@ static void avc_flush(void)
 	int i;
 
 	for (i = 0; i < AVC_CACHE_SLOTS; i++) {
-		head = &avc_cache.slots[i];
-		lock = &avc_cache.slots_lock[i];
+		head = &avc->avc_cache.slots[i];
+		lock = &avc->avc_cache.slots_lock[i];
 
 		spin_lock_irqsave(lock, flag);
 		/*
@@ -925,7 +969,7 @@ static void avc_flush(void)
 		 */
 		rcu_read_lock();
 		hlist_for_each_entry(node, head, list)
-			avc_node_delete(node);
+			avc_node_delete(avc, node);
 		rcu_read_unlock();
 		spin_unlock_irqrestore(lock, flag);
 	}
@@ -935,12 +979,12 @@ static void avc_flush(void)
  * avc_ss_reset - Flush the cache and revalidate migrated permissions.
  * @seqno: policy sequence number
  */
-int avc_ss_reset(u32 seqno)
+int avc_ss_reset(struct selinux_avc *avc, u32 seqno)
 {
 	struct avc_callback_node *c;
 	int rc = 0, tmprc;
 
-	avc_flush();
+	avc_flush(avc);
 
 	for (c = avc_callbacks; c; c = c->next) {
 		if (c->events & AVC_CALLBACK_RESET) {
@@ -952,7 +996,7 @@ int avc_ss_reset(u32 seqno)
 		}
 	}
 
-	avc_latest_notif_update(seqno, 0);
+	avc_latest_notif_update(avc, seqno, 0);
 	return rc;
 }
 
@@ -965,31 +1009,33 @@ int avc_ss_reset(u32 seqno)
  * Don't inline this, since it's the slow-path and just
  * results in a bigger stack frame.
  */
-static noinline struct avc_node *avc_compute_av(u32 ssid, u32 tsid,
-			 u16 tclass, struct av_decision *avd,
-			 struct avc_xperms_node *xp_node)
+static noinline struct avc_node *avc_compute_av(struct selinux_ns *ns,
+						u32 ssid, u32 tsid,
+						u16 tclass,
+						struct av_decision *avd,
+						struct avc_xperms_node *xp_node)
 {
 	rcu_read_unlock();
 	INIT_LIST_HEAD(&xp_node->xpd_head);
-	security_compute_av(current_selinux_ns, ssid, tsid, tclass,
-			    avd, &xp_node->xp);
+	security_compute_av(ns, ssid, tsid, tclass, avd, &xp_node->xp);
 	rcu_read_lock();
-	return avc_insert(ssid, tsid, tclass, avd, xp_node);
+	return avc_insert(ns->avc, ssid, tsid, tclass, avd, xp_node);
 }
 
-static noinline int avc_denied(u32 ssid, u32 tsid,
-				u16 tclass, u32 requested,
-				u8 driver, u8 xperm, unsigned flags,
-				struct av_decision *avd)
+static noinline int avc_denied(struct selinux_ns *ns,
+			       u32 ssid, u32 tsid,
+			       u16 tclass, u32 requested,
+			       u8 driver, u8 xperm, unsigned int flags,
+			       struct av_decision *avd)
 {
 	if (flags & AVC_STRICT)
 		return -EACCES;
 
-	if (selinux_enforcing && !(avd->flags & AVD_FLAGS_PERMISSIVE))
+	if (ns_enforcing(ns) && !(avd->flags & AVD_FLAGS_PERMISSIVE))
 		return -EACCES;
 
-	avc_update_node(AVC_CALLBACK_GRANT, requested, driver, xperm, ssid,
-				tsid, tclass, avd->seqno, NULL, flags);
+	avc_update_node(ns->avc, AVC_CALLBACK_GRANT, requested, driver, xperm,
+			ssid, tsid, tclass, avd->seqno, NULL, flags);
 	return 0;
 }
 
@@ -1000,8 +1046,9 @@ static noinline int avc_denied(u32 ssid, u32 tsid,
  * as-is the case with ioctls, then multiple may be chained together and the
  * driver field is used to specify which set contains the permission.
  */
-int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
-			u8 driver, u8 xperm, struct common_audit_data *ad)
+int avc_has_extended_perms(struct selinux_ns *ns,
+			   u32 ssid, u32 tsid, u16 tclass, u32 requested,
+			   u8 driver, u8 xperm, struct common_audit_data *ad)
 {
 	struct avc_node *node;
 	struct av_decision avd;
@@ -1020,9 +1067,9 @@ int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
 
 	rcu_read_lock();
 
-	node = avc_lookup(ssid, tsid, tclass);
+	node = avc_lookup(ns->avc, ssid, tsid, tclass);
 	if (unlikely(!node)) {
-		node = avc_compute_av(ssid, tsid, tclass, &avd, xp_node);
+		node = avc_compute_av(ns, ssid, tsid, tclass, &avd, xp_node);
 	} else {
 		memcpy(&avd, &node->ae.avd, sizeof(avd));
 		xp_node = node->ae.xp_node;
@@ -1046,11 +1093,12 @@ int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
 			goto decision;
 		}
 		rcu_read_unlock();
-		security_compute_xperms_decision(current_selinux_ns, ssid, tsid,
-						 tclass, driver, &local_xpd);
+		security_compute_xperms_decision(ns, ssid, tsid, tclass, driver,
+						 &local_xpd);
 		rcu_read_lock();
-		avc_update_node(AVC_CALLBACK_ADD_XPERMS, requested, driver, xperm,
-				ssid, tsid, tclass, avd.seqno, &local_xpd, 0);
+		avc_update_node(ns->avc, AVC_CALLBACK_ADD_XPERMS, requested,
+				driver, xperm, ssid, tsid, tclass, avd.seqno,
+				&local_xpd, 0);
 	} else {
 		avc_quick_copy_xperms_decision(xperm, &local_xpd, xpd);
 	}
@@ -1062,12 +1110,12 @@ int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
 decision:
 	denied = requested & ~(avd.allowed);
 	if (unlikely(denied))
-		rc = avc_denied(ssid, tsid, tclass, requested, driver, xperm,
-				AVC_EXTENDED_PERMS, &avd);
+		rc = avc_denied(ns, ssid, tsid, tclass, requested,
+				driver, xperm, AVC_EXTENDED_PERMS, &avd);
 
 	rcu_read_unlock();
 
-	rc2 = avc_xperms_audit(ssid, tsid, tclass, requested,
+	rc2 = avc_xperms_audit(ns, ssid, tsid, tclass, requested,
 			&avd, xpd, xperm, rc, ad);
 	if (rc2)
 		return rc2;
@@ -1094,10 +1142,11 @@ int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
  * auditing, e.g. in cases where a lock must be held for the check but
  * should be released for the auditing.
  */
-inline int avc_has_perm_noaudit(u32 ssid, u32 tsid,
-			 u16 tclass, u32 requested,
-			 unsigned flags,
-			 struct av_decision *avd)
+inline int avc_has_perm_noaudit(struct selinux_ns *ns,
+				u32 ssid, u32 tsid,
+				u16 tclass, u32 requested,
+				unsigned int flags,
+				struct av_decision *avd)
 {
 	struct avc_node *node;
 	struct avc_xperms_node xp_node;
@@ -1108,15 +1157,16 @@ inline int avc_has_perm_noaudit(u32 ssid, u32 tsid,
 
 	rcu_read_lock();
 
-	node = avc_lookup(ssid, tsid, tclass);
+	node = avc_lookup(ns->avc, ssid, tsid, tclass);
 	if (unlikely(!node))
-		node = avc_compute_av(ssid, tsid, tclass, avd, &xp_node);
+		node = avc_compute_av(ns, ssid, tsid, tclass, avd, &xp_node);
 	else
 		memcpy(avd, &node->ae.avd, sizeof(*avd));
 
 	denied = requested & ~(avd->allowed);
 	if (unlikely(denied))
-		rc = avc_denied(ssid, tsid, tclass, requested, 0, 0, flags, avd);
+		rc = avc_denied(ns, ssid, tsid, tclass, requested, 0, 0,
+				flags, avd);
 
 	rcu_read_unlock();
 	return rc;
@@ -1138,39 +1188,40 @@ inline int avc_has_perm_noaudit(u32 ssid, u32 tsid,
  * permissions are granted, -%EACCES if any permissions are denied, or
  * another -errno upon other errors.
  */
-int avc_has_perm(u32 ssid, u32 tsid, u16 tclass,
+int avc_has_perm(struct selinux_ns *ns, u32 ssid, u32 tsid, u16 tclass,
 		 u32 requested, struct common_audit_data *auditdata)
 {
 	struct av_decision avd;
 	int rc, rc2;
 
-	rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd);
+	rc = avc_has_perm_noaudit(ns, ssid, tsid, tclass, requested, 0, &avd);
 
-	rc2 = avc_audit(ssid, tsid, tclass, requested, &avd, rc, auditdata, 0);
+	rc2 = avc_audit(ns, ssid, tsid, tclass, requested, &avd, rc,
+			auditdata, 0);
 	if (rc2)
 		return rc2;
 	return rc;
 }
 
-int avc_has_perm_flags(u32 ssid, u32 tsid, u16 tclass,
+int avc_has_perm_flags(struct selinux_ns *ns, u32 ssid, u32 tsid, u16 tclass,
 		       u32 requested, struct common_audit_data *auditdata,
 		       int flags)
 {
 	struct av_decision avd;
 	int rc, rc2;
 
-	rc = avc_has_perm_noaudit(ssid, tsid, tclass, requested, 0, &avd);
+	rc = avc_has_perm_noaudit(ns, ssid, tsid, tclass, requested, 0, &avd);
 
-	rc2 = avc_audit(ssid, tsid, tclass, requested, &avd, rc,
+	rc2 = avc_audit(ns, ssid, tsid, tclass, requested, &avd, rc,
 			auditdata, flags);
 	if (rc2)
 		return rc2;
 	return rc;
 }
 
-u32 avc_policy_seqno(void)
+u32 avc_policy_seqno(struct selinux_ns *ns)
 {
-	return avc_cache.latest_notif;
+	return ns->avc->avc_cache.latest_notif;
 }
 
 void avc_disable(void)
@@ -1187,7 +1238,7 @@ void avc_disable(void)
 	 * the cache and get that memory back.
 	 */
 	if (avc_node_cachep) {
-		avc_flush();
+		avc_flush(init_selinux_ns->avc);
 		/* kmem_cache_destroy(avc_node_cachep); */
 	}
 }
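
Note on the avc.c changes above: every AVC entry point now takes an
explicit struct selinux_ns pointer, so the enforcing mode and the cache
are looked up per namespace rather than through globals.  The following
is only a minimal user-space sketch of that pattern, not kernel code.
All model_* names, the MODEL_* constants and the struct layouts are
assumptions made for illustration; only the control flow of
avc_denied() and avc_policy_seqno() is taken from the patch, and the
cache update done by avc_update_node() is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for AVC_STRICT and AVD_FLAGS_PERMISSIVE. */
#define MODEL_AVC_STRICT		0x1u
#define MODEL_AVD_FLAGS_PERMISSIVE	0x1u

/* Assumed layout: one cache object per namespace. */
struct model_avc {
	uint32_t latest_notif;	/* last policy seqno seen by this cache */
};

/* Assumed layout: per-namespace enforcing mode plus its own cache. */
struct model_selinux_ns {
	bool enforcing;
	struct model_avc *avc;
};

struct model_av_decision {
	uint32_t allowed;
	uint32_t flags;
};

/*
 * Same decision order as avc_denied() in the patch: strict checks
 * always fail, otherwise the result depends on the enforcing mode of
 * the namespace passed in and on the permissive flag of the decision.
 */
int model_avc_denied(const struct model_selinux_ns *ns,
		     unsigned int flags,
		     const struct model_av_decision *avd)
{
	if (flags & MODEL_AVC_STRICT)
		return -EACCES;

	if (ns->enforcing && !(avd->flags & MODEL_AVD_FLAGS_PERMISSIVE))
		return -EACCES;

	return 0;	/* not enforced: the access proceeds */
}

/* Same shape as avc_policy_seqno(): read the seqno of this namespace. */
uint32_t model_avc_policy_seqno(const struct model_selinux_ns *ns)
{
	return ns->avc->latest_notif;
}
```

With two model namespaces, one enforcing and one not, the same denied
request yields -EACCES in the first and 0 in the second, and each
namespace reports the sequence number of its own cache.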
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9eb48a1..25f5147 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -462,12 +462,14 @@ static int may_context_mount_sb_relabel(u32 sid,
 	const struct task_security_struct *tsec = cred->security;
 	int rc;
 
-	rc = avc_has_perm(tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
+	rc = avc_has_perm(current_selinux_ns,
+			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(tsec->sid, sid, SECCLASS_FILESYSTEM,
+	rc = avc_has_perm(current_selinux_ns,
+			  tsec->sid, sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELTO, NULL);
 	return rc;
 }
@@ -478,12 +480,14 @@ static int may_context_mount_inode_relabel(u32 sid,
 {
 	const struct task_security_struct *tsec = cred->security;
 	int rc;
-	rc = avc_has_perm(tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
+	rc = avc_has_perm(current_selinux_ns,
+			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(sid, sbsec->sid, SECCLASS_FILESYSTEM,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__ASSOCIATE, NULL);
 	return rc;
 }
@@ -1768,9 +1772,11 @@ static int cred_has_capability(const struct cred *cred,
 		return -EINVAL;
 	}
 
-	rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd);
+	rc = avc_has_perm_noaudit(current_selinux_ns,
+				  sid, sid, sclass, av, 0, &avd);
 	if (audit == SECURITY_CAP_AUDIT) {
-		int rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad, 0);
+		int rc2 = avc_audit(current_selinux_ns,
+				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
 			return rc2;
 	}
@@ -1796,7 +1802,8 @@ static int inode_has_perm(const struct cred *cred,
 	sid = cred_sid(cred);
 	isec = inode->i_security;
 
-	return avc_has_perm(sid, isec->sid, isec->sclass, perms, adp);
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, isec->sclass, perms, adp);
 }
 
 /* Same as inode_has_perm, but pass explicit audit data containing
@@ -1865,7 +1872,8 @@ static int file_has_perm(const struct cred *cred,
 	ad.u.file = file;
 
 	if (sid != fsec->sid) {
-		rc = avc_has_perm(sid, fsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, fsec->sid,
 				  SECCLASS_FD,
 				  FD__USE,
 				  &ad);
@@ -1929,7 +1937,8 @@ static int may_create(struct inode *dir,
 	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry;
 
-	rc = avc_has_perm(sid, dsec->sid, SECCLASS_DIR,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, dsec->sid, SECCLASS_DIR,
 			  DIR__ADD_NAME | DIR__SEARCH,
 			  &ad);
 	if (rc)
@@ -1940,11 +1949,13 @@ static int may_create(struct inode *dir,
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(sid, newsid, tclass, FILE__CREATE, &ad);
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, newsid, tclass, FILE__CREATE, &ad);
 	if (rc)
 		return rc;
 
-	return avc_has_perm(newsid, sbsec->sid,
+	return avc_has_perm(current_selinux_ns,
+			    newsid, sbsec->sid,
 			    SECCLASS_FILESYSTEM,
 			    FILESYSTEM__ASSOCIATE, &ad);
 }
@@ -1973,7 +1984,8 @@ static int may_link(struct inode *dir,
 
 	av = DIR__SEARCH;
 	av |= (kind ? DIR__REMOVE_NAME : DIR__ADD_NAME);
-	rc = avc_has_perm(sid, dsec->sid, SECCLASS_DIR, av, &ad);
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, dsec->sid, SECCLASS_DIR, av, &ad);
 	if (rc)
 		return rc;
 
@@ -1993,7 +2005,8 @@ static int may_link(struct inode *dir,
 		return 0;
 	}
 
-	rc = avc_has_perm(sid, isec->sid, isec->sclass, av, &ad);
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, isec->sclass, av, &ad);
 	return rc;
 }
 
@@ -2017,16 +2030,19 @@ static inline int may_rename(struct inode *old_dir,
 	ad.type = LSM_AUDIT_DATA_DENTRY;
 
 	ad.u.dentry = old_dentry;
-	rc = avc_has_perm(sid, old_dsec->sid, SECCLASS_DIR,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, old_dsec->sid, SECCLASS_DIR,
 			  DIR__REMOVE_NAME | DIR__SEARCH, &ad);
 	if (rc)
 		return rc;
-	rc = avc_has_perm(sid, old_isec->sid,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, old_isec->sid,
 			  old_isec->sclass, FILE__RENAME, &ad);
 	if (rc)
 		return rc;
 	if (old_is_dir && new_dir != old_dir) {
-		rc = avc_has_perm(sid, old_isec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, old_isec->sid,
 				  old_isec->sclass, DIR__REPARENT, &ad);
 		if (rc)
 			return rc;
@@ -2036,13 +2052,15 @@ static inline int may_rename(struct inode *old_dir,
 	av = DIR__ADD_NAME | DIR__SEARCH;
 	if (d_is_positive(new_dentry))
 		av |= DIR__REMOVE_NAME;
-	rc = avc_has_perm(sid, new_dsec->sid, SECCLASS_DIR, av, &ad);
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, new_dsec->sid, SECCLASS_DIR, av, &ad);
 	if (rc)
 		return rc;
 	if (d_is_positive(new_dentry)) {
 		new_isec = backing_inode_security(new_dentry);
 		new_is_dir = d_is_dir(new_dentry);
-		rc = avc_has_perm(sid, new_isec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, new_isec->sid,
 				  new_isec->sclass,
 				  (new_is_dir ? DIR__RMDIR : FILE__UNLINK), &ad);
 		if (rc)
@@ -2062,7 +2080,8 @@ static int superblock_has_perm(const struct cred *cred,
 	u32 sid = cred_sid(cred);
 
 	sbsec = sb->s_security;
-	return avc_has_perm(sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
+	return avc_has_perm(current_selinux_ns,
+			    sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
 }
 
 /* Convert a Linux mode and permission mask to an access vector. */
@@ -2138,7 +2157,8 @@ static int selinux_binder_set_context_mgr(struct task_struct *mgr)
 	u32 mysid = current_sid();
 	u32 mgrsid = task_sid(mgr);
 
-	return avc_has_perm(mysid, mgrsid, SECCLASS_BINDER,
+	return avc_has_perm(current_selinux_ns,
+			    mysid, mgrsid, SECCLASS_BINDER,
 			    BINDER__SET_CONTEXT_MGR, NULL);
 }
 
@@ -2151,13 +2171,15 @@ static int selinux_binder_transaction(struct task_struct *from,
 	int rc;
 
 	if (mysid != fromsid) {
-		rc = avc_has_perm(mysid, fromsid, SECCLASS_BINDER,
+		rc = avc_has_perm(current_selinux_ns,
+				  mysid, fromsid, SECCLASS_BINDER,
 				  BINDER__IMPERSONATE, NULL);
 		if (rc)
 			return rc;
 	}
 
-	return avc_has_perm(fromsid, tosid, SECCLASS_BINDER, BINDER__CALL,
+	return avc_has_perm(current_selinux_ns,
+			    fromsid, tosid, SECCLASS_BINDER, BINDER__CALL,
 			    NULL);
 }
 
@@ -2167,7 +2189,8 @@ static int selinux_binder_transfer_binder(struct task_struct *from,
 	u32 fromsid = task_sid(from);
 	u32 tosid = task_sid(to);
 
-	return avc_has_perm(fromsid, tosid, SECCLASS_BINDER, BINDER__TRANSFER,
+	return avc_has_perm(current_selinux_ns,
+			    fromsid, tosid, SECCLASS_BINDER, BINDER__TRANSFER,
 			    NULL);
 }
 
@@ -2186,7 +2209,8 @@ static int selinux_binder_transfer_file(struct task_struct *from,
 	ad.u.path = file->f_path;
 
 	if (sid != fsec->sid) {
-		rc = avc_has_perm(sid, fsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, fsec->sid,
 				  SECCLASS_FD,
 				  FD__USE,
 				  &ad);
@@ -2198,7 +2222,8 @@ static int selinux_binder_transfer_file(struct task_struct *from,
 		return 0;
 
 	isec = backing_inode_security(dentry);
-	return avc_has_perm(sid, isec->sid, isec->sclass, file_to_av(file),
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, isec->sclass, file_to_av(file),
 			    &ad);
 }
 
@@ -2209,21 +2234,25 @@ static int selinux_ptrace_access_check(struct task_struct *child,
 	u32 csid = task_sid(child);
 
 	if (mode & PTRACE_MODE_READ)
-		return avc_has_perm(sid, csid, SECCLASS_FILE, FILE__READ, NULL);
+		return avc_has_perm(current_selinux_ns,
+				    sid, csid, SECCLASS_FILE, FILE__READ, NULL);
 
-	return avc_has_perm(sid, csid, SECCLASS_PROCESS, PROCESS__PTRACE, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    sid, csid, SECCLASS_PROCESS, PROCESS__PTRACE, NULL);
 }
 
 static int selinux_ptrace_traceme(struct task_struct *parent)
 {
-	return avc_has_perm(task_sid(parent), current_sid(), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    task_sid(parent), current_sid(), SECCLASS_PROCESS,
 			    PROCESS__PTRACE, NULL);
 }
 
 static int selinux_capget(struct task_struct *target, kernel_cap_t *effective,
 			  kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
-	return avc_has_perm(current_sid(), task_sid(target), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(target), SECCLASS_PROCESS,
 			    PROCESS__GETCAP, NULL);
 }
 
@@ -2232,7 +2261,8 @@ static int selinux_capset(struct cred *new, const struct cred *old,
 			  const kernel_cap_t *inheritable,
 			  const kernel_cap_t *permitted)
 {
-	return avc_has_perm(cred_sid(old), cred_sid(new), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    cred_sid(old), cred_sid(new), SECCLASS_PROCESS,
 			    PROCESS__SETCAP, NULL);
 }
 
@@ -2292,18 +2322,21 @@ static int selinux_syslog(int type)
 	switch (type) {
 	case SYSLOG_ACTION_READ_ALL:	/* Read last kernel messages */
 	case SYSLOG_ACTION_SIZE_BUFFER:	/* Return size of the log buffer */
-		return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), SECINITSID_KERNEL,
 				    SECCLASS_SYSTEM, SYSTEM__SYSLOG_READ, NULL);
 	case SYSLOG_ACTION_CONSOLE_OFF:	/* Disable logging to console */
 	case SYSLOG_ACTION_CONSOLE_ON:	/* Enable logging to console */
 	/* Set level of messages printed to console */
 	case SYSLOG_ACTION_CONSOLE_LEVEL:
-		return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), SECINITSID_KERNEL,
 				    SECCLASS_SYSTEM, SYSTEM__SYSLOG_CONSOLE,
 				    NULL);
 	}
 	/* All other syslog types */
-	return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), SECINITSID_KERNEL,
 			    SECCLASS_SYSTEM, SYSTEM__SYSLOG_MOD, NULL);
 }
 
@@ -2370,7 +2403,8 @@ static int check_nnp_nosuid(const struct linux_binprm *bprm,
 			av |= PROCESS2__NNP_TRANSITION;
 		if (nosuid)
 			av |= PROCESS2__NOSUID_TRANSITION;
-		rc = avc_has_perm(old_tsec->sid, new_tsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  old_tsec->sid, new_tsec->sid,
 				  SECCLASS_PROCESS2, av, NULL);
 		if (!rc)
 			return 0;
@@ -2453,25 +2487,29 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 	ad.u.file = bprm->file;
 
 	if (new_tsec->sid == old_tsec->sid) {
-		rc = avc_has_perm(old_tsec->sid, isec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  old_tsec->sid, isec->sid,
 				  SECCLASS_FILE, FILE__EXECUTE_NO_TRANS, &ad);
 		if (rc)
 			return rc;
 	} else {
 		/* Check permissions for the transition. */
-		rc = avc_has_perm(old_tsec->sid, new_tsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  old_tsec->sid, new_tsec->sid,
 				  SECCLASS_PROCESS, PROCESS__TRANSITION, &ad);
 		if (rc)
 			return rc;
 
-		rc = avc_has_perm(new_tsec->sid, isec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  new_tsec->sid, isec->sid,
 				  SECCLASS_FILE, FILE__ENTRYPOINT, &ad);
 		if (rc)
 			return rc;
 
 		/* Check for shared state */
 		if (bprm->unsafe & LSM_UNSAFE_SHARE) {
-			rc = avc_has_perm(old_tsec->sid, new_tsec->sid,
+			rc = avc_has_perm(current_selinux_ns,
+					  old_tsec->sid, new_tsec->sid,
 					  SECCLASS_PROCESS, PROCESS__SHARE,
 					  NULL);
 			if (rc)
@@ -2483,7 +2521,8 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 		if (bprm->unsafe & LSM_UNSAFE_PTRACE) {
 			u32 ptsid = ptrace_parent_sid();
 			if (ptsid != 0) {
-				rc = avc_has_perm(ptsid, new_tsec->sid,
+				rc = avc_has_perm(current_selinux_ns,
+						  ptsid, new_tsec->sid,
 						  SECCLASS_PROCESS,
 						  PROCESS__PTRACE, NULL);
 				if (rc)
@@ -2497,7 +2536,8 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 		/* Enable secure mode for SIDs transitions unless
 		   the noatsecure permission is granted between
 		   the two SIDs, i.e. ahp returns 0. */
-		rc = avc_has_perm(old_tsec->sid, new_tsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  old_tsec->sid, new_tsec->sid,
 				  SECCLASS_PROCESS, PROCESS__NOATSECURE,
 				  NULL);
 		bprm->secureexec |= !!rc;
@@ -2589,7 +2629,8 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm)
 	 * higher than the default soft limit for cases where the default is
 	 * lower than the hard limit, e.g. RLIMIT_CORE or RLIMIT_STACK.
 	 */
-	rc = avc_has_perm(new_tsec->osid, new_tsec->sid, SECCLASS_PROCESS,
+	rc = avc_has_perm(current_selinux_ns,
+			  new_tsec->osid, new_tsec->sid, SECCLASS_PROCESS,
 			  PROCESS__RLIMITINH, NULL);
 	if (rc) {
 		/* protect against do_prlimit() */
@@ -2629,7 +2670,8 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm)
 	 * This must occur _after_ the task SID has been updated so that any
 	 * kill done after the flush will be checked against the new SID.
 	 */
-	rc = avc_has_perm(osid, sid, SECCLASS_PROCESS, PROCESS__SIGINH, NULL);
+	rc = avc_has_perm(current_selinux_ns,
+			  osid, sid, SECCLASS_PROCESS, PROCESS__SIGINH, NULL);
 	if (rc) {
 		if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
 			memset(&itimer, 0, sizeof itimer);
@@ -3059,7 +3101,8 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	return avc_has_perm_flags(sid, isec->sid, isec->sclass, FILE__READ, &ad,
+	return avc_has_perm_flags(current_selinux_ns,
+				  sid, isec->sid, isec->sclass, FILE__READ, &ad,
 				  rcu ? MAY_NOT_BLOCK : 0);
 }
 
@@ -3075,7 +3118,8 @@ static noinline int audit_inode_permission(struct inode *inode,
 	ad.type = LSM_AUDIT_DATA_INODE;
 	ad.u.inode = inode;
 
-	rc = slow_avc_audit(current_sid(), isec->sid, isec->sclass, perms,
+	rc = slow_avc_audit(current_selinux_ns,
+			    current_sid(), isec->sid, isec->sclass, perms,
 			    audited, denied, result, &ad, flags);
 	if (rc)
 		return rc;
@@ -3113,7 +3157,8 @@ static int selinux_inode_permission(struct inode *inode, int mask)
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
+	rc = avc_has_perm_noaudit(current_selinux_ns,
+				  sid, isec->sid, isec->sclass, perms, 0, &avd);
 	audited = avc_audit_required(perms, &avd, rc,
 				     from_access ? FILE__AUDIT_ACCESS : 0,
 				     &denied);
@@ -3216,7 +3261,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	ad.u.dentry = dentry;
 
 	isec = backing_inode_security(dentry);
-	rc = avc_has_perm(sid, isec->sid, isec->sclass,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, isec->sclass,
 			  FILE__RELABELFROM, &ad);
 	if (rc)
 		return rc;
@@ -3254,7 +3300,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(sid, newsid, isec->sclass,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, newsid, isec->sclass,
 			  FILE__RELABELTO, &ad);
 	if (rc)
 		return rc;
@@ -3264,7 +3311,8 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (rc)
 		return rc;
 
-	return avc_has_perm(newsid,
+	return avc_has_perm(current_selinux_ns,
+			    newsid,
 			    sbsec->sid,
 			    SECCLASS_FILESYSTEM,
 			    FILESYSTEM__ASSOCIATE,
@@ -3475,7 +3523,7 @@ static int selinux_file_permission(struct file *file, int mask)
 
 	isec = inode_security(inode);
 	if (sid == fsec->sid && fsec->isid == isec->sid &&
-	    fsec->pseqno == avc_policy_seqno())
+	    fsec->pseqno == avc_policy_seqno(current_selinux_ns))
 		/* No change since file_open check. */
 		return 0;
 
@@ -3515,7 +3563,8 @@ static int ioctl_has_perm(const struct cred *cred, struct file *file,
 	ad.u.op->path = file->f_path;
 
 	if (ssid != fsec->sid) {
-		rc = avc_has_perm(ssid, fsec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  ssid, fsec->sid,
 				SECCLASS_FD,
 				FD__USE,
 				&ad);
@@ -3527,8 +3576,9 @@ static int ioctl_has_perm(const struct cred *cred, struct file *file,
 		return 0;
 
 	isec = inode_security(inode);
-	rc = avc_has_extended_perms(ssid, isec->sid, isec->sclass,
-			requested, driver, xperm, &ad);
+	rc = avc_has_extended_perms(current_selinux_ns,
+				    ssid, isec->sid, isec->sclass,
+				    requested, driver, xperm, &ad);
 out:
 	return rc;
 }
@@ -3596,7 +3646,8 @@ static int file_map_prot_check(struct file *file, unsigned long prot, int shared
 		 * private file mapping that will also be writable.
 		 * This has an additional check.
 		 */
-		rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, sid, SECCLASS_PROCESS,
 				  PROCESS__EXECMEM, NULL);
 		if (rc)
 			goto error;
@@ -3626,7 +3677,8 @@ static int selinux_mmap_addr(unsigned long addr)
 
 	if (addr < CONFIG_LSM_MMAP_MIN_ADDR) {
 		u32 sid = current_sid();
-		rc = avc_has_perm(sid, sid, SECCLASS_MEMPROTECT,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, sid, SECCLASS_MEMPROTECT,
 				  MEMPROTECT__MMAP_ZERO, NULL);
 	}
 
@@ -3670,13 +3722,15 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 		int rc = 0;
 		if (vma->vm_start >= vma->vm_mm->start_brk &&
 		    vma->vm_end <= vma->vm_mm->brk) {
-			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
+			rc = avc_has_perm(current_selinux_ns,
+					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
 		} else if (!vma->vm_file &&
 			   ((vma->vm_start <= vma->vm_mm->start_stack &&
 			     vma->vm_end >= vma->vm_mm->start_stack) ||
 			    vma_is_stack_for_current(vma))) {
-			rc = avc_has_perm(sid, sid, SECCLASS_PROCESS,
+			rc = avc_has_perm(current_selinux_ns,
+					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);
 		} else if (vma->vm_file && vma->anon_vma) {
 			/*
@@ -3768,7 +3822,8 @@ static int selinux_file_send_sigiotask(struct task_struct *tsk,
 	else
 		perm = signal_to_av(signum);
 
-	return avc_has_perm(fsec->fown_sid, sid,
+	return avc_has_perm(current_selinux_ns,
+			    fsec->fown_sid, sid,
 			    SECCLASS_PROCESS, perm, NULL);
 }
 
@@ -3794,7 +3849,7 @@ static int selinux_file_open(struct file *file, const struct cred *cred)
 	 * struct as its SID.
 	 */
 	fsec->isid = isec->sid;
-	fsec->pseqno = avc_policy_seqno();
+	fsec->pseqno = avc_policy_seqno(current_selinux_ns);
 	/*
 	 * Since the inode label or policy seqno may have changed
 	 * between the selinux_inode_permission check and the saving
@@ -3813,7 +3868,8 @@ static int selinux_task_alloc(struct task_struct *task,
 {
 	u32 sid = current_sid();
 
-	return avc_has_perm(sid, sid, SECCLASS_PROCESS, PROCESS__FORK, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    sid, sid, SECCLASS_PROCESS, PROCESS__FORK, NULL);
 }
 
 /*
@@ -3887,7 +3943,8 @@ static int selinux_kernel_act_as(struct cred *new, u32 secid)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(sid, secid,
+	ret = avc_has_perm(current_selinux_ns,
+			   sid, secid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__USE_AS_OVERRIDE,
 			   NULL);
@@ -3911,7 +3968,8 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(sid, isec->sid,
+	ret = avc_has_perm(current_selinux_ns,
+			   sid, isec->sid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__CREATE_FILES_AS,
 			   NULL);
@@ -3928,7 +3986,8 @@ static int selinux_kernel_module_request(char *kmod_name)
 	ad.type = LSM_AUDIT_DATA_KMOD;
 	ad.u.kmod_name = kmod_name;
 
-	return avc_has_perm(current_sid(), SECINITSID_KERNEL, SECCLASS_SYSTEM,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), SECINITSID_KERNEL, SECCLASS_SYSTEM,
 			    SYSTEM__MODULE_REQUEST, &ad);
 }
 
@@ -3942,7 +4001,8 @@ static int selinux_kernel_module_from_file(struct file *file)
 
 	/* init_module */
 	if (file == NULL)
-		return avc_has_perm(sid, sid, SECCLASS_SYSTEM,
+		return avc_has_perm(current_selinux_ns,
+				    sid, sid, SECCLASS_SYSTEM,
 					SYSTEM__MODULE_LOAD, NULL);
 
 	/* finit_module */
@@ -3952,13 +4012,15 @@ static int selinux_kernel_module_from_file(struct file *file)
 
 	fsec = file->f_security;
 	if (sid != fsec->sid) {
-		rc = avc_has_perm(sid, fsec->sid, SECCLASS_FD, FD__USE, &ad);
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, fsec->sid, SECCLASS_FD, FD__USE, &ad);
 		if (rc)
 			return rc;
 	}
 
 	isec = inode_security(file_inode(file));
-	return avc_has_perm(sid, isec->sid, SECCLASS_SYSTEM,
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, SECCLASS_SYSTEM,
 				SYSTEM__MODULE_LOAD, &ad);
 }
 
@@ -3980,19 +4042,22 @@ static int selinux_kernel_read_file(struct file *file,
 
 static int selinux_task_setpgid(struct task_struct *p, pid_t pgid)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__SETPGID, NULL);
 }
 
 static int selinux_task_getpgid(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__GETPGID, NULL);
 }
 
 static int selinux_task_getsid(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__GETSESSION, NULL);
 }
 
@@ -4003,19 +4068,22 @@ static void selinux_task_getsecid(struct task_struct *p, u32 *secid)
 
 static int selinux_task_setnice(struct task_struct *p, int nice)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__SETSCHED, NULL);
 }
 
 static int selinux_task_setioprio(struct task_struct *p, int ioprio)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__SETSCHED, NULL);
 }
 
 static int selinux_task_getioprio(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__GETSCHED, NULL);
 }
 
@@ -4030,7 +4098,8 @@ int selinux_task_prlimit(const struct cred *cred, const struct cred *tcred,
 		av |= PROCESS__SETRLIMIT;
 	if (flags & LSM_PRLIMIT_READ)
 		av |= PROCESS__GETRLIMIT;
-	return avc_has_perm(cred_sid(cred), cred_sid(tcred),
+	return avc_has_perm(current_selinux_ns,
+			    cred_sid(cred), cred_sid(tcred),
 			    SECCLASS_PROCESS, av, NULL);
 }
 
@@ -4044,7 +4113,8 @@ static int selinux_task_setrlimit(struct task_struct *p, unsigned int resource,
 	   later be used as a safe reset point for the soft limit
 	   upon context transitions.  See selinux_bprm_committing_creds. */
 	if (old_rlim->rlim_max != new_rlim->rlim_max)
-		return avc_has_perm(current_sid(), task_sid(p),
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), task_sid(p),
 				    SECCLASS_PROCESS, PROCESS__SETRLIMIT, NULL);
 
 	return 0;
@@ -4052,19 +4122,22 @@ static int selinux_task_setrlimit(struct task_struct *p, unsigned int resource,
 
 static int selinux_task_setscheduler(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__SETSCHED, NULL);
 }
 
 static int selinux_task_getscheduler(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__GETSCHED, NULL);
 }
 
 static int selinux_task_movememory(struct task_struct *p)
 {
-	return avc_has_perm(current_sid(), task_sid(p), SECCLASS_PROCESS,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), task_sid(p), SECCLASS_PROCESS,
 			    PROCESS__SETSCHED, NULL);
 }
 
@@ -4079,7 +4152,8 @@ static int selinux_task_kill(struct task_struct *p, struct siginfo *info,
 		perm = signal_to_av(sig);
 	if (!secid)
 		secid = current_sid();
-	return avc_has_perm(secid, task_sid(p), SECCLASS_PROCESS, perm, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    secid, task_sid(p), SECCLASS_PROCESS, perm, NULL);
 }
 
 static void selinux_task_to_inode(struct task_struct *p,
@@ -4384,7 +4458,8 @@ static int sock_has_perm(struct sock *sk, u32 perms)
 	ad.u.net = &net;
 	ad.u.net->sk = sk;
 
-	return avc_has_perm(current_sid(), sksec->sid, sksec->sclass, perms,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), sksec->sid, sksec->sclass, perms,
 			    &ad);
 }
 
@@ -4404,7 +4479,8 @@ static int selinux_socket_create(int family, int type,
 	if (rc)
 		return rc;
 
-	return avc_has_perm(tsec->sid, newsid, secclass, SOCKET__CREATE, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    tsec->sid, newsid, secclass, SOCKET__CREATE, NULL);
 }
 
 static int selinux_socket_post_create(struct socket *sock, int family,
@@ -4500,7 +4576,8 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 				ad.u.net = &net;
 				ad.u.net->sport = htons(snum);
 				ad.u.net->family = family;
-				err = avc_has_perm(sksec->sid, sid,
+				err = avc_has_perm(current_selinux_ns,
+						   sksec->sid, sid,
 						   sksec->sclass,
 						   SOCKET__NAME_BIND, &ad);
 				if (err)
@@ -4540,7 +4617,8 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		else
 			ad.u.net->v6info.saddr = addr6->sin6_addr;
 
-		err = avc_has_perm(sksec->sid, sid,
+		err = avc_has_perm(current_selinux_ns,
+				   sksec->sid, sid,
 				   sksec->sclass, node_perm, &ad);
 		if (err)
 			goto out;
@@ -4594,7 +4672,8 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
 		ad.u.net = &net;
 		ad.u.net->dport = htons(snum);
 		ad.u.net->family = sk->sk_family;
-		err = avc_has_perm(sksec->sid, sid, sksec->sclass, perm, &ad);
+		err = avc_has_perm(current_selinux_ns,
+				   sksec->sid, sid, sksec->sclass, perm, &ad);
 		if (err)
 			goto out;
 	}
@@ -4695,7 +4774,8 @@ static int selinux_socket_unix_stream_connect(struct sock *sock,
 	ad.u.net = &net;
 	ad.u.net->sk = other;
 
-	err = avc_has_perm(sksec_sock->sid, sksec_other->sid,
+	err = avc_has_perm(current_selinux_ns,
+			   sksec_sock->sid, sksec_other->sid,
 			   sksec_other->sclass,
 			   UNIX_STREAM_SOCKET__CONNECTTO, &ad);
 	if (err)
@@ -4726,7 +4806,8 @@ static int selinux_socket_unix_may_send(struct socket *sock,
 	ad.u.net = &net;
 	ad.u.net->sk = other->sk;
 
-	return avc_has_perm(ssec->sid, osec->sid, osec->sclass, SOCKET__SENDTO,
+	return avc_has_perm(current_selinux_ns,
+			    ssec->sid, osec->sid, osec->sclass, SOCKET__SENDTO,
 			    &ad);
 }
 
@@ -4741,7 +4822,8 @@ static int selinux_inet_sys_rcv_skb(struct net *ns, int ifindex,
 	err = sel_netif_sid(ns, ifindex, &if_sid);
 	if (err)
 		return err;
-	err = avc_has_perm(peer_sid, if_sid,
+	err = avc_has_perm(current_selinux_ns,
+			   peer_sid, if_sid,
 			   SECCLASS_NETIF, NETIF__INGRESS, ad);
 	if (err)
 		return err;
@@ -4749,7 +4831,8 @@ static int selinux_inet_sys_rcv_skb(struct net *ns, int ifindex,
 	err = sel_netnode_sid(addrp, family, &node_sid);
 	if (err)
 		return err;
-	return avc_has_perm(peer_sid, node_sid,
+	return avc_has_perm(current_selinux_ns,
+			    peer_sid, node_sid,
 			    SECCLASS_NODE, NODE__RECVFROM, ad);
 }
 
@@ -4772,7 +4855,8 @@ static int selinux_sock_rcv_skb_compat(struct sock *sk, struct sk_buff *skb,
 		return err;
 
 	if (selinux_secmark_enabled()) {
-		err = avc_has_perm(sk_sid, skb->secmark, SECCLASS_PACKET,
+		err = avc_has_perm(current_selinux_ns,
+				   sk_sid, skb->secmark, SECCLASS_PACKET,
 				   PACKET__RECV, &ad);
 		if (err)
 			return err;
@@ -4837,7 +4921,8 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
 			selinux_netlbl_err(skb, family, err, 0);
 			return err;
 		}
-		err = avc_has_perm(sk_sid, peer_sid, SECCLASS_PEER,
+		err = avc_has_perm(current_selinux_ns,
+				   sk_sid, peer_sid, SECCLASS_PEER,
 				   PEER__RECV, &ad);
 		if (err) {
 			selinux_netlbl_err(skb, family, err, 0);
@@ -4846,7 +4931,8 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	}
 
 	if (secmark_active) {
-		err = avc_has_perm(sk_sid, skb->secmark, SECCLASS_PACKET,
+		err = avc_has_perm(current_selinux_ns,
+				   sk_sid, skb->secmark, SECCLASS_PACKET,
 				   PACKET__RECV, &ad);
 		if (err)
 			return err;
@@ -5037,7 +5123,9 @@ static int selinux_secmark_relabel_packet(u32 sid)
 	__tsec = current_security();
 	tsid = __tsec->sid;
 
-	return avc_has_perm(tsid, sid, SECCLASS_PACKET, PACKET__RELABELTO, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    tsid, sid, SECCLASS_PACKET, PACKET__RELABELTO,
+			    NULL);
 }
 
 static void selinux_secmark_refcount_inc(void)
@@ -5085,7 +5173,8 @@ static int selinux_tun_dev_create(void)
 	 * connections unlike traditional sockets - check the TUN driver to
 	 * get a better understanding of why this socket is special */
 
-	return avc_has_perm(sid, sid, SECCLASS_TUN_SOCKET, TUN_SOCKET__CREATE,
+	return avc_has_perm(current_selinux_ns,
+			    sid, sid, SECCLASS_TUN_SOCKET, TUN_SOCKET__CREATE,
 			    NULL);
 }
 
@@ -5093,7 +5182,8 @@ static int selinux_tun_dev_attach_queue(void *security)
 {
 	struct tun_security_struct *tunsec = security;
 
-	return avc_has_perm(current_sid(), tunsec->sid, SECCLASS_TUN_SOCKET,
+	return avc_has_perm(current_selinux_ns,
+			    current_sid(), tunsec->sid, SECCLASS_TUN_SOCKET,
 			    TUN_SOCKET__ATTACH_QUEUE, NULL);
 }
 
@@ -5121,11 +5211,13 @@ static int selinux_tun_dev_open(void *security)
 	u32 sid = current_sid();
 	int err;
 
-	err = avc_has_perm(sid, tunsec->sid, SECCLASS_TUN_SOCKET,
+	err = avc_has_perm(current_selinux_ns,
+			   sid, tunsec->sid, SECCLASS_TUN_SOCKET,
 			   TUN_SOCKET__RELABELFROM, NULL);
 	if (err)
 		return err;
-	err = avc_has_perm(sid, sid, SECCLASS_TUN_SOCKET,
+	err = avc_has_perm(current_selinux_ns,
+			   sid, sid, SECCLASS_TUN_SOCKET,
 			   TUN_SOCKET__RELABELTO, NULL);
 	if (err)
 		return err;
@@ -5216,7 +5308,8 @@ static unsigned int selinux_ip_forward(struct sk_buff *skb,
 	}
 
 	if (secmark_active)
-		if (avc_has_perm(peer_sid, skb->secmark,
+		if (avc_has_perm(current_selinux_ns,
+				 peer_sid, skb->secmark,
 				 SECCLASS_PACKET, PACKET__FORWARD_IN, &ad))
 			return NF_DROP;
 
@@ -5328,7 +5421,8 @@ static unsigned int selinux_ip_postroute_compat(struct sk_buff *skb,
 		return NF_DROP;
 
 	if (selinux_secmark_enabled())
-		if (avc_has_perm(sksec->sid, skb->secmark,
+		if (avc_has_perm(current_selinux_ns,
+				 sksec->sid, skb->secmark,
 				 SECCLASS_PACKET, PACKET__SEND, &ad))
 			return NF_DROP_ERR(-ECONNREFUSED);
 
@@ -5451,7 +5545,8 @@ static unsigned int selinux_ip_postroute(struct sk_buff *skb,
 		return NF_DROP;
 
 	if (secmark_active)
-		if (avc_has_perm(peer_sid, skb->secmark,
+		if (avc_has_perm(current_selinux_ns,
+				 peer_sid, skb->secmark,
 				 SECCLASS_PACKET, secmark_perm, &ad))
 			return NF_DROP_ERR(-ECONNREFUSED);
 
@@ -5461,13 +5556,15 @@ static unsigned int selinux_ip_postroute(struct sk_buff *skb,
 
 		if (sel_netif_sid(dev_net(outdev), ifindex, &if_sid))
 			return NF_DROP;
-		if (avc_has_perm(peer_sid, if_sid,
+		if (avc_has_perm(current_selinux_ns,
+				 peer_sid, if_sid,
 				 SECCLASS_NETIF, NETIF__EGRESS, &ad))
 			return NF_DROP_ERR(-ECONNREFUSED);
 
 		if (sel_netnode_sid(addrp, family, &node_sid))
 			return NF_DROP;
-		if (avc_has_perm(peer_sid, node_sid,
+		if (avc_has_perm(current_selinux_ns,
+				 peer_sid, node_sid,
 				 SECCLASS_NODE, NODE__SENDTO, &ad))
 			return NF_DROP_ERR(-ECONNREFUSED);
 	}
@@ -5555,7 +5652,8 @@ static int ipc_has_perm(struct kern_ipc_perm *ipc_perms,
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = ipc_perms->key;
 
-	return avc_has_perm(sid, isec->sid, isec->sclass, perms, &ad);
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, isec->sclass, perms, &ad);
 }
 
 static int selinux_msg_msg_alloc_security(struct msg_msg *msg)
@@ -5585,7 +5683,8 @@ static int selinux_msg_queue_alloc_security(struct msg_queue *msq)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
-	rc = avc_has_perm(sid, isec->sid, SECCLASS_MSGQ,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, SECCLASS_MSGQ,
 			  MSGQ__CREATE, &ad);
 	if (rc) {
 		ipc_free_security(&msq->q_perm);
@@ -5610,7 +5709,8 @@ static int selinux_msg_queue_associate(struct msg_queue *msq, int msqflg)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
-	return avc_has_perm(sid, isec->sid, SECCLASS_MSGQ,
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, SECCLASS_MSGQ,
 			    MSGQ__ASSOCIATE, &ad);
 }
 
@@ -5623,7 +5723,8 @@ static int selinux_msg_queue_msgctl(struct msg_queue *msq, int cmd)
 	case IPC_INFO:
 	case MSG_INFO:
 		/* No specific object, just general system-wide information. */
-		return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), SECINITSID_KERNEL,
 				    SECCLASS_SYSTEM, SYSTEM__IPC_INFO, NULL);
 	case IPC_STAT:
 	case MSG_STAT:
@@ -5672,15 +5773,18 @@ static int selinux_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg,
 	ad.u.ipc_id = msq->q_perm.key;
 
 	/* Can this process write to the queue? */
-	rc = avc_has_perm(sid, isec->sid, SECCLASS_MSGQ,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, SECCLASS_MSGQ,
 			  MSGQ__WRITE, &ad);
 	if (!rc)
 		/* Can this process send the message */
-		rc = avc_has_perm(sid, msec->sid, SECCLASS_MSG,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, msec->sid, SECCLASS_MSG,
 				  MSG__SEND, &ad);
 	if (!rc)
 		/* Can the message be put in the queue? */
-		rc = avc_has_perm(msec->sid, isec->sid, SECCLASS_MSGQ,
+		rc = avc_has_perm(current_selinux_ns,
+				  msec->sid, isec->sid, SECCLASS_MSGQ,
 				  MSGQ__ENQUEUE, &ad);
 
 	return rc;
@@ -5702,10 +5806,12 @@ static int selinux_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
-	rc = avc_has_perm(sid, isec->sid,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid,
 			  SECCLASS_MSGQ, MSGQ__READ, &ad);
 	if (!rc)
-		rc = avc_has_perm(sid, msec->sid,
+		rc = avc_has_perm(current_selinux_ns,
+				  sid, msec->sid,
 				  SECCLASS_MSG, MSG__RECEIVE, &ad);
 	return rc;
 }
@@ -5727,7 +5833,8 @@ static int selinux_shm_alloc_security(struct shmid_kernel *shp)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = shp->shm_perm.key;
 
-	rc = avc_has_perm(sid, isec->sid, SECCLASS_SHM,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, SECCLASS_SHM,
 			  SHM__CREATE, &ad);
 	if (rc) {
 		ipc_free_security(&shp->shm_perm);
@@ -5752,7 +5859,8 @@ static int selinux_shm_associate(struct shmid_kernel *shp, int shmflg)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = shp->shm_perm.key;
 
-	return avc_has_perm(sid, isec->sid, SECCLASS_SHM,
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, SECCLASS_SHM,
 			    SHM__ASSOCIATE, &ad);
 }
 
@@ -5766,7 +5874,8 @@ static int selinux_shm_shmctl(struct shmid_kernel *shp, int cmd)
 	case IPC_INFO:
 	case SHM_INFO:
 		/* No specific object, just general system-wide information. */
-		return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), SECINITSID_KERNEL,
 				    SECCLASS_SYSTEM, SYSTEM__IPC_INFO, NULL);
 	case IPC_STAT:
 	case SHM_STAT:
@@ -5820,7 +5929,8 @@ static int selinux_sem_alloc_security(struct sem_array *sma)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = sma->sem_perm.key;
 
-	rc = avc_has_perm(sid, isec->sid, SECCLASS_SEM,
+	rc = avc_has_perm(current_selinux_ns,
+			  sid, isec->sid, SECCLASS_SEM,
 			  SEM__CREATE, &ad);
 	if (rc) {
 		ipc_free_security(&sma->sem_perm);
@@ -5845,7 +5955,8 @@ static int selinux_sem_associate(struct sem_array *sma, int semflg)
 	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = sma->sem_perm.key;
 
-	return avc_has_perm(sid, isec->sid, SECCLASS_SEM,
+	return avc_has_perm(current_selinux_ns,
+			    sid, isec->sid, SECCLASS_SEM,
 			    SEM__ASSOCIATE, &ad);
 }
 
@@ -5859,7 +5970,8 @@ static int selinux_sem_semctl(struct sem_array *sma, int cmd)
 	case IPC_INFO:
 	case SEM_INFO:
 		/* No specific object, just general system-wide information. */
-		return avc_has_perm(current_sid(), SECINITSID_KERNEL,
+		return avc_has_perm(current_selinux_ns,
+				    current_sid(), SECINITSID_KERNEL,
 				    SECCLASS_SYSTEM, SYSTEM__IPC_INFO, NULL);
 	case GETPID:
 	case GETNCNT:
@@ -5945,7 +6057,8 @@ static int selinux_getprocattr(struct task_struct *p,
 	__tsec = __task_cred(p)->security;
 
 	if (current != p) {
-		error = avc_has_perm(current_sid(), __tsec->sid,
+		error = avc_has_perm(current_selinux_ns,
+				     current_sid(), __tsec->sid,
 				     SECCLASS_PROCESS, PROCESS__GETATTR, NULL);
 		if (error)
 			goto bad;
@@ -5994,19 +6107,24 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 	 * Basic control over ability to set these attributes at all.
 	 */
 	if (!strcmp(name, "exec"))
-		error = avc_has_perm(mysid, mysid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, mysid, SECCLASS_PROCESS,
 				     PROCESS__SETEXEC, NULL);
 	else if (!strcmp(name, "fscreate"))
-		error = avc_has_perm(mysid, mysid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, mysid, SECCLASS_PROCESS,
 				     PROCESS__SETFSCREATE, NULL);
 	else if (!strcmp(name, "keycreate"))
-		error = avc_has_perm(mysid, mysid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, mysid, SECCLASS_PROCESS,
 				     PROCESS__SETKEYCREATE, NULL);
 	else if (!strcmp(name, "sockcreate"))
-		error = avc_has_perm(mysid, mysid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, mysid, SECCLASS_PROCESS,
 				     PROCESS__SETSOCKCREATE, NULL);
 	else if (!strcmp(name, "current"))
-		error = avc_has_perm(mysid, mysid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, mysid, SECCLASS_PROCESS,
 				     PROCESS__SETCURRENT, NULL);
 	else
 		error = -EINVAL;
@@ -6063,7 +6181,8 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 	} else if (!strcmp(name, "fscreate")) {
 		tsec->create_sid = sid;
 	} else if (!strcmp(name, "keycreate")) {
-		error = avc_has_perm(mysid, sid, SECCLASS_KEY, KEY__CREATE,
+		error = avc_has_perm(current_selinux_ns,
+				     mysid, sid, SECCLASS_KEY, KEY__CREATE,
 				     NULL);
 		if (error)
 			goto abort_change;
@@ -6085,7 +6204,8 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 		}
 
 		/* Check permissions for the transition. */
-		error = avc_has_perm(tsec->sid, sid, SECCLASS_PROCESS,
+		error = avc_has_perm(current_selinux_ns,
+				     tsec->sid, sid, SECCLASS_PROCESS,
 				     PROCESS__DYNTRANSITION, NULL);
 		if (error)
 			goto abort_change;
@@ -6094,7 +6214,8 @@ static int selinux_setprocattr(const char *name, void *value, size_t size)
 		   Otherwise, leave SID unchanged and fail. */
 		ptsid = ptrace_parent_sid();
 		if (ptsid != 0) {
-			error = avc_has_perm(ptsid, sid, SECCLASS_PROCESS,
+			error = avc_has_perm(current_selinux_ns,
+					     ptsid, sid, SECCLASS_PROCESS,
 					     PROCESS__PTRACE, NULL);
 			if (error)
 				goto abort_change;
@@ -6220,7 +6341,8 @@ static int selinux_key_permission(key_ref_t key_ref,
 	key = key_ref_to_ptr(key_ref);
 	ksec = key->security;
 
-	return avc_has_perm(sid, ksec->sid, SECCLASS_KEY, perm, NULL);
+	return avc_has_perm(current_selinux_ns,
+			    sid, ksec->sid, SECCLASS_KEY, perm, NULL);
 }
 
 static int selinux_key_getsecurity(struct key *key, char **_buffer)
@@ -6256,7 +6378,8 @@ static int selinux_ib_pkey_access(void *ib_sec, u64 subnet_prefix, u16 pkey_val)
 	ibpkey.subnet_prefix = subnet_prefix;
 	ibpkey.pkey = pkey_val;
 	ad.u.ibpkey = &ibpkey;
-	return avc_has_perm(sec->sid, sid,
+	return avc_has_perm(current_selinux_ns,
+			    sec->sid, sid,
 			    SECCLASS_INFINIBAND_PKEY,
 			    INFINIBAND_PKEY__ACCESS, &ad);
 }
@@ -6280,7 +6403,8 @@ static int selinux_ib_endport_manage_subnet(void *ib_sec, const char *dev_name,
 	strncpy(ibendport.dev_name, dev_name, sizeof(ibendport.dev_name));
 	ibendport.port = port_num;
 	ad.u.ibendport = &ibendport;
-	return avc_has_perm(sec->sid, sid,
+	return avc_has_perm(current_selinux_ns,
+			    sec->sid, sid,
 			    SECCLASS_INFINIBAND_ENDPORT,
 			    INFINIBAND_ENDPORT__MANAGE_SUBNET, &ad);
 }
@@ -6543,12 +6667,17 @@ int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
 	if (rc)
 		goto err;
 
+	rc = selinux_avc_create(&newns->avc);
+	if (rc)
+		goto err;
+
 	if (parent)
 		newns->parent = get_selinux_ns(parent);
 
 	*ns = newns;
 	return 0;
 err:
+	selinux_ss_free(newns->ss);
 	kfree(newns);
 	return rc;
 }
@@ -6561,6 +6690,7 @@ static void selinux_ns_free(struct work_struct *work)
 	do {
 		parent = ns->parent;
 		selinux_ss_free(ns->ss);
+		selinux_avc_free(ns->avc);
 		kfree(ns);
 		ns = parent;
 	} while (ns && refcount_dec_and_test(&ns->count));
diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
index 8fd09f7..cdcd755 100644
--- a/security/selinux/include/avc.h
+++ b/security/selinux/include/avc.h
@@ -51,6 +51,7 @@ struct selinux_audit_data {
 	u32 audited;
 	u32 denied;
 	int result;
+	struct selinux_ns *ns;
 };
 
 /*
@@ -95,7 +96,8 @@ static inline u32 avc_audit_required(u32 requested,
 	return audited;
 }
 
-int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
+int slow_avc_audit(struct selinux_ns *ns,
+		   u32 ssid, u32 tsid, u16 tclass,
 		   u32 requested, u32 audited, u32 denied, int result,
 		   struct common_audit_data *a,
 		   unsigned flags);
@@ -120,7 +122,8 @@ int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
  * be performed under a lock, to allow the lock to be released
  * before calling the auditing code.
  */
-static inline int avc_audit(u32 ssid, u32 tsid,
+static inline int avc_audit(struct selinux_ns *ns,
+			    u32 ssid, u32 tsid,
 			    u16 tclass, u32 requested,
 			    struct av_decision *avd,
 			    int result,
@@ -131,31 +134,35 @@ static inline int avc_audit(u32 ssid, u32 tsid,
 	audited = avc_audit_required(requested, avd, result, 0, &denied);
 	if (likely(!audited))
 		return 0;
-	return slow_avc_audit(ssid, tsid, tclass,
+	return slow_avc_audit(ns, ssid, tsid, tclass,
 			      requested, audited, denied, result,
 			      a, flags);
 }
 
 #define AVC_STRICT 1 /* Ignore permissive mode. */
 #define AVC_EXTENDED_PERMS 2	/* update extended permissions */
-int avc_has_perm_noaudit(u32 ssid, u32 tsid,
+int avc_has_perm_noaudit(struct selinux_ns *ns,
+			 u32 ssid, u32 tsid,
 			 u16 tclass, u32 requested,
 			 unsigned flags,
 			 struct av_decision *avd);
 
-int avc_has_perm(u32 ssid, u32 tsid,
+int avc_has_perm(struct selinux_ns *ns,
+		 u32 ssid, u32 tsid,
 		 u16 tclass, u32 requested,
 		 struct common_audit_data *auditdata);
-int avc_has_perm_flags(u32 ssid, u32 tsid,
+int avc_has_perm_flags(struct selinux_ns *ns,
+		       u32 ssid, u32 tsid,
 		       u16 tclass, u32 requested,
 		       struct common_audit_data *auditdata,
 		       int flags);
 
-int avc_has_extended_perms(u32 ssid, u32 tsid, u16 tclass, u32 requested,
-		u8 driver, u8 perm, struct common_audit_data *ad);
+int avc_has_extended_perms(struct selinux_ns *ns,
+			   u32 ssid, u32 tsid, u16 tclass, u32 requested,
+			   u8 driver, u8 perm, struct common_audit_data *ad);
 
 
-u32 avc_policy_seqno(void);
+u32 avc_policy_seqno(struct selinux_ns *ns);
 
 #define AVC_CALLBACK_GRANT		1
 #define AVC_CALLBACK_TRY_REVOKE		2
@@ -170,8 +177,11 @@ u32 avc_policy_seqno(void);
 int avc_add_callback(int (*callback)(u32 event), u32 events);
 
 /* Exported to selinuxfs */
-int avc_get_hash_stats(char *page);
-extern unsigned int avc_cache_threshold;
+struct selinux_avc;
+int avc_get_hash_stats(struct selinux_avc *avc, char *page);
+unsigned int avc_get_cache_threshold(struct selinux_avc *avc);
+void avc_set_cache_threshold(struct selinux_avc *avc,
+			     unsigned int cache_threshold);
 
 /* Attempt to free avc node cache */
 void avc_disable(void);
diff --git a/security/selinux/include/avc_ss.h b/security/selinux/include/avc_ss.h
index 7fef2fd..e12d35a 100644
--- a/security/selinux/include/avc_ss.h
+++ b/security/selinux/include/avc_ss.h
@@ -8,7 +8,8 @@
 
 #include "flask.h"
 
-int avc_ss_reset(u32 seqno);
+struct selinux_avc;
+int avc_ss_reset(struct selinux_avc *avc, u32 seqno);
 
 /* Class/perm mapping support */
 struct security_class_mapping {
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 429e6f7..77d977c 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -92,6 +92,7 @@ extern char *selinux_policycap_names[__POLICYDB_CAPABILITY_MAX];
 /* limitation of boundary depth  */
 #define POLICYDB_BOUNDS_MAXDEPTH	4
 
+struct selinux_avc;
 struct selinux_ss;
 
 struct selinux_ns {
@@ -104,6 +105,7 @@ struct selinux_ns {
 	bool checkreqprot;
 	bool initialized;
 	bool policycap[__POLICYDB_CAPABILITY_MAX];
+	struct selinux_avc *avc;
 	struct selinux_ss *ss;
 	struct selinux_ns *parent;
 };
@@ -114,6 +116,9 @@ void __put_selinux_ns(struct selinux_ns *ns);
 int selinux_ss_create(struct selinux_ss **ss);
 void selinux_ss_free(struct selinux_ss *ss);
 
+int selinux_avc_create(struct selinux_avc **avc);
+void selinux_avc_free(struct selinux_avc *avc);
+
 static inline void put_selinux_ns(struct selinux_ns *ns)
 {
 	if (ns && refcount_dec_and_test(&ns->count))
diff --git a/security/selinux/netlabel.c b/security/selinux/netlabel.c
index b75ceaa..4931b92 100644
--- a/security/selinux/netlabel.c
+++ b/security/selinux/netlabel.c
@@ -406,7 +406,8 @@ int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec,
 		perm = RAWIP_SOCKET__RECVFROM;
 	}
 
-	rc = avc_has_perm(sksec->sid, nlbl_sid, sksec->sclass, perm, ad);
+	rc = avc_has_perm(current_selinux_ns,
+			  sksec->sid, nlbl_sid, sksec->sclass, perm, ad);
 	if (rc == 0)
 		return 0;
 
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index e29d60e..90424454 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -161,7 +161,8 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 	new_value = !!new_value;
 
 	if (new_value != ns_enforcing(ns)) {
-		length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+		length = avc_has_perm(current_selinux_ns,
+				      current_sid(), SECINITSID_SECURITY,
 				      SECCLASS_SECURITY, SECURITY__SETENFORCE,
 				      NULL);
 		if (length)
@@ -173,7 +174,7 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 			audit_get_sessionid(current));
 		ns_enforcing(ns) = new_value;
 		if (ns_enforcing(ns))
-			avc_ss_reset(0);
+			avc_ss_reset(ns->avc, 0);
 		selnl_notify_setenforce(ns_enforcing(ns));
 		selinux_status_update_setenforce(ns);
 		if (!ns_enforcing(ns))
@@ -378,7 +379,8 @@ static int sel_open_policy(struct inode *inode, struct file *filp)
 
 	mutex_lock(&fsi->mutex);
 
-	rc = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	rc = avc_has_perm(current_selinux_ns,
+			  current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
 	if (rc)
 		goto err;
@@ -442,7 +444,8 @@ static ssize_t sel_read_policy(struct file *filp, char __user *buf,
 
 	mutex_lock(&fsi->mutex);
 
-	ret = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	ret = avc_has_perm(current_selinux_ns,
+			   current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__READ_POLICY, NULL);
 	if (ret)
 		goto out;
@@ -539,7 +542,8 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 
 	mutex_lock(&fsi->mutex);
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__LOAD_POLICY, NULL);
 	if (length)
 		goto out;
@@ -598,7 +602,8 @@ static ssize_t sel_write_context(struct file *file, char *buf, size_t size)
 	u32 sid, len;
 	ssize_t length;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__CHECK_CONTEXT, NULL);
 	if (length)
 		goto out;
@@ -646,7 +651,8 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	ssize_t length;
 	unsigned int new_value;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETCHECKREQPROT,
 			      NULL);
 	if (length)
@@ -691,7 +697,8 @@ static ssize_t sel_write_validatetrans(struct file *file,
 	u16 tclass;
 	int rc;
 
-	rc = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	rc = avc_has_perm(current_selinux_ns,
+			  current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__VALIDATE_TRANS, NULL);
 	if (rc)
 		goto out;
@@ -819,7 +826,8 @@ static ssize_t sel_write_access(struct file *file, char *buf, size_t size)
 	struct av_decision avd;
 	ssize_t length;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__COMPUTE_AV, NULL);
 	if (length)
 		goto out;
@@ -872,7 +880,8 @@ static ssize_t sel_write_create(struct file *file, char *buf, size_t size)
 	u32 len;
 	int nargs;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__COMPUTE_CREATE,
 			      NULL);
 	if (length)
@@ -973,7 +982,8 @@ static ssize_t sel_write_relabel(struct file *file, char *buf, size_t size)
 	char *newcon = NULL;
 	u32 len;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__COMPUTE_RELABEL,
 			      NULL);
 	if (length)
@@ -1033,7 +1043,8 @@ static ssize_t sel_write_user(struct file *file, char *buf, size_t size)
 	int i, rc;
 	u32 len, nsids;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__COMPUTE_USER,
 			      NULL);
 	if (length)
@@ -1097,7 +1108,8 @@ static ssize_t sel_write_member(struct file *file, char *buf, size_t size)
 	char *newcon = NULL;
 	u32 len;
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__COMPUTE_MEMBER,
 			      NULL);
 	if (length)
@@ -1210,7 +1222,8 @@ static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 
 	mutex_lock(&fsi->mutex);
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETBOOL,
 			      NULL);
 	if (length)
@@ -1271,7 +1284,8 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 
 	mutex_lock(&fsi->mutex);
 
-	length = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	length = avc_has_perm(current_selinux_ns,
+			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETBOOL,
 			      NULL);
 	if (length)
@@ -1411,10 +1425,13 @@ static int sel_make_bools(struct selinux_fs_info *fsi)
 static ssize_t sel_read_avc_cache_threshold(struct file *filp, char __user *buf,
 					    size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char tmpbuf[TMPBUFLEN];
 	ssize_t length;
 
-	length = scnprintf(tmpbuf, TMPBUFLEN, "%u", avc_cache_threshold);
+	length = scnprintf(tmpbuf, TMPBUFLEN, "%u",
+			   avc_get_cache_threshold(ns->avc));
 	return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
 }
 
@@ -1423,11 +1440,14 @@ static ssize_t sel_write_avc_cache_threshold(struct file *file,
 					     size_t count, loff_t *ppos)
 
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page;
 	ssize_t ret;
 	unsigned int new_value;
 
-	ret = avc_has_perm(current_sid(), SECINITSID_SECURITY,
+	ret = avc_has_perm(current_selinux_ns,
+			   current_sid(), SECINITSID_SECURITY,
 			   SECCLASS_SECURITY, SECURITY__SETSECPARAM,
 			   NULL);
 	if (ret)
@@ -1448,7 +1468,7 @@ static ssize_t sel_write_avc_cache_threshold(struct file *file,
 	if (sscanf(page, "%u", &new_value) != 1)
 		goto out;
 
-	avc_cache_threshold = new_value;
+	avc_set_cache_threshold(ns->avc, new_value);
 
 	ret = count;
 out:
@@ -1459,6 +1479,8 @@ static ssize_t sel_write_avc_cache_threshold(struct file *file,
 static ssize_t sel_read_avc_hash_stats(struct file *filp, char __user *buf,
 				       size_t count, loff_t *ppos)
 {
+	struct selinux_fs_info *fsi = file_inode(filp)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page;
 	ssize_t length;
 
@@ -1466,7 +1488,7 @@ static ssize_t sel_read_avc_hash_stats(struct file *filp, char __user *buf,
 	if (!page)
 		return -ENOMEM;
 
-	length = avc_get_hash_stats(page);
+	length = avc_get_hash_stats(ns->avc, page);
 	if (length >= 0)
 		length = simple_read_from_buffer(buf, count, ppos, page, length);
 	free_page((unsigned long)page);
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 1e202b0..e1c3881 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2163,7 +2163,7 @@ int security_load_policy(struct selinux_ns *ns, void *data, size_t len)
 		ns->initialized = 1;
 		seqno = ++ns->ss->latest_granting;
 		selinux_complete_init();
-		avc_ss_reset(seqno);
+		avc_ss_reset(ns->avc, seqno);
 		selnl_notify_policyload(seqno);
 		selinux_status_update_policyload(ns, seqno);
 		selinux_netlbl_cache_invalidate();
@@ -2245,7 +2245,7 @@ int security_load_policy(struct selinux_ns *ns, void *data, size_t len)
 	sidtab_destroy(&oldsidtab);
 	kfree(oldmapping);
 
-	avc_ss_reset(seqno);
+	avc_ss_reset(ns->avc, seqno);
 	selnl_notify_policyload(seqno);
 	selinux_status_update_policyload(ns, seqno);
 	selinux_netlbl_cache_invalidate();
@@ -2661,7 +2661,8 @@ int security_get_user_sids(struct selinux_ns *ns,
 	}
 	for (i = 0, j = 0; i < mynel; i++) {
 		struct av_decision dummy_avd;
-		rc = avc_has_perm_noaudit(fromsid, mysids[i],
+		rc = avc_has_perm_noaudit(ns,
+					  fromsid, mysids[i],
 					  SECCLASS_PROCESS, /* kernel value */
 					  PROCESS__TRANSITION, AVC_STRICT,
 					  &dummy_avd);
@@ -2919,7 +2920,7 @@ int security_set_bools(struct selinux_ns *ns, int len, int *values)
 out:
 	write_unlock_irq(&ns->ss->policy_rwlock);
 	if (!rc) {
-		avc_ss_reset(seqno);
+		avc_ss_reset(ns->avc, seqno);
 		selnl_notify_policyload(seqno);
 		selinux_status_update_policyload(ns, seqno);
 		selinux_xfrm_notify_policyload();
diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c
index 410f19a..2f00932 100644
--- a/security/selinux/xfrm.c
+++ b/security/selinux/xfrm.c
@@ -106,7 +106,8 @@ static int selinux_xfrm_alloc_user(struct xfrm_sec_ctx **ctxp,
 	if (rc)
 		goto err;
 
-	rc = avc_has_perm(tsec->sid, ctx->ctx_sid,
+	rc = avc_has_perm(current_selinux_ns,
+			  tsec->sid, ctx->ctx_sid,
 			  SECCLASS_ASSOCIATION, ASSOCIATION__SETCONTEXT, NULL);
 	if (rc)
 		goto err;
@@ -142,7 +143,8 @@ static int selinux_xfrm_delete(struct xfrm_sec_ctx *ctx)
 	if (!ctx)
 		return 0;
 
-	return avc_has_perm(tsec->sid, ctx->ctx_sid,
+	return avc_has_perm(current_selinux_ns,
+			    tsec->sid, ctx->ctx_sid,
 			    SECCLASS_ASSOCIATION, ASSOCIATION__SETCONTEXT,
 			    NULL);
 }
@@ -164,7 +166,8 @@ int selinux_xfrm_policy_lookup(struct xfrm_sec_ctx *ctx, u32 fl_secid, u8 dir)
 	if (!selinux_authorizable_ctx(ctx))
 		return -EINVAL;
 
-	rc = avc_has_perm(fl_secid, ctx->ctx_sid,
+	rc = avc_has_perm(current_selinux_ns,
+			  fl_secid, ctx->ctx_sid,
 			  SECCLASS_ASSOCIATION, ASSOCIATION__POLMATCH, NULL);
 	return (rc == -EACCES ? -ESRCH : rc);
 }
@@ -203,7 +206,8 @@ int selinux_xfrm_state_pol_flow_match(struct xfrm_state *x,
 	/* We don't need a separate SA Vs. policy polmatch check since the SA
 	 * is now of the same label as the flow and a flow Vs. policy polmatch
 	 * check had already happened in selinux_xfrm_policy_lookup() above. */
-	return (avc_has_perm(fl->flowi_secid, state_sid,
+	return (avc_has_perm(current_selinux_ns,
+			     fl->flowi_secid, state_sid,
 			    SECCLASS_ASSOCIATION, ASSOCIATION__SENDTO,
 			    NULL) ? 0 : 1);
 }
@@ -422,7 +426,8 @@ int selinux_xfrm_sock_rcv_skb(u32 sk_sid, struct sk_buff *skb,
 	/* This check even when there's no association involved is intended,
 	 * according to Trent Jaeger, to make sure a process can't engage in
 	 * non-IPsec communication unless explicitly allowed by policy. */
-	return avc_has_perm(sk_sid, peer_sid,
+	return avc_has_perm(current_selinux_ns,
+			    sk_sid, peer_sid,
 			    SECCLASS_ASSOCIATION, ASSOCIATION__RECVFROM, ad);
 }
 
@@ -465,6 +470,6 @@ int selinux_xfrm_postroute_last(u32 sk_sid, struct sk_buff *skb,
 	/* This check even when there's no association involved is intended,
 	 * according to Trent Jaeger, to make sure a process can't engage in
 	 * non-IPsec communication unless explicitly allowed by policy. */
-	return avc_has_perm(sk_sid, SECINITSID_UNLABELED,
+	return avc_has_perm(current_selinux_ns, sk_sid, SECINITSID_UNLABELED,
 			    SECCLASS_ASSOCIATION, ASSOCIATION__SENDTO, ad);
 }
-- 
2.9.5


* [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (2 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 03/10] selinux: move the AVC into the selinux namespace Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-05  5:47   ` Serge E. Hallyn
  2017-10-06  1:07   ` James Morris
  2017-10-02 15:58 ` [RFC 05/10] selinux: support per-task/cred selinux namespace Stephen Smalley
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

The selinux netlink socket is used to notify userspace of changes to
the enforcing mode and policy reloads.  At present, these notifications
are always sent to the initial network namespace.  In order to support
multiple selinux namespaces, each with its own enforcing mode and
policy, we need to create and use a separate selinux netlink socket
for each network namespace.
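
As a rough illustration of the lookup this implies, the userspace model
below routes a notification to the socket owned by the sender's network
namespace instead of to one global socket.  The struct layouts and the
notify_own_netns() helper are hypothetical stand-ins, not the kernel
definitions; only the selnl and net_ns field names are taken from the
patch.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; only the fields this sketch needs. */
struct sock {
	unsigned int delivered;	/* notifications queued on this socket */
};

struct net {
	struct sock *selnl;	/* per-namespace selinux netlink socket */
};

struct task {
	struct net *net_ns;	/* network namespace of the sender */
};

/*
 * Deliver one notification to the socket that belongs to the sender's
 * network namespace.  Returns 0 on success, or -1 if that namespace
 * has no selinux netlink socket.
 */
static int notify_own_netns(const struct task *sender)
{
	struct sock *sk = sender->net_ns->selnl;

	if (sk == NULL)
		return -1;
	sk->delivered++;
	return 0;
}
```

In this model a sender in a child network namespace never touches the
socket of the init network namespace, which is the isolation the change
is after.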

Without this change, a policy reload in a child selinux namespace
causes a notification to be sent to processes in the init namespace
with a sequence number that may be higher than the policy sequence
number of the init namespace itself.  As a result, userspace AVC
instances in the init namespace will end up rejecting any further
access vector results from their own security server instance because
the policy sequence number appears to regress, which in turn causes
all subsequent uncached access checks to fail.  Similarly, without
this change, changing enforcing mode in the child selinux namespace
triggers a notification that causes all userspace AVC instances in the
init namespace to switch their enforcing modes.
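
The sequence number failure can be sketched with a simplified model of
a userspace AVC.  The struct and function names below are hypothetical
and this is not the libselinux implementation; it only assumes that the
cache remembers the highest notified sequence number and refuses
decisions computed under an older one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified userspace AVC state. */
struct uavc {
	uint32_t latest_notif;	/* highest policy seqno seen via netlink */
};

/* Record a policyload notification; the stored seqno only moves forward. */
static void uavc_on_policyload(struct uavc *avc, uint32_t seqno)
{
	if (seqno > avc->latest_notif)
		avc->latest_notif = seqno;
}

/*
 * Accept a decision from the security server only if it was computed
 * under a policy at least as new as the latest notification.
 */
static bool uavc_accept_decision(const struct uavc *avc, uint32_t avd_seqno)
{
	return avd_seqno >= avc->latest_notif;
}
```

If an AVC whose own policy is at seqno 3 receives a stray notification
carrying seqno 7 from another selinux namespace, every later decision
stamped with seqno 3 is refused, because the local security server
never reaches 7 on its own.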

This change does alter SELinux behavior, since previously reloading
policy or changing enforcing mode in a non-init network namespace would
trigger a notification to processes in the init network namespace.
However, this behavior is not being relied upon by existing userspace
AFAICT and is arguably wrong regardless.

This change presumes that one will always unshare the network namespace
when unsharing a new selinux namespace (the reverse is not required).
Otherwise, the same inconsistencies could arise between the notifications
and the relevant policy.  At present, nothing enforces this guarantee
at the kernel level; it is left up to userspace (e.g. container runtimes).
It is an open question as to whether this is a good idea or whether
unsharing of the selinux namespace should automatically unshare the network
namespace.  However, keeping them separate is consistent with the handling
of the mount namespace currently, which also should be unshared so that
a private selinuxfs mount can be created.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 include/net/net_namespace.h |  3 +++
 security/selinux/netlink.c  | 31 +++++++++++++++++++++++++------
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 57faa37..e4dd04a 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -149,6 +149,9 @@ struct net {
 #endif
 	struct sock		*diag_nlsk;
 	atomic_t		fnhe_genid;
+#if IS_ENABLED(CONFIG_SECURITY_SELINUX)
+	struct sock		*selnl;
+#endif
 } __randomize_layout;
 
 #include <linux/seq_file_net.h>
diff --git a/security/selinux/netlink.c b/security/selinux/netlink.c
index 828fb6a..679616b 100644
--- a/security/selinux/netlink.c
+++ b/security/selinux/netlink.c
@@ -22,8 +22,6 @@
 
 #include "security.h"
 
-static struct sock *selnl;
-
 static int selnl_msglen(int msgtype)
 {
 	int ret = 0;
@@ -69,6 +67,7 @@ static void selnl_add_payload(struct nlmsghdr *nlh, int len, int msgtype, void *
 
 static void selnl_notify(int msgtype, void *data)
 {
+	struct sock *selnl = current->nsproxy->net_ns->selnl;
 	int len;
 	sk_buff_data_t tmp;
 	struct sk_buff *skb;
@@ -108,16 +107,36 @@ void selnl_notify_policyload(u32 seqno)
 	selnl_notify(SELNL_MSG_POLICYLOAD, &seqno);
 }
 
-static int __init selnl_init(void)
+static int __net_init selnl_net_init(struct net *net)
 {
+	struct sock *sk;
 	struct netlink_kernel_cfg cfg = {
 		.groups	= SELNLGRP_MAX,
 		.flags	= NL_CFG_F_NONROOT_RECV,
 	};
 
-	selnl = netlink_kernel_create(&init_net, NETLINK_SELINUX, &cfg);
-	if (selnl == NULL)
-		panic("SELinux:  Cannot create netlink socket.");
+	sk = netlink_kernel_create(net, NETLINK_SELINUX, &cfg);
+	if (!sk)
+		return -ENOMEM;
+	net->selnl = sk;
+	return 0;
+}
+
+static void __net_exit selnl_net_exit(struct net *net)
+{
+	netlink_kernel_release(net->selnl);
+	net->selnl = NULL;
+}
+
+static struct pernet_operations selnl_net_ops = {
+	.init = selnl_net_init,
+	.exit = selnl_net_exit,
+};
+
+static int __init selnl_init(void)
+{
+	if (register_pernet_subsys(&selnl_net_ops))
+		panic("Could not register selinux netlink operations\n");
 	return 0;
 }
 
-- 
2.9.5

* [RFC 05/10] selinux: support per-task/cred selinux namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (3 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-06  1:14   ` James Morris
  2017-10-02 15:58 ` [RFC 06/10] selinux: introduce cred_selinux_ns() and use it Stephen Smalley
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Extend the task security structure to include a reference to
the associated selinux namespace, and to also contain a
pointer to the cred in the parent namespace.  current_selinux_ns now
resolves to the selinux namespace of the current task's cred rather
than always to the init selinux namespace.

This change makes it possible to support per-cred selinux namespaces,
but does not yet introduce a mechanism for unsharing of the selinux
namespace.  Thus, by itself, this change does not alter the existing
situation with respect to all processes still using a single init
selinux namespace.

An alternative would be to hang the selinux namespace off of the
user namespace, which itself is associated with the cred.  This
seems undesirable however since DAC and MAC are orthogonal, and
there appear to be real use cases where one will want to use selinux
namespaces without user namespaces and vice versa. However, one
advantage of hanging off the user namespace would be that it is already
associated with other namespaces, such as the network namespace, thus
potentially facilitating looking up the relevant selinux namespace from
the network input/forward hooks.  In most cases however, it appears that
we could instead copy a reference to the creating task selinux namespace
to sock security structures and use that in those hooks.

Introduce a task_security() helper to obtain the correct task/cred
security structure from the hooks, and update the hooks to use it.
This returns a pointer to the security structure for the task in
the same selinux namespace as the caller, or if there is none, a
fake security structure with the well-defined unlabeled SIDs.  This
ensures that we return a valid result that can be used for permission
checks and for returning contexts from e.g. reading /proc/pid/attr files.
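
The lookup performed by task_security() can be sketched in userspace
as follows.  This is only a model under stated assumptions: struct
ns_model, struct tsec_model and lookup_tsec() are hypothetical names,
the parent pointer stands in for parent_cred->security, and
SID_UNLABELED is an arbitrary stand-in value for SECINITSID_UNLABELED.

```c
#include <stddef.h>
#include <stdint.h>

#define SID_UNLABELED 3u	/* arbitrary stand-in for the unlabeled SID */

struct ns_model {
	int id;
};

struct tsec_model {
	uint32_t sid;
	const struct ns_model *ns;		/* namespace this SID belongs to */
	const struct tsec_model *parent;	/* models parent_cred->security */
};

/* Fallback returned when the task has no label in the viewer's namespace. */
static const struct tsec_model unlabeled_tsec = { .sid = SID_UNLABELED };

/*
 * Walk from the task's own security structure up through its parents
 * until one is found that lives in the viewer's namespace.  If none
 * exists, return the unlabeled fallback so callers always get a
 * usable SID.
 */
const struct tsec_model *lookup_tsec(const struct tsec_model *t,
				     const struct ns_model *viewer)
{
	while (t->ns != viewer && t->parent)
		t = t->parent;
	return t->ns == viewer ? t : &unlabeled_tsec;
}
```

With a task in a child namespace whose parent pointer refers to its
label in the init namespace, a viewer in the child namespace sees the
child label, a viewer in the init namespace sees the parent label, and
a viewer in an unrelated namespace sees the unlabeled fallback.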

I am not signing off on this or most subsequent patches as I am not
yet convinced that this is the right approach.

Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c            | 34 ++++++++++++++++++++++++++++++++--
 security/selinux/include/objsec.h   |  9 ---------
 security/selinux/include/security.h | 13 ++++++++++++-
 3 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 25f5147..f9c1e2c 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -213,6 +213,7 @@ static void cred_init_security(void)
 		panic("SELinux:  Failed to initialize initial task.\n");
 
 	tsec->osid = tsec->sid = SECINITSID_KERNEL;
+	tsec->ns = get_selinux_ns(init_selinux_ns);
 	cred->security = tsec;
 }
 
@@ -227,15 +228,35 @@ static inline u32 cred_sid(const struct cred *cred)
 	return tsec->sid;
 }
 
+static struct task_security_struct unlabeled_task_security = {
+	.osid = SECINITSID_UNLABELED,
+	.sid = SECINITSID_UNLABELED,
+};
+
+static const struct task_security_struct *task_security(
+	const struct task_struct *p)
+{
+	const struct task_security_struct *tsec;
+
+	tsec = __task_cred(p)->security;
+	while (tsec->ns != current_selinux_ns && tsec->parent_cred)
+		tsec = tsec->parent_cred->security;
+	if (tsec->ns != current_selinux_ns)
+		return &unlabeled_task_security;
+	return tsec;
+}
+
 /*
  * get the objective security ID of a task
  */
 static inline u32 task_sid(const struct task_struct *task)
 {
+	const struct task_security_struct *tsec;
 	u32 sid;
 
 	rcu_read_lock();
-	sid = cred_sid(__task_cred(task));
+	tsec = task_security(task);
+	sid = tsec->sid;
 	rcu_read_unlock();
 	return sid;
 }
@@ -3900,6 +3921,9 @@ static void selinux_cred_free(struct cred *cred)
 	 */
 	BUG_ON(cred->security && (unsigned long) cred->security < PAGE_SIZE);
 	cred->security = (void *) 0x7UL;
+	put_selinux_ns(tsec->ns);
+	if (tsec->parent_cred)
+		put_cred(tsec->parent_cred);
 	kfree(tsec);
 }
 
@@ -3918,6 +3942,9 @@ static int selinux_cred_prepare(struct cred *new, const struct cred *old,
 	if (!tsec)
 		return -ENOMEM;
 
+	tsec->ns = get_selinux_ns(old_tsec->ns);
+	if (old_tsec->parent_cred)
+		tsec->parent_cred = get_cred(old_tsec->parent_cred);
 	new->security = tsec;
 	return 0;
 }
@@ -3931,6 +3958,9 @@ static void selinux_cred_transfer(struct cred *new, const struct cred *old)
 	struct task_security_struct *tsec = new->security;
 
 	*tsec = *old_tsec;
+	tsec->ns = get_selinux_ns(old_tsec->ns);
+	if (old_tsec->parent_cred)
+		tsec->parent_cred = get_cred(old_tsec->parent_cred);
 }
 
 /*
@@ -6054,7 +6084,7 @@ static int selinux_getprocattr(struct task_struct *p,
 	unsigned len;
 
 	rcu_read_lock();
-	__tsec = __task_cred(p)->security;
+	__tsec = task_security(p);
 
 	if (current != p) {
 		error = avc_has_perm(current_selinux_ns,
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 42d2dbb..051b804 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -29,15 +29,6 @@
 #include "flask.h"
 #include "avc.h"
 
-struct task_security_struct {
-	u32 osid;		/* SID prior to last execve */
-	u32 sid;		/* current SID */
-	u32 exec_sid;		/* exec SID */
-	u32 create_sid;		/* fscreate SID */
-	u32 keycreate_sid;	/* keycreate SID */
-	u32 sockcreate_sid;	/* fscreate SID */
-};
-
 /*
  * get the subjective security ID of the current task
  */
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 77d977c..246f9de 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -133,7 +133,18 @@ static inline struct selinux_ns *get_selinux_ns(struct selinux_ns *ns)
 
 extern struct selinux_ns *init_selinux_ns;
 
-#define current_selinux_ns (init_selinux_ns)
+struct task_security_struct {
+	u32 osid;		/* SID prior to last execve */
+	u32 sid;		/* current SID */
+	u32 exec_sid;		/* exec SID */
+	u32 create_sid;		/* fscreate SID */
+	u32 keycreate_sid;	/* keycreate SID */
+	u32 sockcreate_sid;	/* sockcreate SID */
+	struct selinux_ns *ns;  /* selinux namespace */
+	const struct cred *parent_cred; /* cred in parent ns */
+};
+
+#define current_selinux_ns (((struct task_security_struct *)current_security())->ns)
 
 #define ss_initialized (current_selinux_ns->initialized)
 
-- 
2.9.5

* [RFC 06/10] selinux: introduce cred_selinux_ns() and use it
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (4 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 05/10] selinux: support per-task/cred selinux namespace Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-02 15:58 ` [RFC 07/10] selinux: support per-namespace inode security structures Stephen Smalley
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

When using the SID from a cred, we should pass the selinux
namespace associated with the cred on security server calls
rather than the current selinux namespace, since they could differ.
In some of these cases, the cred is always obtained from the current
task so there is no real change, but this is cleaner and hopefully
less fragile. In other cases, the cred could in fact differ.
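
A small userspace sketch of why the namespace must come from the cred:
it assumes that each selinux namespace has its own SID table, so the
same SID number can mean different contexts in different namespaces.
All names here (struct ns_model, struct cred_model,
ns_sid_to_context(), cred_context()) are hypothetical and the contexts
are example strings only.

```c
#define NS_MAX_SIDS 8

/* Hypothetical per-namespace SID table; the array index is the SID. */
struct ns_model {
	const char *contexts[NS_MAX_SIDS];
};

/* Hypothetical cred: a SID plus the namespace that SID belongs to. */
struct cred_model {
	unsigned int sid;
	const struct ns_model *ns;
};

/* Resolve a SID in an explicitly chosen namespace. */
const char *ns_sid_to_context(const struct ns_model *ns, unsigned int sid)
{
	if (sid >= NS_MAX_SIDS || !ns->contexts[sid])
		return "unlabeled";
	return ns->contexts[sid];
}

/* Resolve a cred's SID in the cred's own namespace. */
const char *cred_context(const struct cred_model *cred)
{
	return ns_sid_to_context(cred->ns, cred->sid);
}
```

In this model, resolving a cred's SID through the caller's namespace
instead of cred->ns can silently yield a different context whenever
the two namespaces differ.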

Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c            | 42 ++++++++++++++++++-------------------
 security/selinux/include/security.h |  2 ++
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index f9c1e2c..efe8083 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -483,13 +483,13 @@ static int may_context_mount_sb_relabel(u32 sid,
 	const struct task_security_struct *tsec = cred->security;
 	int rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELTO, NULL);
 	return rc;
@@ -501,13 +501,13 @@ static int may_context_mount_inode_relabel(u32 sid,
 {
 	const struct task_security_struct *tsec = cred->security;
 	int rc;
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__ASSOCIATE, NULL);
 	return rc;
@@ -1793,10 +1793,10 @@ static int cred_has_capability(const struct cred *cred,
 		return -EINVAL;
 	}
 
-	rc = avc_has_perm_noaudit(current_selinux_ns,
+	rc = avc_has_perm_noaudit(cred_selinux_ns(cred),
 				  sid, sid, sclass, av, 0, &avd);
 	if (audit == SECURITY_CAP_AUDIT) {
-		int rc2 = avc_audit(current_selinux_ns,
+		int rc2 = avc_audit(cred_selinux_ns(cred),
 				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
 			return rc2;
@@ -1823,7 +1823,7 @@ static int inode_has_perm(const struct cred *cred,
 	sid = cred_sid(cred);
 	isec = inode->i_security;
 
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, isec->sid, isec->sclass, perms, adp);
 }
 
@@ -1893,7 +1893,7 @@ static int file_has_perm(const struct cred *cred,
 	ad.u.file = file;
 
 	if (sid != fsec->sid) {
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  sid, fsec->sid,
 				  SECCLASS_FD,
 				  FD__USE,
@@ -2101,7 +2101,7 @@ static int superblock_has_perm(const struct cred *cred,
 	u32 sid = cred_sid(cred);
 
 	sbsec = sb->s_security;
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
 }
 
@@ -2282,7 +2282,7 @@ static int selinux_capset(struct cred *new, const struct cred *old,
 			  const kernel_cap_t *inheritable,
 			  const kernel_cap_t *permitted)
 {
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(old),
 			    cred_sid(old), cred_sid(new), SECCLASS_PROCESS,
 			    PROCESS__SETCAP, NULL);
 }
@@ -3122,7 +3122,7 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	return avc_has_perm_flags(current_selinux_ns,
+	return avc_has_perm_flags(cred_selinux_ns(cred),
 				  sid, isec->sid, isec->sclass, FILE__READ, &ad,
 				  rcu ? MAY_NOT_BLOCK : 0);
 }
@@ -3178,7 +3178,7 @@ static int selinux_inode_permission(struct inode *inode, int mask)
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	rc = avc_has_perm_noaudit(current_selinux_ns,
+	rc = avc_has_perm_noaudit(cred_selinux_ns(cred),
 				  sid, isec->sid, isec->sclass, perms, 0, &avd);
 	audited = avc_audit_required(perms, &avd, rc,
 				     from_access ? FILE__AUDIT_ACCESS : 0,
@@ -3584,7 +3584,7 @@ static int ioctl_has_perm(const struct cred *cred, struct file *file,
 	ad.u.op->path = file->f_path;
 
 	if (ssid != fsec->sid) {
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  ssid, fsec->sid,
 				SECCLASS_FD,
 				FD__USE,
@@ -3667,7 +3667,7 @@ static int file_map_prot_check(struct file *file, unsigned long prot, int shared
 		 * private file mapping that will also be writable.
 		 * This has an additional check.
 		 */
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  sid, sid, SECCLASS_PROCESS,
 				  PROCESS__EXECMEM, NULL);
 		if (rc)
@@ -3743,14 +3743,14 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 		int rc = 0;
 		if (vma->vm_start >= vma->vm_mm->start_brk &&
 		    vma->vm_end <= vma->vm_mm->brk) {
-			rc = avc_has_perm(current_selinux_ns,
+			rc = avc_has_perm(cred_selinux_ns(cred),
 					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
 		} else if (!vma->vm_file &&
 			   ((vma->vm_start <= vma->vm_mm->start_stack &&
 			     vma->vm_end >= vma->vm_mm->start_stack) ||
 			    vma_is_stack_for_current(vma))) {
-			rc = avc_has_perm(current_selinux_ns,
+			rc = avc_has_perm(cred_selinux_ns(cred),
 					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);
 		} else if (vma->vm_file && vma->anon_vma) {
@@ -3870,7 +3870,7 @@ static int selinux_file_open(struct file *file, const struct cred *cred)
 	 * struct as its SID.
 	 */
 	fsec->isid = isec->sid;
-	fsec->pseqno = avc_policy_seqno(current_selinux_ns);
+	fsec->pseqno = avc_policy_seqno(cred_selinux_ns(cred));
 	/*
 	 * Since the inode label or policy seqno may have changed
 	 * between the selinux_inode_permission check and the saving
@@ -3973,7 +3973,7 @@ static int selinux_kernel_act_as(struct cred *new, u32 secid)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(current_selinux_ns,
+	ret = avc_has_perm(tsec->ns,
 			   sid, secid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__USE_AS_OVERRIDE,
@@ -3998,7 +3998,7 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(current_selinux_ns,
+	ret = avc_has_perm(tsec->ns,
 			   sid, isec->sid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__CREATE_FILES_AS,
@@ -4128,7 +4128,7 @@ int selinux_task_prlimit(const struct cred *cred, const struct cred *tcred,
 		av |= PROCESS__SETRLIMIT;
 	if (flags & LSM_PRLIMIT_READ)
 		av |= PROCESS__GETRLIMIT;
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    cred_sid(cred), cred_sid(tcred),
 			    SECCLASS_PROCESS, av, NULL);
 }
@@ -6371,7 +6371,7 @@ static int selinux_key_permission(key_ref_t key_ref,
 	key = key_ref_to_ptr(key_ref);
 	ksec = key->security;
 
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, ksec->sid, SECCLASS_KEY, perm, NULL);
 }
 
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 246f9de..005d65c 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -146,6 +146,8 @@ struct task_security_struct {
 
 #define current_selinux_ns (((struct task_security_struct *)current_security())->ns)
 
+#define cred_selinux_ns(cred) (((struct task_security_struct *)(cred)->security)->ns)
+
 #define ss_initialized (current_selinux_ns->initialized)
 
 #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
-- 
2.9.5

* [RFC 07/10] selinux: support per-namespace inode security structures
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (5 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 06/10] selinux: introduce cred_selinux_ns() and use it Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-02 15:58 ` [RFC 08/10] selinux: support per-namespace superblock " Stephen Smalley
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Extend the inode security structure to include a reference to the
associated selinux namespace, and turn it into a list so that we
can maintain per-inode security state for each namespace.  This is
necessary since the inode SIDs are per-namespace and multiple namespaces
may access the same inodes.

Introduce a find_isec() helper to find the correct inode security
structure for the current selinux namespace, creating one if one
does not already exist.  Update the existing inode_security*()
helpers to use this helper, to pass the per-namespace inode security
structure for initialization, and to return the resulting inode security
structure to the caller.  Replace direct references to inode->i_security
with the appropriate helper or to use the returned result.
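
The lookup-or-create behaviour of find_isec() can be modelled in
userspace as below.  This is a sketch only: the names are hypothetical,
a singly linked list replaces the kernel list_head, and all locking is
omitted, so it does not show the recheck under isec->lock that the
kernel code needs.

```c
#include <stdlib.h>

struct ns_model {
	int id;
};

/* Hypothetical per-namespace security state for one inode. */
struct isec_model {
	const struct ns_model *ns;
	unsigned int sid;		/* 0 models "not yet labeled" */
	struct isec_model *next;	/* entries for other namespaces */
};

struct inode_model {
	struct isec_model *isec;	/* models inode->i_security */
};

/*
 * Return the entry for @ns, creating and appending one if the inode
 * has not been seen from that namespace yet.  Returns NULL only if
 * the allocation fails.
 */
struct isec_model *find_isec_model(struct inode_model *inode,
				   const struct ns_model *ns)
{
	struct isec_model **pp, *new;

	for (pp = &inode->isec; *pp; pp = &(*pp)->next)
		if ((*pp)->ns == ns)
			return *pp;

	new = calloc(1, sizeof(*new));
	if (!new)
		return NULL;
	new->ns = ns;
	*pp = new;	/* append at the end of the list */
	return new;
}
```

Each namespace that touches the inode gets its own entry, so labeling
the inode in one namespace leaves the entry seen by another namespace
untouched.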

This change is problematic in several respects, so it is unclear
whether it will survive in the final implementation.  Some of the
issues are:
1) The inode security structures pin the selinux namespace in memory
for all namespaces that ever access the inode, preventing timely
(or possibly ever) freeing of the namespace.

2) Not everything in the inode security structure needs to be replicated
per namespace, so this is wasteful and potentially confusing.  Examples
are the inode back pointer; the rcu head, although that overlaps with the
list, which does need to be per-namespace; and sclass, unless we
anticipate policy-driven file class assignments in the future as we
already have for sockets.

3) It is not safe to sleep from all callers of inode_security*(), and
thus we cannot always allocate a security blob for the namespace or
fetch the xattr on demand.  Thus, we could encounter a memory allocation
failure or fail to fetch the xattr and map it to a SID in the caller's
namespace.  This is not currently handled safely!

4) We only support a single security.selinux xattr, which must be mappable
to a SID in every namespace that accesses the inodes (or it will be mapped
to the unlabeled SID, which must then be accessible if we wish to permit
access from that namespace).

5) We do not yet properly handle setxattr of security.selinux; at present,
it will modify the on-disk xattr but will only update the in-core SID for
the current namespace and could leave other namespaces out of sync until
the inode is evicted and refetched.

Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c          | 137 +++++++++++++++++++++++++++-----------
 security/selinux/include/objsec.h |   4 ++
 security/selinux/selinuxfs.c      |   2 +-
 3 files changed, 104 insertions(+), 39 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index efe8083..8a52e71 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -263,28 +263,83 @@ static inline u32 task_sid(const struct task_struct *task)
 
 /* Allocate and free functions for each kind of security blob. */
 
-static int inode_alloc_security(struct inode *inode)
+static struct inode_security_struct *isec_alloc(bool may_sleep)
 {
 	struct inode_security_struct *isec;
 	u32 sid = current_sid();
+	gfp_t flags = may_sleep ? GFP_NOFS : GFP_NOWAIT;
 
-	isec = kmem_cache_zalloc(sel_inode_cache, GFP_NOFS);
+	isec = kmem_cache_zalloc(sel_inode_cache, flags);
 	if (!isec)
-		return -ENOMEM;
+		return NULL;
 
 	spin_lock_init(&isec->lock);
 	INIT_LIST_HEAD(&isec->list);
-	isec->inode = inode;
 	isec->sid = SECINITSID_UNLABELED;
 	isec->sclass = SECCLASS_FILE;
 	isec->task_sid = sid;
 	isec->initialized = LABEL_INVALID;
-	inode->i_security = isec;
+	isec->ns = get_selinux_ns(current_selinux_ns);
+	INIT_LIST_HEAD(&isec->isec_list);
+	return isec;
+}
+
+static int inode_alloc_security(struct inode *inode)
+{
+	struct inode_security_struct *isec = isec_alloc(true);
+
+	if (!isec)
+		return -ENOMEM;
 
+	isec->inode = inode;
+	inode->i_security = isec;
 	return 0;
 }
 
-static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dentry);
+static int inode_doinit_with_dentry(struct inode *inode,
+				    struct dentry *opt_dentry,
+				    struct inode_security_struct *isec);
+
+static struct inode_security_struct *find_isec(struct inode *inode,
+					       bool may_sleep)
+{
+	struct inode_security_struct *isec = inode->i_security;
+	struct inode_security_struct *cur, *new;
+
+	if (isec->ns == current_selinux_ns)
+		return isec;
+
+	spin_lock(&isec->lock);
+
+	list_for_each_entry(cur, &isec->isec_list, isec_list) {
+		if (cur->ns == current_selinux_ns)
+			goto out;
+	}
+
+	spin_unlock(&isec->lock);
+
+	new = isec_alloc(may_sleep);
+	if (!new) {
+		/* Lock already dropped above; must not jump to out. */
+		return NULL;
+	}
+	new->inode = inode;
+
+	spin_lock(&isec->lock);
+
+	list_for_each_entry(cur, &isec->isec_list, isec_list) {
+		if (cur->ns == current_selinux_ns) {
+			kmem_cache_free(sel_inode_cache, new);
+			goto out;
+		}
+	}
+
+	list_add(&new->isec_list, &isec->isec_list);
+	cur = new;
+out:
+	spin_unlock(&isec->lock);
+	return cur;
+}
 
 /*
  * Try reloading inode security labels that have been marked as invalid.  The
@@ -293,57 +348,52 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
  * invalid.  The @opt_dentry parameter should be set to a dentry of the inode;
  * when no dentry is available, set it to NULL instead.
  */
-static int __inode_security_revalidate(struct inode *inode,
-				       struct dentry *opt_dentry,
-				       bool may_sleep)
+static struct inode_security_struct *
+__inode_security_revalidate(struct inode *inode,
+			    struct dentry *opt_dentry,
+			    bool may_sleep)
 {
-	struct inode_security_struct *isec = inode->i_security;
+	struct inode_security_struct *isec = find_isec(inode, may_sleep);
 
 	might_sleep_if(may_sleep);
 
 	if (ss_initialized && isec->initialized != LABEL_INITIALIZED) {
 		if (!may_sleep)
-			return -ECHILD;
+			return ERR_PTR(-ECHILD);
 
 		/*
 		 * Try reloading the inode security label.  This will fail if
 		 * @opt_dentry is NULL and no dentry for this inode can be
 		 * found; in that case, continue using the old label.
 		 */
-		inode_doinit_with_dentry(inode, opt_dentry);
+		inode_doinit_with_dentry(inode, opt_dentry, isec);
 	}
-	return 0;
+	return isec;
 }
 
 static struct inode_security_struct *inode_security_novalidate(struct inode *inode)
 {
-	return inode->i_security;
+	return find_isec(inode, false);
 }
 
 static struct inode_security_struct *inode_security_rcu(struct inode *inode, bool rcu)
 {
-	int error;
-
-	error = __inode_security_revalidate(inode, NULL, !rcu);
-	if (error)
-		return ERR_PTR(error);
-	return inode->i_security;
+	return __inode_security_revalidate(inode, NULL, !rcu);
 }
 
 /*
  * Get the security label of an inode.
  */
-static struct inode_security_struct *inode_security(struct inode *inode)
+struct inode_security_struct *inode_security(struct inode *inode)
 {
-	__inode_security_revalidate(inode, NULL, true);
-	return inode->i_security;
+	return __inode_security_revalidate(inode, NULL, true);
 }
 
 static struct inode_security_struct *backing_inode_security_novalidate(struct dentry *dentry)
 {
 	struct inode *inode = d_backing_inode(dentry);
 
-	return inode->i_security;
+	return find_isec(inode, false);
 }
 
 /*
@@ -353,15 +403,22 @@ static struct inode_security_struct *backing_inode_security(struct dentry *dentr
 {
 	struct inode *inode = d_backing_inode(dentry);
 
-	__inode_security_revalidate(inode, dentry, true);
-	return inode->i_security;
+	return __inode_security_revalidate(inode, dentry, true);
 }
 
 static void inode_free_rcu(struct rcu_head *head)
 {
-	struct inode_security_struct *isec;
+	struct inode_security_struct *isec, *entry, *tmp;
 
 	isec = container_of(head, struct inode_security_struct, rcu);
+
+	list_for_each_entry_safe(entry, tmp, &isec->isec_list, isec_list) {
+		put_selinux_ns(entry->ns);
+		kmem_cache_free(sel_inode_cache, entry);
+	}
+
+	put_selinux_ns(isec->ns);
+
 	kmem_cache_free(sel_inode_cache, isec);
 }
 
@@ -450,7 +507,7 @@ static void superblock_free_security(struct super_block *sb)
 
 static inline int inode_doinit(struct inode *inode)
 {
-	return inode_doinit_with_dentry(inode, NULL);
+	return inode_doinit_with_dentry(inode, NULL, find_isec(inode, true));
 }
 
 enum {
@@ -579,7 +636,8 @@ static int sb_finish_set_opts(struct super_block *sb)
 		sbsec->flags &= ~SBLABEL_MNT;
 
 	/* Initialize the root inode. */
-	rc = inode_doinit_with_dentry(root_inode, root);
+	rc = inode_doinit_with_dentry(root_inode, root, find_isec(root_inode,
+								  true));
 
 	/* Initialize any other inodes associated with the superblock, e.g.
 	   inodes created prior to initial policy load or inodes created
@@ -1529,10 +1587,11 @@ static int selinux_genfs_get_sid(struct dentry *dentry,
 }
 
 /* The inode's security attributes must be initialized before first use. */
-static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dentry)
+static int inode_doinit_with_dentry(struct inode *inode,
+				    struct dentry *opt_dentry,
+				    struct inode_security_struct *isec)
 {
 	struct superblock_security_struct *sbsec = NULL;
-	struct inode_security_struct *isec = inode->i_security;
 	u32 task_sid, sid = 0;
 	u16 sclass;
 	struct dentry *dentry;
@@ -1552,7 +1611,8 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
 		isec->sclass = inode_mode_to_security_class(inode->i_mode);
 
 	sbsec = inode->i_sb->s_security;
-	if (!(sbsec->flags & SE_SBINITIALIZED)) {
+	if (!current_selinux_ns->initialized ||
+	    !(sbsec->flags & SE_SBINITIALIZED)) {
 		/* Defer initialization until selinux_complete_init,
 		   after the initial policy is loaded and the security
 		   server is ready to handle calls. */
@@ -1821,7 +1881,7 @@ static int inode_has_perm(const struct cred *cred,
 		return 0;
 
 	sid = cred_sid(cred);
-	isec = inode->i_security;
+	isec = inode_security_novalidate(inode);
 
 	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, isec->sid, isec->sclass, perms, adp);
@@ -3033,7 +3093,8 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 
 	/* Possibly defer initialization to selinux_complete_init. */
 	if (sbsec->flags & SE_SBINITIALIZED) {
-		struct inode_security_struct *isec = inode->i_security;
+		struct inode_security_struct *isec =
+			inode_security_novalidate(inode);
 		isec->sclass = inode_mode_to_security_class(inode->i_mode);
 		isec->sid = newsid;
 		isec->initialized = LABEL_INITIALIZED;
@@ -3133,7 +3194,7 @@ static noinline int audit_inode_permission(struct inode *inode,
 					   unsigned flags)
 {
 	struct common_audit_data ad;
-	struct inode_security_struct *isec = inode->i_security;
+	struct inode_security_struct *isec = inode_security_novalidate(inode);
 	int rc;
 
 	ad.type = LSM_AUDIT_DATA_INODE;
@@ -4189,7 +4250,7 @@ static int selinux_task_kill(struct task_struct *p, struct siginfo *info,
 static void selinux_task_to_inode(struct task_struct *p,
 				  struct inode *inode)
 {
-	struct inode_security_struct *isec = inode->i_security;
+	struct inode_security_struct *isec = inode_security(inode);
 	u32 sid = task_sid(p);
 
 	spin_lock(&isec->lock);
@@ -6072,7 +6133,7 @@ static void selinux_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid)
 static void selinux_d_instantiate(struct dentry *dentry, struct inode *inode)
 {
 	if (inode)
-		inode_doinit_with_dentry(inode, dentry);
+		inode_doinit_with_dentry(inode, dentry, find_isec(inode, true));
 }
 
 static int selinux_getprocattr(struct task_struct *p,
@@ -6289,7 +6350,7 @@ static void selinux_release_secctx(char *secdata, u32 seclen)
 
 static void selinux_inode_invalidate_secctx(struct inode *inode)
 {
-	struct inode_security_struct *isec = inode->i_security;
+	struct inode_security_struct *isec = inode_security_novalidate(inode);
 
 	spin_lock(&isec->lock);
 	isec->initialized = LABEL_INVALID;
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 051b804..04514ee 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -56,6 +56,8 @@ struct inode_security_struct {
 	u16 sclass;		/* security class of this object */
 	unsigned char initialized;	/* initialization flag */
 	spinlock_t lock;
+	struct selinux_ns *ns;
+	struct list_head isec_list;
 };
 
 struct file_security_struct {
@@ -141,4 +143,6 @@ struct pkey_security_struct {
 	u32	sid;	/* SID of pkey */
 };
 
+struct inode_security_struct *inode_security(struct inode *inode);
+
 #endif /* _SELINUX_OBJSEC_H_ */
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 90424454..a7e6bdb 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -1387,7 +1387,7 @@ static int sel_make_bools(struct selinux_fs_info *fsi)
 		if (len >= PAGE_SIZE)
 			goto out;
 
-		isec = (struct inode_security_struct *)inode->i_security;
+		isec = inode_security(inode);
 		ret = security_genfs_sid(fsi->ns, "selinuxfs", page,
 					 SECCLASS_FILE, &sid);
 		if (ret) {
-- 
2.9.5

* [RFC 08/10] selinux: support per-namespace superblock security structures
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (6 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 07/10] selinux: support per-namespace inode security structures Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-02 15:58 ` [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
  2017-10-02 15:58 ` [RFC 10/10] selinuxfs: restrict write operations to the same " Stephen Smalley
  9 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Extend the superblock security structure to include a reference
to the associated selinux namespace, and turn it into a list so
that we can maintain per-superblock security state for each namespace.
This is necessary because the superblock SIDs and labeling behavior
are per selinux namespace.  It further enables one to context-mount
a filesystem with a particular context in one namespace while using
xattrs in another, e.g. one might context mount a container filesystem
in the init selinux namespace to provide MCS-style isolation of the
containers while using per-file xattrs within the container to support
conventional SELinux targeted policy.

Introduce a superblock_security() helper to return the superblock
security blob for the current selinux namespace and replace direct uses
of sb->s_security with calls to it.
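
As an illustration only (not code from this patch), the lookup-or-create
pattern behind such a helper can be sketched in userspace C.  All names
below (struct ns, struct sbsec, sbsec_new, sbsec_lookup) are hypothetical
stand-ins, and a pthread mutex replaces the kernel spinlock.  The sketch
allocates outside the lock, re-checks the list after re-taking it, and
drops the namespace reference if another thread won the race:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the patch's code. */
struct ns {
	atomic_int refcount;
};

struct sbsec {
	struct ns *ns;		/* namespace that owns this entry */
	struct sbsec *next;	/* other namespaces' entries (used on head) */
	pthread_mutex_t lock;	/* protects the ->next chain (used on head) */
};

/* Allocate an entry for @ns and take a reference on the namespace. */
static struct sbsec *sbsec_new(struct ns *ns)
{
	struct sbsec *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->ns = ns;
	atomic_fetch_add(&ns->refcount, 1);
	pthread_mutex_init(&s->lock, NULL);
	return s;
}

/* Caller must hold head->lock. */
static struct sbsec *find_locked(struct sbsec *head, const struct ns *ns)
{
	for (struct sbsec *cur = head->next; cur; cur = cur->next)
		if (cur->ns == ns)
			return cur;
	return NULL;
}

/* Return the entry for @ns, creating it on first use; NULL if out of memory. */
static struct sbsec *sbsec_lookup(struct sbsec *head, struct ns *ns)
{
	struct sbsec *cur, *new;

	if (head->ns == ns)
		return head;	/* fast path: the creating namespace */

	pthread_mutex_lock(&head->lock);
	cur = find_locked(head, ns);
	pthread_mutex_unlock(&head->lock);
	if (cur)
		return cur;

	new = sbsec_new(ns);	/* allocate without holding the lock */
	if (!new)
		return NULL;

	pthread_mutex_lock(&head->lock);
	cur = find_locked(head, ns);
	if (cur) {
		/* Lost a race: drop our copy and its namespace reference. */
		atomic_fetch_sub(&ns->refcount, 1);
		pthread_mutex_destroy(&new->lock);
		free(new);
	} else {
		new->next = head->next;
		head->next = new;
		cur = new;
	}
	pthread_mutex_unlock(&head->lock);
	return cur;
}
```

In this model a caller must still check for a NULL return before
dereferencing the result.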

Also revert the changes made by
commit a64c54cf0811b8032fdab8c9d52576f0370837fa ("SELinux: pass a
superblock to security_fs_use") so that access to the superblock
security structure is properly encapsulated and we can support
per-namespace structures.

This change has problems similar to those of the inode security structure
change; see the list of issues in that commit.

Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c            | 109 ++++++++++++++++++++++++++++--------
 security/selinux/include/objsec.h   |   5 +-
 security/selinux/include/security.h |   3 +-
 security/selinux/ss/services.c      |  19 ++++---
 4 files changed, 102 insertions(+), 34 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 8a52e71..3daad14 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -478,33 +478,90 @@ static void file_free_security(struct file *file)
 	kmem_cache_free(file_security_cache, fsec);
 }
 
-static int superblock_alloc_security(struct super_block *sb)
+static struct superblock_security_struct *sbsec_alloc(
+	const struct super_block *sb)
 {
 	struct superblock_security_struct *sbsec;
 
-	sbsec = kzalloc(sizeof(struct superblock_security_struct), GFP_KERNEL);
+	sbsec = kzalloc(sizeof(struct superblock_security_struct), GFP_NOFS);
 	if (!sbsec)
-		return -ENOMEM;
+		return NULL;
 
 	mutex_init(&sbsec->lock);
 	INIT_LIST_HEAD(&sbsec->isec_head);
 	spin_lock_init(&sbsec->isec_lock);
-	sbsec->sb = sb;
 	sbsec->sid = SECINITSID_UNLABELED;
 	sbsec->def_sid = SECINITSID_FILE;
 	sbsec->mntpoint_sid = SECINITSID_UNLABELED;
-	sb->s_security = sbsec;
+	sbsec->sb = sb;
+	sbsec->ns = get_selinux_ns(current_selinux_ns);
+	INIT_LIST_HEAD(&sbsec->sbsec_list);
+	spin_lock_init(&sbsec->sbsec_list_lock);
+	return sbsec;
+}
+
+static int superblock_alloc_security(struct super_block *sb)
+{
+	struct superblock_security_struct *sbsec = sbsec_alloc(sb);
+
+	if (!sbsec)
+		return -ENOMEM;
 
+	sb->s_security = sbsec;
 	return 0;
 }
 
 static void superblock_free_security(struct super_block *sb)
 {
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = sb->s_security, *entry, *tmp;
 	sb->s_security = NULL;
+	put_selinux_ns(sbsec->ns);
+	list_for_each_entry_safe(entry, tmp, &sbsec->sbsec_list, sbsec_list) {
+		put_selinux_ns(entry->ns);
+		kfree(entry);
+	}
 	kfree(sbsec);
 }
 
+static struct superblock_security_struct *superblock_security(
+	const struct super_block *sb)
+{
+	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *cur, *new;
+
+	if (sbsec->ns == current_selinux_ns)
+		return sbsec;
+
+	spin_lock(&sbsec->sbsec_list_lock);
+
+	list_for_each_entry(cur, &sbsec->sbsec_list, sbsec_list) {
+		if (cur->ns == current_selinux_ns)
+			goto out;
+	}
+
+	spin_unlock(&sbsec->sbsec_list_lock);
+
+	new = sbsec_alloc(sb);
+	if (!new)
+		return NULL;
+
+	spin_lock(&sbsec->sbsec_list_lock);
+
+	list_for_each_entry(cur, &sbsec->sbsec_list, sbsec_list) {
+		if (cur->ns == current_selinux_ns) {
+			put_selinux_ns(new->ns);
+			kfree(new);
+			goto out;
+		}
+	}
+
+	list_add(&new->sbsec_list, &sbsec->sbsec_list);
+	cur = new;
+out:
+	spin_unlock(&sbsec->sbsec_list_lock);
+	return cur;
+}
+
 static inline int inode_doinit(struct inode *inode)
 {
 	return inode_doinit_with_dentry(inode, NULL, find_isec(inode, true));
@@ -572,7 +629,7 @@ static int may_context_mount_inode_relabel(u32 sid,
 
 static int selinux_is_sblabel_mnt(struct super_block *sb)
 {
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = superblock_security(sb);
 
 	return sbsec->behavior == SECURITY_FS_USE_XATTR ||
 		sbsec->behavior == SECURITY_FS_USE_TRANS ||
@@ -591,7 +648,7 @@ static int selinux_is_sblabel_mnt(struct super_block *sb)
 
 static int sb_finish_set_opts(struct super_block *sb)
 {
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = superblock_security(sb);
 	struct dentry *root = sb->s_root;
 	struct inode *root_inode = d_backing_inode(root);
 	int rc = 0;
@@ -675,7 +732,7 @@ static int selinux_get_mnt_opts(const struct super_block *sb,
 				struct security_mnt_opts *opts)
 {
 	int rc = 0, i;
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = superblock_security(sb);
 	char *context = NULL;
 	u32 len;
 	char tmp;
@@ -796,7 +853,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 {
 	const struct cred *cred = current_cred();
 	int rc = 0, i;
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = superblock_security(sb);
 	const char *name = sb->s_type->name;
 	struct dentry *root = sbsec->sb->s_root;
 	struct inode_security_struct *root_isec;
@@ -932,7 +989,8 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 		 * Determine the labeling behavior to use for this
 		 * filesystem type.
 		 */
-		rc = security_fs_use(current_selinux_ns, sb);
+		rc = security_fs_use(current_selinux_ns, sb->s_type->name,
+				     &sbsec->behavior, &sbsec->sid);
 		if (rc) {
 			printk(KERN_WARNING
 				"%s: security_fs_use(%s) returned %d\n",
@@ -1051,8 +1109,8 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 static int selinux_cmp_sb_context(const struct super_block *oldsb,
 				    const struct super_block *newsb)
 {
-	struct superblock_security_struct *old = oldsb->s_security;
-	struct superblock_security_struct *new = newsb->s_security;
+	struct superblock_security_struct *old = superblock_security(oldsb);
+	struct superblock_security_struct *new = superblock_security(newsb);
 	char oldflags = old->flags & SE_MNTMASK;
 	char newflags = new->flags & SE_MNTMASK;
 
@@ -1084,8 +1142,10 @@ static int selinux_sb_clone_mnt_opts(const struct super_block *oldsb,
 					unsigned long *set_kern_flags)
 {
 	int rc = 0;
-	const struct superblock_security_struct *oldsbsec = oldsb->s_security;
-	struct superblock_security_struct *newsbsec = newsb->s_security;
+	const struct superblock_security_struct *oldsbsec =
+		superblock_security(oldsb);
+	struct superblock_security_struct *newsbsec =
+		superblock_security(newsb);
 
 	int set_fscontext =	(oldsbsec->flags & FSCONTEXT_MNT);
 	int set_context =	(oldsbsec->flags & CONTEXT_MNT);
@@ -1122,7 +1182,8 @@ static int selinux_sb_clone_mnt_opts(const struct super_block *oldsb,
 
 	if (newsbsec->behavior == SECURITY_FS_USE_NATIVE &&
 		!(kern_flags & SECURITY_LSM_NATIVE_LABELS) && !set_context) {
-		rc = security_fs_use(current_selinux_ns, newsb);
+		rc = security_fs_use(current_selinux_ns, newsb->s_type->name,
+				     &newsbsec->behavior, &newsbsec->sid);
 		if (rc)
 			goto out;
 	}
@@ -1603,6 +1664,8 @@ static int inode_doinit_with_dentry(struct inode *inode,
 	if (isec->initialized == LABEL_INITIALIZED)
 		return 0;
 
+	sbsec = superblock_security(inode->i_sb);
+
 	spin_lock(&isec->lock);
 	if (isec->initialized == LABEL_INITIALIZED)
 		goto out_unlock;
@@ -1610,7 +1673,6 @@ static int inode_doinit_with_dentry(struct inode *inode,
 	if (isec->sclass == SECCLASS_FILE)
 		isec->sclass = inode_mode_to_security_class(inode->i_mode);
 
-	sbsec = inode->i_sb->s_security;
 	if (!current_selinux_ns->initialized ||
 	    !(sbsec->flags & SE_SBINITIALIZED)) {
 		/* Defer initialization until selinux_complete_init,
@@ -1980,7 +2042,8 @@ selinux_determine_inode_label(const struct task_security_struct *tsec,
 				 const struct qstr *name, u16 tclass,
 				 u32 *_new_isid)
 {
-	const struct superblock_security_struct *sbsec = dir->i_sb->s_security;
+	const struct superblock_security_struct *sbsec =
+		superblock_security(dir->i_sb);
 
 	if ((sbsec->flags & SE_SBINITIALIZED) &&
 	    (sbsec->behavior == SECURITY_FS_USE_MNTPOINT)) {
@@ -2011,7 +2074,7 @@ static int may_create(struct inode *dir,
 	int rc;
 
 	dsec = inode_security(dir);
-	sbsec = dir->i_sb->s_security;
+	sbsec = superblock_security(dir->i_sb);
 
 	sid = tsec->sid;
 
@@ -2160,7 +2223,7 @@ static int superblock_has_perm(const struct cred *cred,
 	struct superblock_security_struct *sbsec;
 	u32 sid = cred_sid(cred);
 
-	sbsec = sb->s_security;
+	sbsec = superblock_security(sb);
 	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
 }
@@ -2885,7 +2948,7 @@ static int selinux_sb_remount(struct super_block *sb, void *data)
 	int rc, i, *flags;
 	struct security_mnt_opts opts;
 	char *secdata, **mount_options;
-	struct superblock_security_struct *sbsec = sb->s_security;
+	struct superblock_security_struct *sbsec = superblock_security(sb);
 
 	if (!(sbsec->flags & SE_SBINITIALIZED))
 		return 0;
@@ -3079,7 +3142,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
 	int rc;
 	char *context;
 
-	sbsec = dir->i_sb->s_security;
+	sbsec = superblock_security(dir->i_sb);
 
 	sid = tsec->sid;
 	newsid = tsec->create_sid;
@@ -3332,7 +3395,7 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (strcmp(name, XATTR_NAME_SELINUX))
 		return selinux_inode_setotherxattr(dentry, name);
 
-	sbsec = inode->i_sb->s_security;
+	sbsec = superblock_security(inode->i_sb);
 	if (!(sbsec->flags & SBLABEL_MNT))
 		return -EOPNOTSUPP;
 
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 04514ee..dba80d3 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -68,7 +68,7 @@ struct file_security_struct {
 };
 
 struct superblock_security_struct {
-	struct super_block *sb;		/* back pointer to sb object */
+	const struct super_block *sb;	/* back pointer to sb object */
 	u32 sid;			/* SID of file system superblock */
 	u32 def_sid;			/* default SID for labeling */
 	u32 mntpoint_sid;		/* SECURITY_FS_USE_MNTPOINT context for files */
@@ -77,6 +77,9 @@ struct superblock_security_struct {
 	struct mutex lock;
 	struct list_head isec_head;
 	spinlock_t isec_lock;
+	struct selinux_ns *ns;
+	struct list_head sbsec_list;
+	spinlock_t sbsec_list_lock;
 };
 
 struct msg_security_struct {
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 005d65c..b80f9bd 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -324,7 +324,8 @@ int security_get_allow_unknown(struct selinux_ns *ns);
 #define SECURITY_FS_USE_NATIVE		7 /* use native label support */
 #define SECURITY_FS_USE_MAX		7 /* Highest SECURITY_FS_USE_XXX */
 
-int security_fs_use(struct selinux_ns *ns, struct super_block *sb);
+int security_fs_use(struct selinux_ns *ns,
+		    const char *fstype, unsigned short *behavior, u32 *sid);
 
 int security_genfs_sid(struct selinux_ns *ns,
 		       const char *fstype, char *name, u16 sclass,
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index e1c3881..abc5383 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2770,16 +2770,17 @@ int security_genfs_sid(struct selinux_ns *ns,
 
 /**
  * security_fs_use - Determine how to handle labeling for a filesystem.
- * @sb: superblock in question
+ * @fstype: filesystem type
+ * @behavior: labeling behavior
+ * @sid: SID for filesystem (superblock)
  */
-int security_fs_use(struct selinux_ns *ns, struct super_block *sb)
+int security_fs_use(struct selinux_ns *ns, const char *fstype,
+		    unsigned short *behavior, u32 *sid)
 {
 	struct policydb *policydb;
 	struct sidtab *sidtab;
 	int rc = 0;
 	struct ocontext *c;
-	struct superblock_security_struct *sbsec = sb->s_security;
-	const char *fstype = sb->s_type->name;
 
 	read_lock(&ns->ss->policy_rwlock);
 
@@ -2794,22 +2795,22 @@ int security_fs_use(struct selinux_ns *ns, struct super_block *sb)
 	}
 
 	if (c) {
-		sbsec->behavior = c->v.behavior;
+		*behavior = c->v.behavior;
 		if (!c->sid[0]) {
 			rc = sidtab_context_to_sid(sidtab, &c->context[0],
 						   &c->sid[0]);
 			if (rc)
 				goto out;
 		}
-		sbsec->sid = c->sid[0];
+		*sid = c->sid[0];
 	} else {
 		rc = __security_genfs_sid(ns, fstype, "/", SECCLASS_DIR,
-					  &sbsec->sid);
+					  sid);
 		if (rc) {
-			sbsec->behavior = SECURITY_FS_USE_NONE;
+			*behavior = SECURITY_FS_USE_NONE;
 			rc = 0;
 		} else {
-			sbsec->behavior = SECURITY_FS_USE_GENFS;
+			*behavior = SECURITY_FS_USE_GENFS;
 		}
 	}
 
-- 
2.9.5

* [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (7 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 08/10] selinux: support per-namespace superblock " Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  2017-10-02 23:56   ` Casey Schaufler
  2017-10-05 15:27   ` Stephen Smalley
  2017-10-02 15:58 ` [RFC 10/10] selinuxfs: restrict write operations to the same " Stephen Smalley
  9 siblings, 2 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

Provide a userspace API to unshare the selinux namespace.
Currently implemented via a selinuxfs node. This could be
coupled with unsharing of other namespaces (e.g.  mount namespace,
network namespace) that will always be needed or left independent.
Don't get hung up on the interface itself, it is just to allow
experimentation and testing.

Sample usage:
echo 1 > /sys/fs/selinux/unshare
unshare -m -n
umount /sys/fs/selinux
mount -t selinuxfs none /sys/fs/selinux
load_policy
getenforce
id
echo $$

The above will show that the process now views itself as running in the
kernel domain in permissive mode, as would be the case at boot.
From a different shell on the host system, running ps -eZ or
cat /proc/<pid>/attr/current will show that the process that
unshared its selinux namespace is still running in its original
context in the initial namespace, and getenforce will show that
the initial namespace remains enforcing.  Enforcing mode or policy
changes in the child will not affect the parent.
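
For a test program that wants to do the first step of the sample usage
without a shell, a minimal C sketch follows.  The function name
request_unshare is hypothetical; the node path is passed in by the caller
and exists only on a kernel carrying this RFC series.  The value is sent
in a single write at offset 0, since the handler rejects writes at a
nonzero offset:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write "1" to the unshare node at @path (e.g. /sys/fs/selinux/unshare
 * on a kernel with this series applied).  Returns 0 on success, or -1
 * with errno set by open() or write().
 */
int request_unshare(const char *path)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int saved;

	if (fd < 0)
		return -1;

	n = write(fd, "1", 1);	/* one write, starting at offset 0 */
	saved = errno;
	close(fd);
	errno = saved;		/* report the write() error, not close()'s */

	return n == 1 ? 0 : -1;
}
```

As with the shell version, it is the calling process itself that ends up
in the new selinux namespace, so the remaining steps (unshare -m -n,
remounting selinuxfs, load_policy) would need to happen in that process
or its children.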

This is not yet safe; do not use on production systems.
Known issues include at least the following items:

* The policy loading code has not been thoroughly audited
and hardened for use by unprivileged code, both with respect to
ensuring that the policy is internally consistent and restricting
the range of values used from the policy as loop bounds and memory
allocation sizes to sane limits.

* The SELinux hook functions have not been modified to be
namespace-aware, so the hooks only perform checking against the
current namespace.  Thus, unsharing allows the process to escape
confinement by the parent.  Fixing this requires updating each hook to
perform its processing on the current namespace and all of its ancestors
up to the init namespace.

* Some of the hook functions can be called outside of process context
(e.g. task_kill, send_sigiotask, network input/forward) and should not use
the current task's selinux namespace. These hooks need to be updated to
obtain the proper selinux namespace to use instead from the caller or
cached in a suitable data structure (e.g. the file or sock security
structures).

* There are a number of issues with the inode and superblock security blob
handling for multiple namespaces, see those commits for more details.

* Only a subset of object security blobs have been updated to
be namespace-aware and support multiple namespaces.  The ones that
have not yet been updated could end up performing permission checks or
other operations on SIDs created in a different selinux namespace.

* The network SID caches (netif, netnode, netport) have not yet
been instantiated per selinux namespace, unlike the AVC and SS.

* There is no way currently to restrict or bound nesting of
namespaces; if you allow it to a domain in the init namespace,
then that domain can in turn unshare to arbitrary depths and can
grant the same to any domain in its own policy.  Related to this
is the fact that there is no way to control resource usage due to
selinux namespaces and they can be substantial (per-namespace
policydb, sidtab, AVC, etc).

* SIDs may be cached by audit and networking code and in external
kernel data structures and used later, potentially in a different
selinux namespace than the one in which the SID was originally created.

* No doubt other things I'm forgetting or haven't thought of.
Use at your own risk.
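
To make the second item above concrete, here is a userspace sketch of
checking a permission against the current namespace and all of its
ancestors.  It is an assumption-laden model, not kernel code: struct ns,
ns_has_perm and the two sample policies are hypothetical, and the same
subject/object identifiers are reused in every namespace, whereas a real
implementation would need the task's SID in each ancestor namespace:

```c
#include <stdbool.h>
#include <stddef.h>

enum { PERM_READ, PERM_WRITE };

/* Hypothetical model: each namespace carries its own decision function. */
struct ns {
	struct ns *parent;	/* NULL for the init namespace */
	bool (*allowed)(int subject, int object, int perm);
};

/* Sample policies, for illustration only. */
static bool allow_all(int subject, int object, int perm)
{
	(void)subject; (void)object; (void)perm;
	return true;
}

static bool deny_write(int subject, int object, int perm)
{
	(void)subject; (void)object;
	return perm != PERM_WRITE;
}

/*
 * Grant access only if @ns and every ancestor up to the init namespace
 * allow it.  Returns 0 if all agree, -1 on the first denial.
 */
int ns_has_perm(const struct ns *ns, int subject, int object, int perm)
{
	for (; ns; ns = ns->parent)
		if (!ns->allowed(subject, object, perm))
			return -1;
	return 0;
}
```

With this shape, a permissive child namespace cannot widen what its
parent permits: a child using allow_all under a parent using deny_write
is still refused PERM_WRITE.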

Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/include/classmap.h |  3 +-
 security/selinux/selinuxfs.c        | 66 +++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 35ffb29..82c8f9c 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
 	  { "compute_av", "compute_create", "compute_member",
 	    "check_context", "load_policy", "compute_relabel",
 	    "compute_user", "setenforce", "setbool", "setsecparam",
-	    "setcheckreqprot", "read_policy", "validate_trans", NULL } },
+	    "setcheckreqprot", "read_policy", "validate_trans", "unshare",
+	    NULL } },
 	{ "process",
 	  { "fork", "transition", "sigchld", "sigkill",
 	    "sigstop", "signull", "signal", "ptrace", "getsched", "setsched",
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index a7e6bdb..dedb3cc9 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -63,6 +63,7 @@ enum sel_inos {
 	SEL_STATUS,	/* export current status using mmap() */
 	SEL_POLICY,	/* allow userspace to read the in kernel policy */
 	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
+	SEL_UNSHARE,	    /* unshare selinux namespace */
 	SEL_INO_NEXT,	/* The next inode number to use */
 };
 
@@ -321,6 +322,70 @@ static const struct file_operations sel_disable_ops = {
 	.llseek		= generic_file_llseek,
 };
 
+static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
+				 size_t count, loff_t *ppos)
+
+{
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
+	char *page;
+	ssize_t length;
+	bool set;
+	int rc;
+
+	if (count >= PAGE_SIZE)
+		return -ENOMEM;
+
+	/* No partial writes. */
+	if (*ppos != 0)
+		return -EINVAL;
+
+	rc = avc_has_perm(current_selinux_ns, current_sid(),
+			  SECINITSID_SECURITY, SECCLASS_SECURITY,
+			  SECURITY__UNSHARE, NULL);
+	if (rc)
+		return rc;
+
+	page = memdup_user_nul(buf, count);
+	if (IS_ERR(page))
+		return PTR_ERR(page);
+
+	length = -EINVAL;
+	if (kstrtobool(page, &set))
+		goto out;
+
+	if (set) {
+		struct cred *cred = prepare_creds();
+		struct task_security_struct *tsec;
+
+		if (!cred) {
+			length = -ENOMEM;
+			goto out;
+		}
+		tsec = cred->security;
+		if (selinux_ns_create(ns, &tsec->ns)) {
+			abort_creds(cred);
+			length = -ENOMEM;
+			goto out;
+		}
+		tsec->osid = tsec->sid = SECINITSID_KERNEL;
+		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
+			tsec->sockcreate_sid = SECSID_NULL;
+		tsec->parent_cred = get_current_cred();
+		commit_creds(cred);
+	}
+
+	length = count;
+out:
+	kfree(page);
+	return length;
+}
+
+static const struct file_operations sel_unshare_ops = {
+	.write		= sel_write_unshare,
+	.llseek		= generic_file_llseek,
+};
+
 static ssize_t sel_read_policyvers(struct file *filp, char __user *buf,
 				   size_t count, loff_t *ppos)
 {
@@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUGO},
 		[SEL_VALIDATE_TRANS] = {"validatetrans", &sel_transition_ops,
 					S_IWUGO},
+		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops, 0222},
 		/* last one */ {""}
 	};
 
-- 
2.9.5

* [RFC 10/10] selinuxfs: restrict write operations to the same selinux namespace
  2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
                   ` (8 preceding siblings ...)
  2017-10-02 15:58 ` [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
@ 2017-10-02 15:58 ` Stephen Smalley
  9 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-02 15:58 UTC (permalink / raw)
  To: selinux; +Cc: paul, jmorris, Stephen Smalley

This ensures that once a process unshares its selinux namespace,
it can no longer act on the parent namespace's selinuxfs instance,
irrespective of policy.  This is a safety measure so that even if
an otherwise unconfined process unshares its selinux namespace, it
won't be able to subsequently affect the enforcing mode or policy of the
parent.  This also helps avoid common mistakes like failing to create
a mount namespace and mount a new selinuxfs instance in order to act
on one's own selinux namespace after unsharing.
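
The check itself is small.  A userspace model of it, with hypothetical
names (struct fs_instance, fs_write_enforce) and the caller's namespace
passed explicitly rather than read from the current task, might look
like:

```c
#include <errno.h>

/* Hypothetical stand-ins for the namespace and a selinuxfs instance. */
struct ns {
	int id;
};

struct fs_instance {
	struct ns *ns;		/* namespace this instance was mounted for */
	int enforcing;
};

/*
 * Update the enforcing flag only when the caller is in the namespace
 * that owns this filesystem instance; otherwise fail with -EPERM
 * before any other processing.
 */
int fs_write_enforce(struct fs_instance *fsi, const struct ns *caller_ns,
		     int value)
{
	if (fsi->ns != caller_ns)
		return -EPERM;

	fsi->enforcing = value;
	return 0;
}
```

Because the comparison happens first, the result does not depend on what
the policy of either namespace would allow.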

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/selinuxfs.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index dedb3cc9..6c52d24 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -144,6 +144,9 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 	ssize_t length;
 	int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -283,6 +286,9 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf,
 	ssize_t length;
 	int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -333,6 +339,9 @@ static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
 	bool set;
 	int rc;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -605,6 +614,9 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 	ssize_t length;
 	void *data = NULL;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_selinux_ns,
@@ -716,6 +728,9 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	ssize_t length;
 	unsigned int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	length = avc_has_perm(current_selinux_ns,
 			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETCHECKREQPROT,
@@ -762,6 +777,9 @@ static ssize_t sel_write_validatetrans(struct file *file,
 	u16 tclass;
 	int rc;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	rc = avc_has_perm(current_selinux_ns,
 			  current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__VALIDATE_TRANS, NULL);
@@ -849,6 +867,8 @@ static ssize_t (*write_op[])(struct file *, char *, size_t) = {
 
 static ssize_t selinux_transaction_write(struct file *file, const char __user *buf, size_t size, loff_t *pos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	ino_t ino = file_inode(file)->i_ino;
 	char *data;
 	ssize_t rv;
@@ -856,6 +876,9 @@ static ssize_t selinux_transaction_write(struct file *file, const char __user *b
 	if (ino >= ARRAY_SIZE(write_op) || !write_op[ino])
 		return -EINVAL;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	data = simple_transaction_get(file, buf, size);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
@@ -1279,12 +1302,16 @@ static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 			      size_t count, loff_t *ppos)
 {
 	struct selinux_fs_info *fsi = file_inode(filep)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
 	char *page = NULL;
 	ssize_t length;
 	int new_value;
 	unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
 	const char *name = filep->f_path.dentry->d_name.name;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_selinux_ns,
@@ -1347,6 +1374,9 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 	ssize_t length;
 	int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_selinux_ns,
@@ -1511,6 +1541,9 @@ static ssize_t sel_write_avc_cache_threshold(struct file *file,
 	ssize_t ret;
 	unsigned int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	ret = avc_has_perm(current_selinux_ns,
 			   current_sid(), SECINITSID_SECURITY,
 			   SECCLASS_SECURITY, SECURITY__SETSECPARAM,
-- 
2.9.5

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-02 15:58 ` [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
@ 2017-10-02 23:56   ` Casey Schaufler
  2017-10-03 12:29     ` Stephen Smalley
  2017-10-05 15:27   ` Stephen Smalley
  1 sibling, 1 reply; 39+ messages in thread
From: Casey Schaufler @ 2017-10-02 23:56 UTC (permalink / raw)
  To: Stephen Smalley, selinux

On 10/2/2017 8:58 AM, Stephen Smalley wrote:
> Provide a userspace API to unshare the selinux namespace.
> Currently implemented via a selinuxfs node. This could be
> coupled with unsharing of other namespaces (e.g.  mount namespace,
> network namespace) that will always be needed or left independent.
> Don't get hung up on the interface itself, it is just to allow
> experimentation and testing.
>
> Sample usage:
> echo 1 > /sys/fs/selinux/unshare
> unshare -m -n
> umount /sys/fs/selinux
> mount -t selinuxfs none /sys/fs/selinux
> load_policy
> getenforce
> id
> echo $$
>
> The above will show that the process now views itself as running in the
> kernel domain in permissive mode, as would be the case at boot.
> From a different shell on the host system, running ps -eZ or
> cat /proc/<pid>/attr/current will show that the process that
> unshared its selinux namespace is still running in its original
> context in the initial namespace, and getenforce will show that
> the initial namespace remains enforcing.  Enforcing mode or policy
> changes in the child will not affect the parent.
>
> This is not yet safe; do not use on production systems.
> Known issues include at least the following items:
>
> * The policy loading code has not been thoroughly audited
> and hardened for use by unprivileged code, both with respect to
> ensuring that the policy is internally consistent and restricting
> the range of values used from the policy as loop bounds and memory
> allocation sizes to sane limits.
>
> * The SELinux hook functions have not been modified to be
> namespace-aware, so the hooks only perform checking against the
> current namespace.  Thus, unsharing allows the process to escape
> confinement by the parent.  Fixing this requires updating each hook to
> perform its processing on the current namespace and all of its ancestors
> up to the init namespace.
>
> * Some of the hook functions can be called outside of process context
> (e.g. task_kill, send_sigiotask, network input/forward) and should not use
> the current task's selinux namespace. These hooks need to be updated to
> obtain the proper selinux namespace to use instead from the caller or
> cached in a suitable data structure (e.g. the file or sock security
> structures).
>
> * There are a number of issues with the inode and superblock security blob
> handling for multiple namespaces, see those commits for more details.
>
> * Only a subset of object security blobs have been updated to
> be namespace-aware and support multiple namespaces.  The ones that
> have not yet been updated could end up performing permission checks or
> other operations on SIDs created in a different selinux namespace.
>
> * The network SID caches (netif, netnode, netport) have not yet
> been instantiated per selinux namespace, unlike the AVC and SS.
>
> * There is no way currently to restrict or bound nesting of
> namespaces; if you allow it to a domain in the init namespace,
> then that domain can in turn unshare to arbitrary depths and can
> grant the same to any domain in its own policy.  Related to this
> is the fact that there is no way to control resource usage due to
> selinux namespaces and they can be substantial (per-namespace
> policydb, sidtab, AVC, etc).
>
> * SIDs may be cached by audit and networking code and in external
> kernel data structures and used later, potentially in a different
> selinux namespace than the one in which the SID was originally created.

Is there a good reason that SIDs (and security contexts) need to
be maintained separately in the namespaces? Using the same secid
to map to a different context depending on the namespace seems like
you're asking for trouble you don't need. A namespace that hasn't
policy for a context/SID won't use it if it is defined for a
different namespace, and should detect the fact if it somehow gets
referenced, because it isn't in the policy.

Do the context/SID mapping globally. Or, if you must duplicate contexts,
allocate the SIDs from a single source so that they aren't ambiguous.
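
A rough userspace sketch of the second option, with SIDs drawn from one
global counter while each namespace keeps its own context table, is
below.  Everything in it is hypothetical (struct ns, NS_MAX_SIDS,
ns_context_to_sid, ns_sid_to_context) and it ignores locking of the
per-namespace table:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NS_MAX_SIDS 16

/* Single source of SIDs for all namespaces; 0 is reserved as "no SID". */
static atomic_uint next_sid = 1;

struct ns {
	unsigned int sids[NS_MAX_SIDS];
	const char *ctx[NS_MAX_SIDS];
	size_t count;
};

/*
 * Map @ctx to a SID in @ns, allocating a globally unique SID the first
 * time the context is seen there.  Returns 0 if the table is full.
 * The caller must keep the @ctx string alive.
 */
unsigned int ns_context_to_sid(struct ns *ns, const char *ctx)
{
	for (size_t i = 0; i < ns->count; i++)
		if (strcmp(ns->ctx[i], ctx) == 0)
			return ns->sids[i];

	if (ns->count == NS_MAX_SIDS)
		return 0;

	ns->sids[ns->count] = atomic_fetch_add(&next_sid, 1);
	ns->ctx[ns->count] = ctx;
	return ns->sids[ns->count++];
}

/* Return the context for @sid in @ns, or NULL if @ns never issued it. */
const char *ns_sid_to_context(const struct ns *ns, unsigned int sid)
{
	for (size_t i = 0; i < ns->count; i++)
		if (ns->sids[i] == sid)
			return ns->ctx[i];
	return NULL;
}
```

In this model a SID that leaks into another namespace resolves to
nothing there, so the mismatch is detectable instead of silently mapping
to an unrelated context.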

>
> * No doubt other things I'm forgetting or haven't thought of.
> Use at your own risk.
>
> Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
> ---
>  security/selinux/include/classmap.h |  3 +-
>  security/selinux/selinuxfs.c        | 66 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
> index 35ffb29..82c8f9c 100644
> --- a/security/selinux/include/classmap.h
> +++ b/security/selinux/include/classmap.h
> @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
>  	  { "compute_av", "compute_create", "compute_member",
>  	    "check_context", "load_policy", "compute_relabel",
>  	    "compute_user", "setenforce", "setbool", "setsecparam",
> -	    "setcheckreqprot", "read_policy", "validate_trans", NULL } },
> +	    "setcheckreqprot", "read_policy", "validate_trans", "unshare",
> +	    NULL } },
>  	{ "process",
>  	  { "fork", "transition", "sigchld", "sigkill",
>  	    "sigstop", "signull", "signal", "ptrace", "getsched", "setsched",
> diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
> index a7e6bdb..dedb3cc9 100644
> --- a/security/selinux/selinuxfs.c
> +++ b/security/selinux/selinuxfs.c
> @@ -63,6 +63,7 @@ enum sel_inos {
>  	SEL_STATUS,	/* export current status using mmap() */
>  	SEL_POLICY,	/* allow userspace to read the in kernel policy */
>  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
> +	SEL_UNSHARE,	    /* unshare selinux namespace */
>  	SEL_INO_NEXT,	/* The next inode number to use */
>  };
>  
> @@ -321,6 +322,70 @@ static const struct file_operations sel_disable_ops = {
>  	.llseek		= generic_file_llseek,
>  };
>  
> +static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
> +				 size_t count, loff_t *ppos)
> +
> +{
> +	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
> +	struct selinux_ns *ns = fsi->ns;
> +	char *page;
> +	ssize_t length;
> +	bool set;
> +	int rc;
> +
> +	if (count >= PAGE_SIZE)
> +		return -ENOMEM;
> +
> +	/* No partial writes. */
> +	if (*ppos != 0)
> +		return -EINVAL;
> +
> +	rc = avc_has_perm(current_selinux_ns, current_sid(),
> +			  SECINITSID_SECURITY, SECCLASS_SECURITY,
> +			  SECURITY__UNSHARE, NULL);
> +	if (rc)
> +		return rc;
> +
> +	page = memdup_user_nul(buf, count);
> +	if (IS_ERR(page))
> +		return PTR_ERR(page);
> +
> +	length = -EINVAL;
> +	if (kstrtobool(page, &set))
> +		goto out;
> +
> +	if (set) {
> +		struct cred *cred = prepare_creds();
> +		struct task_security_struct *tsec;
> +
> +		if (!cred) {
> +			length = -ENOMEM;
> +			goto out;
> +		}
> +		tsec = cred->security;
> +		if (selinux_ns_create(ns, &tsec->ns)) {
> +			abort_creds(cred);
> +			length = -ENOMEM;
> +			goto out;
> +		}
> +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
> +		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
> +			tsec->sockcreate_sid = SECSID_NULL;
> +		tsec->parent_cred = get_current_cred();
> +		commit_creds(cred);
> +	}
> +
> +	length = count;
> +out:
> +	kfree(page);
> +	return length;
> +}
> +
> +static const struct file_operations sel_unshare_ops = {
> +	.write		= sel_write_unshare,
> +	.llseek		= generic_file_llseek,
> +};
> +
>  static ssize_t sel_read_policyvers(struct file *filp, char __user *buf,
>  				   size_t count, loff_t *ppos)
>  {
> @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
>  		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUGO},
>  		[SEL_VALIDATE_TRANS] = {"validatetrans", &sel_transition_ops,
>  					S_IWUGO},
> +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops, 0222},
>  		/* last one */ {""}
>  	};
>  


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-02 23:56   ` Casey Schaufler
@ 2017-10-03 12:29     ` Stephen Smalley
  2017-10-03 17:14       ` Casey Schaufler
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-03 12:29 UTC (permalink / raw)
  To: Casey Schaufler, selinux, James Morris, Paul Moore

On Mon, 2017-10-02 at 16:56 -0700, Casey Schaufler wrote:
> On 10/2/2017 8:58 AM, Stephen Smalley wrote:
> > Provide a userspace API to unshare the selinux namespace.
> > Currently implemented via a selinuxfs node. This could be
> > coupled with unsharing of other namespaces (e.g.  mount namespace,
> > network namespace) that will always be needed or left independent.
> > Don't get hung up on the interface itself, it is just to allow
> > experimentation and testing.
> > 
> > Sample usage:
> > echo 1 > /sys/fs/selinux/unshare
> > unshare -m -n
> > umount /sys/fs/selinux
> > mount -t selinuxfs none /sys/fs/selinux
> > load_policy
> > getenforce
> > id
> > echo $$
> > 
> > The above will show that the process now views itself as running in
> > the
> > kernel domain in permissive mode, as would be the case at boot.
> > From a different shell on the host system, running ps -eZ or
> > cat /proc/<pid>/attr/current will show that the process that
> > unshared its selinux namespace is still running in its original
> > context in the initial namespace, and getenforce will show that
> > the initial namespace remains enforcing.  Enforcing mode or policy
> > changes in the child will not affect the parent.
> > 
> > This is not yet safe; do not use on production systems.
> > Known issues include at least the following items:
> > 
> > * The policy loading code has not been thoroughly audited
> > and hardened for use by unprivileged code, both with respect to
> > ensuring that the policy is internally consistent and restricting
> > the range of values used from the policy as loop bounds and memory
> > allocation sizes to sane limits.
> > 
> > * The SELinux hook functions have not been modified to be
> > namespace-aware, so the hooks only perform checking against the
> > current namespace.  Thus, unsharing allows the process to escape
> > confinement by the parent.  Fixing this requires updating each hook
> > to
> > perform its processing on the current namespace and all of its
> > ancestors
> > up to the init namespace.
> > 
> > * Some of the hook functions can be called outside of process
> > context
> > (e.g. task_kill, send_sigiotask, network input/forward) and should
> > not use
> > the current task's selinux namespace. These hooks need to be
> > updated to
> > obtain the proper selinux namespace to use instead from the caller
> > or
> > cached in a suitable data structure (e.g. the file or sock security
> > structures).
> > 
> > * There are a number of issues with the inode and superblock security
> > blob
> > handling for multiple namespaces, see those commits for more
> > details.
> > 
> > * Only a subset of object security blobs have been updated to
> > be namespace-aware and support multiple namespaces.  The ones that
> > have not yet been updated could end up performing permission checks
> > or
> > other operations on SIDs created in a different selinux namespace.
> > 
> > * The network SID caches (netif, netnode, netport) have not yet
> > been instantiated per selinux namespace, unlike the AVC and SS.
> > 
> > * There is no way currently to restrict or bound nesting of
> > namespaces; if you allow it to a domain in the init namespace,
> > then that domain can in turn unshare to arbitrary depths and can
> > grant the same to any domain in its own policy.  Related to this
> > is the fact that there is no way to control resource usage due to
> > selinux namespaces and they can be substantial (per-namespace
> > policydb, sidtab, AVC, etc).
> > 
> > * SIDs may be cached by audit and networking code and in external
> > kernel data structures and used later, potentially in a different
> > selinux namespace than the one in which the SID was originally
> > created.
> 
> Is there a good reason that SIDs (and security contexts) need to
> be maintained separately in the namespaces? Using the same secid
> to map to a different context depending on the namespace seems like
> you're asking for trouble you don't need. A namespace that hasn't
> policy for a context/SID won't use it if it is defined for a
> different namespace, and should detect the fact if it somehow gets
> referenced, because it isn't in the policy.
> 
> Do the context/SID mapping globally. Or, if you must duplicate
> contexts,
> allocate the SIDs from a single source so that they aren't ambiguous.

Yes, that may be the right answer, but it requires introducing a new
SID/context layer above the current security server layer (or changing
the latter), because the security server SID/context mappings are from
SIDs to internal struct representations of the context, where the
struct representation is policy-specific.  Also, certain SIDs (e.g. the
kernel SID, the unlabeled SID, etc) are predefined for system
initialization before policy load and to support usage from policy-
independent code, but may be mapped to different context values by
different policies.  So even the SID->string mappings may differ for
different policies, and hence for different namespaces.
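
To illustrate the point about predefined SIDs, here is a minimal
userspace sketch (not kernel code).  The enum, struct, and function
names are hypothetical, the SID values are model values, and the
context strings are only examples of what a reference-policy host and
an Android-style policy might assign to the same initial SIDs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a few initial SIDs; values are illustrative. */
enum {
	MODEL_SID_KERNEL    = 1,
	MODEL_SID_UNLABELED = 3,
	MODEL_SID_MAX       = 4,
};

/* One modelled selinux namespace: its policy decides what each SID means. */
struct model_ns {
	const char *name;
	const char *initial_ctx[MODEL_SID_MAX];	/* indexed by SID */
};

/* Example contexts only; assumed, not taken from the patch series. */
static const struct model_ns model_host_ns = {
	.name = "host",
	.initial_ctx = {
		[MODEL_SID_KERNEL]    = "system_u:system_r:kernel_t:s0",
		[MODEL_SID_UNLABELED] = "system_u:object_r:unlabeled_t:s0",
	},
};

static const struct model_ns model_android_ns = {
	.name = "android",
	.initial_ctx = {
		[MODEL_SID_KERNEL]    = "u:r:kernel:s0",
		[MODEL_SID_UNLABELED] = "u:object_r:unlabeled:s0",
	},
};

/*
 * Resolve a SID in a given namespace.  The same numeric SID yields a
 * different string depending on which namespace's policy is consulted.
 * Returns NULL if the SID is out of range or unmapped in this model.
 */
static const char *model_sid_to_context(const struct model_ns *ns,
					uint32_t sid)
{
	if (sid >= MODEL_SID_MAX)
		return NULL;
	return ns->initial_ctx[sid];
}
```

In this model, looking up MODEL_SID_KERNEL in the two namespaces gives
two different strings, which is why a purely global SID->string table
does not fit the initial SIDs without an extra layer.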

> 
> > 
> > * No doubt other things I'm forgetting or haven't thought of.
> > Use at your own risk.
> > 
> > Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
> > ---
> >  security/selinux/include/classmap.h |  3 +-
> >  security/selinux/selinuxfs.c        | 66
> > +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 68 insertions(+), 1 deletion(-)
> > 
> > diff --git a/security/selinux/include/classmap.h
> > b/security/selinux/include/classmap.h
> > index 35ffb29..82c8f9c 100644
> > --- a/security/selinux/include/classmap.h
> > +++ b/security/selinux/include/classmap.h
> > @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
> >  	  { "compute_av", "compute_create", "compute_member",
> >  	    "check_context", "load_policy", "compute_relabel",
> >  	    "compute_user", "setenforce", "setbool",
> > "setsecparam",
> > -	    "setcheckreqprot", "read_policy", "validate_trans",
> > NULL } },
> > +	    "setcheckreqprot", "read_policy", "validate_trans",
> > "unshare",
> > +	    NULL } },
> >  	{ "process",
> >  	  { "fork", "transition", "sigchld", "sigkill",
> >  	    "sigstop", "signull", "signal", "ptrace", "getsched",
> > "setsched",
> > diff --git a/security/selinux/selinuxfs.c
> > b/security/selinux/selinuxfs.c
> > index a7e6bdb..dedb3cc9 100644
> > --- a/security/selinux/selinuxfs.c
> > +++ b/security/selinux/selinuxfs.c
> > @@ -63,6 +63,7 @@ enum sel_inos {
> >  	SEL_STATUS,	/* export current status using mmap()
> > */
> >  	SEL_POLICY,	/* allow userspace to read the in
> > kernel policy */
> >  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
> > +	SEL_UNSHARE,	    /* unshare selinux namespace */
> >  	SEL_INO_NEXT,	/* The next inode number to use */
> >  };
> >  
> > @@ -321,6 +322,70 @@ static const struct file_operations
> > sel_disable_ops = {
> >  	.llseek		= generic_file_llseek,
> >  };
> >  
> > +static ssize_t sel_write_unshare(struct file *file, const char
> > __user *buf,
> > +				 size_t count, loff_t *ppos)
> > +
> > +{
> > +	struct selinux_fs_info *fsi = file_inode(file)->i_sb-
> > >s_fs_info;
> > +	struct selinux_ns *ns = fsi->ns;
> > +	char *page;
> > +	ssize_t length;
> > +	bool set;
> > +	int rc;
> > +
> > +	if (count >= PAGE_SIZE)
> > +		return -ENOMEM;
> > +
> > +	/* No partial writes. */
> > +	if (*ppos != 0)
> > +		return -EINVAL;
> > +
> > +	rc = avc_has_perm(current_selinux_ns, current_sid(),
> > +			  SECINITSID_SECURITY, SECCLASS_SECURITY,
> > +			  SECURITY__UNSHARE, NULL);
> > +	if (rc)
> > +		return rc;
> > +
> > +	page = memdup_user_nul(buf, count);
> > +	if (IS_ERR(page))
> > +		return PTR_ERR(page);
> > +
> > +	length = -EINVAL;
> > +	if (kstrtobool(page, &set))
> > +		goto out;
> > +
> > +	if (set) {
> > +		struct cred *cred = prepare_creds();
> > +		struct task_security_struct *tsec;
> > +
> > +		if (!cred) {
> > +			length = -ENOMEM;
> > +			goto out;
> > +		}
> > +		tsec = cred->security;
> > +		if (selinux_ns_create(ns, &tsec->ns)) {
> > +			abort_creds(cred);
> > +			length = -ENOMEM;
> > +			goto out;
> > +		}
> > +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
> > +		tsec->exec_sid = tsec->create_sid = tsec-
> > >keycreate_sid =
> > +			tsec->sockcreate_sid = SECSID_NULL;
> > +		tsec->parent_cred = get_current_cred();
> > +		commit_creds(cred);
> > +	}
> > +
> > +	length = count;
> > +out:
> > +	kfree(page);
> > +	return length;
> > +}
> > +
> > +static const struct file_operations sel_unshare_ops = {
> > +	.write		= sel_write_unshare,
> > +	.llseek		= generic_file_llseek,
> > +};
> > +
> >  static ssize_t sel_read_policyvers(struct file *filp, char __user
> > *buf,
> >  				   size_t count, loff_t *ppos)
> >  {
> > @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block
> > *sb, void *data, int silent)
> >  		[SEL_POLICY] = {"policy", &sel_policy_ops,
> > S_IRUGO},
> >  		[SEL_VALIDATE_TRANS] = {"validatetrans",
> > &sel_transition_ops,
> >  					S_IWUGO},
> > +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops,
> > 0222},
> >  		/* last one */ {""}
> >  	};
> >  
> 
> 
> .

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-03 12:29     ` Stephen Smalley
@ 2017-10-03 17:14       ` Casey Schaufler
  0 siblings, 0 replies; 39+ messages in thread
From: Casey Schaufler @ 2017-10-03 17:14 UTC (permalink / raw)
  To: Stephen Smalley, selinux, James Morris, Paul Moore

On 10/3/2017 5:29 AM, Stephen Smalley wrote:
> On Mon, 2017-10-02 at 16:56 -0700, Casey Schaufler wrote:
>> On 10/2/2017 8:58 AM, Stephen Smalley wrote:
>>> Provide a userspace API to unshare the selinux namespace.
>>> Currently implemented via a selinuxfs node. This could be
>>> coupled with unsharing of other namespaces (e.g.  mount namespace,
>>> network namespace) that will always be needed or left independent.
>>> Don't get hung up on the interface itself, it is just to allow
>>> experimentation and testing.
>>>
>>> Sample usage:
>>> echo 1 > /sys/fs/selinux/unshare
>>> unshare -m -n
>>> umount /sys/fs/selinux
>>> mount -t selinuxfs none /sys/fs/selinux
>>> load_policy
>>> getenforce
>>> id
>>> echo $$
>>>
>>> The above will show that the process now views itself as running in
>>> the
>>> kernel domain in permissive mode, as would be the case at boot.
>>> From a different shell on the host system, running ps -eZ or
>>> cat /proc/<pid>/attr/current will show that the process that
>>> unshared its selinux namespace is still running in its original
>>> context in the initial namespace, and getenforce will show that
>>> the initial namespace remains enforcing.  Enforcing mode or policy
>>> changes in the child will not affect the parent.
>>>
>>> This is not yet safe; do not use on production systems.
>>> Known issues include at least the following items:
>>>
>>> * The policy loading code has not been thoroughly audited
>>> and hardened for use by unprivileged code, both with respect to
>>> ensuring that the policy is internally consistent and restricting
>>> the range of values used from the policy as loop bounds and memory
>>> allocation sizes to sane limits.
>>>
>>> * The SELinux hook functions have not been modified to be
>>> namespace-aware, so the hooks only perform checking against the
>>> current namespace.  Thus, unsharing allows the process to escape
>>> confinement by the parent.  Fixing this requires updating each hook
>>> to
>>> perform its processing on the current namespace and all of its
>>> ancestors
>>> up to the init namespace.
>>>
>>> * Some of the hook functions can be called outside of process
>>> context
>>> (e.g. task_kill, send_sigiotask, network input/forward) and should
>>> not use
>>> the current task's selinux namespace. These hooks need to be
>>> updated to
>>> obtain the proper selinux namespace to use instead from the caller
>>> or
>>> cached in a suitable data structure (e.g. the file or sock security
>>> structures).
>>>
>>> * There are a number of issues with the inode and superblock security
>>> blob
>>> handling for multiple namespaces, see those commits for more
>>> details.
>>>
>>> * Only a subset of object security blobs have been updated to
>>> be namespace-aware and support multiple namespaces.  The ones that
>>> have not yet been updated could end up performing permission checks
>>> or
>>> other operations on SIDs created in a different selinux namespace.
>>>
>>> * The network SID caches (netif, netnode, netport) have not yet
>>> been instantiated per selinux namespace, unlike the AVC and SS.
>>>
>>> * There is no way currently to restrict or bound nesting of
>>> namespaces; if you allow it to a domain in the init namespace,
>>> then that domain can in turn unshare to arbitrary depths and can
>>> grant the same to any domain in its own policy.  Related to this
>>> is the fact that there is no way to control resource usage due to
>>> selinux namespaces and they can be substantial (per-namespace
>>> policydb, sidtab, AVC, etc).
>>>
>>> * SIDs may be cached by audit and networking code and in external
>>> kernel data structures and used later, potentially in a different
>>> selinux namespace than the one in which the SID was originally
>>> created.
>> Is there a good reason that SIDs (and security contexts) need to
>> be maintained separately in the namespaces? Using the same secid
>> to map to a different context depending on the namespace seems like
>> you're asking for trouble you don't need. A namespace that hasn't
>> policy for a context/SID won't use it if it is defined for a
>> different namespace, and should detect the fact if it somehow gets
>> referenced, because it isn't in the policy.
>>
>> Do the context/SID mapping globally. Or, if you must duplicate
>> contexts,
>> allocate the SIDs from a single source so that they aren't ambiguous.
> Yes, that may be the right answer, but it requires introducing a new
> SID/context layer above the current security server layer (or changing
> the latter),

It could be as simple as using a global next_sid instead of attaching
that to the sidtab (s->next_sid).
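
A rough userspace sketch of that idea, assuming a single atomic
counter shared by every per-namespace table.  This is a model only:
struct model_sidtab, model_global_next_sid, and the first dynamic SID
value are hypothetical and do not reflect the kernel's actual sidtab
layout or locking.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed: values below this are reserved for predefined initial SIDs. */
#define MODEL_SID_FIRST_DYNAMIC 100u

/* One counter for all namespaces, instead of one per table. */
static _Atomic uint32_t model_global_next_sid = MODEL_SID_FIRST_DYNAMIC;

/* Hypothetical per-namespace table; only bookkeeping is modelled. */
struct model_sidtab {
	uint32_t nel;		/* SIDs allocated on behalf of this table */
	uint32_t last_sid;	/* most recent SID handed to this table */
};

/*
 * Allocate a SID that is unique across all tables.
 * Returns 0 when the SID space is exhausted.
 */
static uint32_t model_sidtab_alloc(struct model_sidtab *s)
{
	uint32_t cur = atomic_load(&model_global_next_sid);

	do {
		if (cur == UINT32_MAX)
			return 0;	/* refuse to wrap into reserved values */
		/* On failure, cur is reloaded with the current counter value. */
	} while (!atomic_compare_exchange_weak(&model_global_next_sid,
					       &cur, cur + 1));

	s->nel++;
	s->last_sid = cur;
	return cur;
}
```

With this scheme two namespaces can never hand out the same dynamic
SID, so a cached value identifies at most one context.  It does not by
itself address the predefined initial SIDs, which are fixed values in
every namespace.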

What I'm really afraid of is an ambiguous secid being set in
a secmark. In the stacking case it's going to be bad enough
mapping a set of secids into a "token" without having to deal
with the possibility that SELinux could give you a secid that
has multiple possible mappings depending on which namespace it
came from and/or which it is going into.


>  because the security server SID/context mappings are from
> SIDs to internal struct representations of the context, where the
> struct representation is policy-specific.  Also, certain SIDs (e.g. the
> kernel SID, the unlabeled SID, etc) are predefined for system
> initialization before policy load and to support usage from policy-
> independent code, but may be mapped to different context values by
> different policies.  So even the SID->string mappings may differ for
> different policies, and hence for different namespaces.
>
>>> * No doubt other things I'm forgetting or haven't thought of.
>>> Use at your own risk.
>>>
>>> Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
>>> ---
>>>  security/selinux/include/classmap.h |  3 +-
>>>  security/selinux/selinuxfs.c        | 66
>>> +++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 68 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/security/selinux/include/classmap.h
>>> b/security/selinux/include/classmap.h
>>> index 35ffb29..82c8f9c 100644
>>> --- a/security/selinux/include/classmap.h
>>> +++ b/security/selinux/include/classmap.h
>>> @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
>>>  	  { "compute_av", "compute_create", "compute_member",
>>>  	    "check_context", "load_policy", "compute_relabel",
>>>  	    "compute_user", "setenforce", "setbool",
>>> "setsecparam",
>>> -	    "setcheckreqprot", "read_policy", "validate_trans",
>>> NULL } },
>>> +	    "setcheckreqprot", "read_policy", "validate_trans",
>>> "unshare",
>>> +	    NULL } },
>>>  	{ "process",
>>>  	  { "fork", "transition", "sigchld", "sigkill",
>>>  	    "sigstop", "signull", "signal", "ptrace", "getsched",
>>> "setsched",
>>> diff --git a/security/selinux/selinuxfs.c
>>> b/security/selinux/selinuxfs.c
>>> index a7e6bdb..dedb3cc9 100644
>>> --- a/security/selinux/selinuxfs.c
>>> +++ b/security/selinux/selinuxfs.c
>>> @@ -63,6 +63,7 @@ enum sel_inos {
>>>  	SEL_STATUS,	/* export current status using mmap()
>>> */
>>>  	SEL_POLICY,	/* allow userspace to read the in
>>> kernel policy */
>>>  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
>>> +	SEL_UNSHARE,	    /* unshare selinux namespace */
>>>  	SEL_INO_NEXT,	/* The next inode number to use */
>>>  };
>>>  
>>> @@ -321,6 +322,70 @@ static const struct file_operations
>>> sel_disable_ops = {
>>>  	.llseek		= generic_file_llseek,
>>>  };
>>>  
>>> +static ssize_t sel_write_unshare(struct file *file, const char
>>> __user *buf,
>>> +				 size_t count, loff_t *ppos)
>>> +
>>> +{
>>> +	struct selinux_fs_info *fsi = file_inode(file)->i_sb-
>>>> s_fs_info;
>>> +	struct selinux_ns *ns = fsi->ns;
>>> +	char *page;
>>> +	ssize_t length;
>>> +	bool set;
>>> +	int rc;
>>> +
>>> +	if (count >= PAGE_SIZE)
>>> +		return -ENOMEM;
>>> +
>>> +	/* No partial writes. */
>>> +	if (*ppos != 0)
>>> +		return -EINVAL;
>>> +
>>> +	rc = avc_has_perm(current_selinux_ns, current_sid(),
>>> +			  SECINITSID_SECURITY, SECCLASS_SECURITY,
>>> +			  SECURITY__UNSHARE, NULL);
>>> +	if (rc)
>>> +		return rc;
>>> +
>>> +	page = memdup_user_nul(buf, count);
>>> +	if (IS_ERR(page))
>>> +		return PTR_ERR(page);
>>> +
>>> +	length = -EINVAL;
>>> +	if (kstrtobool(page, &set))
>>> +		goto out;
>>> +
>>> +	if (set) {
>>> +		struct cred *cred = prepare_creds();
>>> +		struct task_security_struct *tsec;
>>> +
>>> +		if (!cred) {
>>> +			length = -ENOMEM;
>>> +			goto out;
>>> +		}
>>> +		tsec = cred->security;
>>> +		if (selinux_ns_create(ns, &tsec->ns)) {
>>> +			abort_creds(cred);
>>> +			length = -ENOMEM;
>>> +			goto out;
>>> +		}
>>> +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
>>> +		tsec->exec_sid = tsec->create_sid = tsec-
>>>> keycreate_sid =
>>> +			tsec->sockcreate_sid = SECSID_NULL;
>>> +		tsec->parent_cred = get_current_cred();
>>> +		commit_creds(cred);
>>> +	}
>>> +
>>> +	length = count;
>>> +out:
>>> +	kfree(page);
>>> +	return length;
>>> +}
>>> +
>>> +static const struct file_operations sel_unshare_ops = {
>>> +	.write		= sel_write_unshare,
>>> +	.llseek		= generic_file_llseek,
>>> +};
>>> +
>>>  static ssize_t sel_read_policyvers(struct file *filp, char __user
>>> *buf,
>>>  				   size_t count, loff_t *ppos)
>>>  {
>>> @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block
>>> *sb, void *data, int silent)
>>>  		[SEL_POLICY] = {"policy", &sel_policy_ops,
>>> S_IRUGO},
>>>  		[SEL_VALIDATE_TRANS] = {"validatetrans",
>>> &sel_transition_ops,
>>>  					S_IWUGO},
>>> +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops,
>>> 0222},
>>>  		/* last one */ {""}
>>>  	};
>>>  
>>
>> .


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-02 15:58 ` [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace Stephen Smalley
@ 2017-10-05  5:47   ` Serge E. Hallyn
  2017-10-05 14:06     ` Stephen Smalley
  2017-10-06  1:07   ` James Morris
  1 sibling, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2017-10-05  5:47 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux

On Mon, Oct 02, 2017 at 11:58:19AM -0400, Stephen Smalley wrote:
> The selinux netlink socket is used to notify userspace of changes to
> the enforcing mode and policy reloads.  At present, these notifications
> are always sent to the initial network namespace.  In order to support
> multiple selinux namespaces, each with its own enforcing mode and
> policy, we need to create and use a separate selinux netlink socket
> for each network namespace.

...

> +static int __init selnl_init(void)
> +{
> +	if (register_pernet_subsys(&selnl_net_ops))
> +		panic("Could not register selinux netlink operations\n");
>  	return 0;
>  }

This doesn't seem right to me.  If the socket is only used to send
notifications to userspace, then every net_ns doesn't need a socket,
only the first netns that the selinux ns was associated, right?

So long as there is a way to find the netns to which an selinux ns
is tied, a userspace logger could even setns into that netns to listen
for updates, if it wasn't certain to be in the right ns when it ran.

Otherwise (I haven't peeked ahead) you'll have to keep the *list* of
net_ns which live in a given selinuxfs and copy all messages to all of
those namespaces?

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-05  5:47   ` Serge E. Hallyn
@ 2017-10-05 14:06     ` Stephen Smalley
  2017-10-05 14:11       ` Stephen Smalley
  2017-10-29  3:16       ` Serge E. Hallyn
  0 siblings, 2 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-05 14:06 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: selinux, James Morris, Paul Moore

On Thu, 2017-10-05 at 00:47 -0500, Serge E. Hallyn wrote:
> On Mon, Oct 02, 2017 at 11:58:19AM -0400, Stephen Smalley wrote:
> > The selinux netlink socket is used to notify userspace of changes
> > to
> > the enforcing mode and policy reloads.  At present, these
> > notifications
> > are always sent to the initial network namespace.  In order to
> > support
> > multiple selinux namespaces, each with its own enforcing mode and
> > policy, we need to create and use a separate selinux netlink socket
> > for each network namespace.
> 
> ...
> 
> > +static int __init selnl_init(void)
> > +{
> > +	if (register_pernet_subsys(&selnl_net_ops))
> > +		panic("Could not register selinux netlink
> > operations\n");
> >  	return 0;
> >  }
> 
> This doesn't seem right to me.  If the socket is only used to send
> notifications to userspace, then every net_ns doesn't need a socket,
> only the first netns that the selinux ns was associated, right?

What does "the first netns that the selinux ns was associated" mean?
We could unshare them in any order; in the sample command sequence I
gave in the patch description for "selinux: add a selinuxfs interface
to unshare selinux namespace", I unshared the SELinux namespace first,
then the network namespace, but we could just as easily do it in the
reverse order (or at the same time if unshare(2) supported that).  So
you can't assume that the network namespace in which you are running at
the time you unshare selinux namespace is the right one, nor that the
first unshare of the network namespace after unsharing the selinux
namespace is the right one (not that we even have a way to catch that
currently).

> So long as there is a way to find the netns to which an selinux ns
> is tied, a userspace logger could even setns into that netns to
> listen
> for updates, if it wasn't certain to be in the right ns when it ran.
> 
> Otherwise (I haven't peeked ahead) you'll have to keep the *list* of
> net_ns which live in a given selinuxfs and copy all messages to all
> of
> those namespaces?

No, we only deliver to the network namespace of the process that
performed the setenforce or policy load (most commonly init, could also
be an admin running a management command or installing a policy rpm). 
We assume the container runtime properly handles unsharing of the
mount, network, and selinux namespaces before launching the container
init.  A container process that subsequently unshares its network
namespace won't see notifications for any subsequent policy reloads or
setenforce calls.  I don't know if that will prove to be a problem in
practice.
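
For reference, a minimal sketch of a userspace listener for these
notifications, using the existing NETLINK_SELINUX uapi definitions.
The helper names selnl_open() and selnl_describe() are made up for
this example.  The socket is created in whatever network namespace the
caller is in, so it only sees what the kernel delivers to that
namespace.

```c
#include <linux/netlink.h>
#include <linux/selinux_netlink.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open and bind a NETLINK_SELINUX socket in the caller's current netns. */
static int selnl_open(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_SELINUX);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = SELNL_GRP_AVC;	/* multicast group used for notifications */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Render one received message as text.
 * Returns 0 on success, -1 for a truncated or unrecognized message.
 */
static int selnl_describe(const struct nlmsghdr *nlh, char *out, size_t outlen)
{
	switch (nlh->nlmsg_type) {
	case SELNL_MSG_SETENFORCE: {
		const struct selnl_msg_setenforce *m = NLMSG_DATA(nlh);

		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*m)))
			return -1;
		snprintf(out, outlen, "setenforce %d", m->val);
		return 0;
	}
	case SELNL_MSG_POLICYLOAD: {
		const struct selnl_msg_policyload *m = NLMSG_DATA(nlh);

		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*m)))
			return -1;
		snprintf(out, outlen, "policyload seqno %u", m->seqno);
		return 0;
	}
	default:
		return -1;
	}
}
```

A caller would recv() on the descriptor returned by selnl_open() and
pass each nlmsghdr to selnl_describe().  A process that has moved to a
different network namespace than the one doing the setenforce or
policy load would simply never receive anything on this socket.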

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-05 14:06     ` Stephen Smalley
@ 2017-10-05 14:11       ` Stephen Smalley
  2017-10-29  3:16       ` Serge E. Hallyn
  1 sibling, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-05 14:11 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: selinux, James Morris, Paul Moore

On Thu, 2017-10-05 at 10:06 -0400, Stephen Smalley wrote:
> On Thu, 2017-10-05 at 00:47 -0500, Serge E. Hallyn wrote:
> > On Mon, Oct 02, 2017 at 11:58:19AM -0400, Stephen Smalley wrote:
> > > The selinux netlink socket is used to notify userspace of changes
> > > to
> > > the enforcing mode and policy reloads.  At present, these
> > > notifications
> > > are always sent to the initial network namespace.  In order to
> > > support
> > > multiple selinux namespaces, each with its own enforcing mode and
> > > policy, we need to create and use a separate selinux netlink
> > > socket
> > > for each network namespace.
> > 
> > ...
> > 
> > > +static int __init selnl_init(void)
> > > +{
> > > +	if (register_pernet_subsys(&selnl_net_ops))
> > > +		panic("Could not register selinux netlink
> > > operations\n");
> > >  	return 0;
> > >  }
> > 
> > This doesn't seem right to me.  If the socket is only used to send
> > notifications to userspace, then every net_ns doesn't need a
> > socket,
> > only the first netns that the selinux ns was associated, right?
> 
> What does "the first netns that the selinux ns was associated" mean?
> We could unshare them in any order; in the sample command sequence I
> gave in the patch description for "selinux: add a selinuxfs interface
> to unshare selinux namespace", I unshared the SELinux namespace
> first,
> then the network namespace, but we could just as easily do it in the
> reverse order (or at the same time if unshare(2) supported that).  So
> you can't assume that the network namespace in which you are running
> at
> the time you unshare selinux namespace is the right one, nor that the
> first unshare of the network namespace after unsharing the selinux
> namespace is the right one (not that we even have a way to catch that
> currently).
> 
> > So long as there is a way to find the netns to which an selinux ns
> > is tied, a userspace logger could even setns into that netns to
> > listen
> > for updates, if it wasn't certain to be in the right ns when it
> > ran.
> > 
> > Otherwise (I haven't peeked ahead) you'll have to keep the *list*
> > of
> > net_ns which live in a given selinuxfs and copy all messages to all
> > of
> > those namespaces?
> 
> No, we only deliver to the network namespace of the process that
> performed the setenforce or policy load (most commonly init, could
> also
> be an admin running a management command or installing a policy
> rpm). 
> We assume the container runtime properly handles unsharing of the
> mount, network, and selinux namespaces before launching the container
> init.  A container process that subsequently unshares its network
> namespace won't see notifications for any subsequent policy reloads
> or
> setenforce calls.  I don't know if that will prove to be a problem in
> practice.

It should be noted however that this wouldn't be a regression, since
today the netlink notifications are only delivered to the init network
namespace.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-02 15:58 ` [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
  2017-10-02 23:56   ` Casey Schaufler
@ 2017-10-05 15:27   ` Stephen Smalley
  2017-10-05 15:49     ` Stephen Smalley
  2017-10-09  1:52     ` James Morris
  1 sibling, 2 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-05 15:27 UTC (permalink / raw)
  To: selinux

On Mon, 2017-10-02 at 11:58 -0400, Stephen Smalley wrote:
> Provide a userspace API to unshare the selinux namespace.
> Currently implemented via a selinuxfs node. This could be
> coupled with unsharing of other namespaces (e.g.  mount namespace,
> network namespace) that will always be needed or left independent.
> Don't get hung up on the interface itself, it is just to allow
> experimentation and testing.
> 
> Sample usage:
> echo 1 > /sys/fs/selinux/unshare
> unshare -m -n
> umount /sys/fs/selinux
> mount -t selinuxfs none /sys/fs/selinux
> load_policy
> getenforce
> id
> echo $$

For added fun, you can do the following after unsharing and loading a
policy into your namespace above:
# Transition from kernel context to an unconfined context.
runcon unconfined_u:unconfined_r:unconfined_t:s0:c0.c1023 /bin/bash
# Allow use of file descriptors inherited from the parent namespace, e.g. the pty.
cat <<EOF > allowunlabeledfd.cil
(allow domain unlabeled_t (fd (use)))
EOF
semodule -i allowunlabeledfd.cil
# Switch namespace to enforcing mode
setenforce 1
# Run the selinux testsuite
cd /path/to/selinux-testsuite
make test

inet_socket test failures are expected due to running in a non-init
network namespace; they don't work even without unsharing the selinux
namespace.

> 
> The above will show that the process now views itself as running in the
> kernel domain in permissive mode, as would be the case at boot.
> From a different shell on the host system, running ps -eZ or
> cat /proc/<pid>/attr/current will show that the process that
> unshared its selinux namespace is still running in its original
> context in the initial namespace, and getenforce will show that
> the initial namespace remains enforcing.  Enforcing mode or policy
> changes in the child will not affect the parent.
> 
> This is not yet safe; do not use on production systems.
> Known issues include at least the following items:
> 
> * The policy loading code has not been thoroughly audited
> and hardened for use by unprivileged code, both with respect to
> ensuring that the policy is internally consistent and restricting
> the range of values used from the policy as loop bounds and memory
> allocation sizes to sane limits.
> 
> * The SELinux hook functions have not been modified to be
> namespace-aware, so the hooks only perform checking against the
> current namespace.  Thus, unsharing allows the process to escape
> confinement by the parent.  Fixing this requires updating each hook to
> perform its processing on the current namespace and all of its ancestors
> up to the init namespace.
> 
> * Some of the hook functions can be called outside of process context
> (e.g. task_kill, send_sigiotask, network input/forward) and should not use
> the current task's selinux namespace. These hooks need to be updated to
> obtain the proper selinux namespace to use instead from the caller or
> cached in a suitable data structure (e.g. the file or sock security
> structures).
> 
> * There are a number of issues with the inode and superblock security blob
> handling for multiple namespaces; see those commits for more details.
> 
> * Only a subset of object security blobs have been updated to
> be namespace-aware and support multiple namespaces.  The ones that
> have not yet been updated could end up performing permission checks or
> other operations on SIDs created in a different selinux namespace.
> 
> * The network SID caches (netif, netnode, netport) have not yet
> been instantiated per selinux namespace, unlike the AVC and SS.
> 
> * There is no way currently to restrict or bound nesting of
> namespaces; if you allow it to a domain in the init namespace,
> then that domain can in turn unshare to arbitrary depths and can
> grant the same to any domain in its own policy.  Related to this
> is the fact that there is no way to control resource usage due to
> selinux namespaces and they can be substantial (per-namespace
> policydb, sidtab, AVC, etc).
> 
> * SIDs may be cached by audit and networking code and in external
> kernel data structures and used later, potentially in a different
> selinux namespace than the one in which the SID was originally
> created.
> 
> * No doubt other things I'm forgetting or haven't thought of.
> Use at your own risk.
> 
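
To make the second item above concrete, here is a toy userspace model
of a check that consults the current namespace and each ancestor up to
the init namespace.  This is not kernel code: struct toy_ns,
toy_has_perm() and the example policies are invented for illustration,
and the sketch glosses over the fact that the same task or object would
need a separate SID in each namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for a selinux namespace (hypothetical, not from the
 * patch series): a parent pointer plus a policy decision callback.
 */
struct toy_ns {
	const struct toy_ns *parent;	/* NULL for the init namespace */
	bool (*allows)(int ssid, int tsid, int perm);
};

/*
 * Grant only if every namespace from `ns` up to the init namespace
 * grants.  Checking `ns` alone would let a child namespace with a lax
 * policy escape its parent's confinement.
 */
bool toy_has_perm(const struct toy_ns *ns, int ssid, int tsid, int perm)
{
	assert(ns != NULL);

	for (; ns != NULL; ns = ns->parent) {
		if (!ns->allows(ssid, tsid, perm))
			return false;
	}
	return true;
}

/* Example policy: allow everything (e.g. a freshly unshared child). */
bool toy_allow_all(int ssid, int tsid, int perm)
{
	(void)ssid;
	(void)tsid;
	(void)perm;
	return true;
}

/* Example policy: deny permission number 7, allow everything else. */
bool toy_deny_perm_7(int ssid, int tsid, int perm)
{
	(void)ssid;
	(void)tsid;
	return perm != 7;
}
```

With an init namespace using toy_deny_perm_7 and a child using
toy_allow_all, permission 7 stays denied in the child even though the
child's own policy would grant it.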
> Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
> ---
>  security/selinux/include/classmap.h |  3 +-
>  security/selinux/selinuxfs.c        | 66 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
> index 35ffb29..82c8f9c 100644
> --- a/security/selinux/include/classmap.h
> +++ b/security/selinux/include/classmap.h
> @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
>  	  { "compute_av", "compute_create", "compute_member",
>  	    "check_context", "load_policy", "compute_relabel",
>  	    "compute_user", "setenforce", "setbool", "setsecparam",
> -	    "setcheckreqprot", "read_policy", "validate_trans", NULL } },
> +	    "setcheckreqprot", "read_policy", "validate_trans", "unshare",
> +	    NULL } },
>  	{ "process",
>  	  { "fork", "transition", "sigchld", "sigkill",
>  	    "sigstop", "signull", "signal", "ptrace", "getsched", "setsched",
> diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
> index a7e6bdb..dedb3cc9 100644
> --- a/security/selinux/selinuxfs.c
> +++ b/security/selinux/selinuxfs.c
> @@ -63,6 +63,7 @@ enum sel_inos {
>  	SEL_STATUS,	/* export current status using mmap() */
>  	SEL_POLICY,	/* allow userspace to read the in kernel policy */
>  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
> +	SEL_UNSHARE,	    /* unshare selinux namespace */
>  	SEL_INO_NEXT,	/* The next inode number to use */
>  };
>  
> @@ -321,6 +322,70 @@ static const struct file_operations sel_disable_ops = {
>  	.llseek		= generic_file_llseek,
>  };
>  
> +static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
> +				 size_t count, loff_t *ppos)
> +
> +{
> +	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
> +	struct selinux_ns *ns = fsi->ns;
> +	char *page;
> +	ssize_t length;
> +	bool set;
> +	int rc;
> +
> +	if (count >= PAGE_SIZE)
> +		return -ENOMEM;
> +
> +	/* No partial writes. */
> +	if (*ppos != 0)
> +		return -EINVAL;
> +
> +	rc = avc_has_perm(current_selinux_ns, current_sid(),
> +			  SECINITSID_SECURITY, SECCLASS_SECURITY,
> +			  SECURITY__UNSHARE, NULL);
> +	if (rc)
> +		return rc;
> +
> +	page = memdup_user_nul(buf, count);
> +	if (IS_ERR(page))
> +		return PTR_ERR(page);
> +
> +	length = -EINVAL;
> +	if (kstrtobool(page, &set))
> +		goto out;
> +
> +	if (set) {
> +		struct cred *cred = prepare_creds();
> +		struct task_security_struct *tsec;
> +
> +		if (!cred) {
> +			length = -ENOMEM;
> +			goto out;
> +		}
> +		tsec = cred->security;
> +		if (selinux_ns_create(ns, &tsec->ns)) {
> +			abort_creds(cred);
> +			length = -ENOMEM;
> +			goto out;
> +		}
> +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
> +		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
> +			tsec->sockcreate_sid = SECSID_NULL;
> +		tsec->parent_cred = get_current_cred();
> +		commit_creds(cred);
> +	}
> +
> +	length = count;
> +out:
> +	kfree(page);
> +	return length;
> +}
> +
> +static const struct file_operations sel_unshare_ops = {
> +	.write		= sel_write_unshare,
> +	.llseek		= generic_file_llseek,
> +};
> +
>  static ssize_t sel_read_policyvers(struct file *filp, char __user *buf,
>  				   size_t count, loff_t *ppos)
>  {
> @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
>  		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUGO},
>  		[SEL_VALIDATE_TRANS] = {"validatetrans", &sel_transition_ops,
>  					S_IWUGO},
> +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops, 0222},
>  		/* last one */ {""}
>  	};
>  


* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-05 15:27   ` Stephen Smalley
@ 2017-10-05 15:49     ` Stephen Smalley
  2017-10-05 17:04       ` Stephen Smalley
  2017-10-09  1:52     ` James Morris
  1 sibling, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-05 15:49 UTC (permalink / raw)
  To: selinux

On Thu, 2017-10-05 at 11:27 -0400, Stephen Smalley wrote:
> On Mon, 2017-10-02 at 11:58 -0400, Stephen Smalley wrote:
> > Provide a userspace API to unshare the selinux namespace.
> > Currently implemented via a selinuxfs node. This could be
> > coupled with unsharing of other namespaces (e.g.  mount namespace,
> > network namespace) that will always be needed or left independent.
> > Don't get hung up on the interface itself, it is just to allow
> > experimentation and testing.
> > 
> > Sample usage:
> > echo 1 > /sys/fs/selinux/unshare
> > unshare -m -n
> > umount /sys/fs/selinux
> > mount -t selinuxfs none /sys/fs/selinux
> > load_policy
> > getenforce
> > id
> > echo $$
> 
> For added fun, you can do the following after unsharing and loading a
> policy into your namespace above:
> # Transition from kernel context to an unconfined context.
> runcon unconfined_u:unconfined_u:unconfined_t:s0:c0.c1023 /bin/bash
> # Allow use of file descriptors inherited from the parent namespace, e.g. the pty.
> cat <<EOF > allowunlabeledfd.cil
> (allow domain unlabeled_t (fd (use)))
> EOF
> semodule -i allowunlabeledfd.cil

Also:
restorecon -R /dev
to fix up the /dev node contexts in your namespace.

> # Switch namespace to enforcing mode
> setenforce 1
> # Run the selinux testsuite
> cd /path/to/selinux-testsuite
> make test
> 
> inet_socket test failures are expected due to running in a non-init
> network namespace; they don't work even without unsharing the selinux
> namespace.
> 
> > 
> > The above will show that the process now views itself as running in the
> > kernel domain in permissive mode, as would be the case at boot.
> > From a different shell on the host system, running ps -eZ or
> > cat /proc/<pid>/attr/current will show that the process that
> > unshared its selinux namespace is still running in its original
> > context in the initial namespace, and getenforce will show that
> > the initial namespace remains enforcing.  Enforcing mode or policy
> > changes in the child will not affect the parent.
> > 
> > This is not yet safe; do not use on production systems.
> > Known issues include at least the following items:
> > 
> > * The policy loading code has not been thoroughly audited
> > and hardened for use by unprivileged code, both with respect to
> > ensuring that the policy is internally consistent and restricting
> > the range of values used from the policy as loop bounds and memory
> > allocation sizes to sane limits.
> > 
> > * The SELinux hook functions have not been modified to be
> > namespace-aware, so the hooks only perform checking against the
> > current namespace.  Thus, unsharing allows the process to escape
> > confinement by the parent.  Fixing this requires updating each hook
> > to
> > perform its processing on the current namespace and all of its
> > ancestors
> > up to the init namespace.
> > 
> > * Some of the hook functions can be called outside of process
> > context
> > (e.g. task_kill, send_sigiotask, network input/forward) and should
> > not use
> > the current task's selinux namespace. These hooks need to be
> > updated
> > to
> > obtain the proper selinux namespace to use instead from the caller
> > or
> > cached in a suitable data structure (e.g. the file or sock security
> > structures).
> > 
> > * There are a number of issues with the inode and superblock security
> > blob handling for multiple namespaces; see those commits for more
> > details.
> > 
> > * Only a subset of object security blobs have been updated to
> > be namespace-aware and support multiple namespaces.  The ones that
> > have not yet been updated could end up performing permission checks
> > or
> > other operations on SIDs created in a different selinux namespace.
> > 
> > * The network SID caches (netif, netnode, netport) have not yet
> > been instantiated per selinux namespace, unlike the AVC and SS.
> > 
> > * There is no way currently to restrict or bound nesting of
> > namespaces; if you allow it to a domain in the init namespace,
> > then that domain can in turn unshare to arbitrary depths and can
> > grant the same to any domain in its own policy.  Related to this
> > is the fact that there is no way to control resource usage due to
> > selinux namespaces and they can be substantial (per-namespace
> > policydb, sidtab, AVC, etc).
> > 
> > * SIDs may be cached by audit and networking code and in external
> > kernel data structures and used later, potentially in a different
> > selinux namespace than the one in which the SID was originally
> > created.
> > 
> > * No doubt other things I'm forgetting or haven't thought of.
> > Use at your own risk.
> > 
> > Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
> > ---
> >  security/selinux/include/classmap.h |  3 +-
> >  security/selinux/selinuxfs.c        | 66
> > +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 68 insertions(+), 1 deletion(-)
> > 
> > diff --git a/security/selinux/include/classmap.h
> > b/security/selinux/include/classmap.h
> > index 35ffb29..82c8f9c 100644
> > --- a/security/selinux/include/classmap.h
> > +++ b/security/selinux/include/classmap.h
> > @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] = {
> >  	  { "compute_av", "compute_create", "compute_member",
> >  	    "check_context", "load_policy", "compute_relabel",
> >  	    "compute_user", "setenforce", "setbool",
> > "setsecparam",
> > -	    "setcheckreqprot", "read_policy", "validate_trans",
> > NULL
> > } },
> > +	    "setcheckreqprot", "read_policy", "validate_trans",
> > "unshare",
> > +	    NULL } },
> >  	{ "process",
> >  	  { "fork", "transition", "sigchld", "sigkill",
> >  	    "sigstop", "signull", "signal", "ptrace", "getsched",
> > "setsched",
> > diff --git a/security/selinux/selinuxfs.c
> > b/security/selinux/selinuxfs.c
> > index a7e6bdb..dedb3cc9 100644
> > --- a/security/selinux/selinuxfs.c
> > +++ b/security/selinux/selinuxfs.c
> > @@ -63,6 +63,7 @@ enum sel_inos {
> >  	SEL_STATUS,	/* export current status using mmap()
> > */
> >  	SEL_POLICY,	/* allow userspace to read the in
> > kernel
> > policy */
> >  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
> > +	SEL_UNSHARE,	    /* unshare selinux namespace */
> >  	SEL_INO_NEXT,	/* The next inode number to use */
> >  };
> >  
> > @@ -321,6 +322,70 @@ static const struct file_operations
> > sel_disable_ops = {
> >  	.llseek		= generic_file_llseek,
> >  };
> >  
> > +static ssize_t sel_write_unshare(struct file *file, const char
> > __user *buf,
> > +				 size_t count, loff_t *ppos)
> > +
> > +{
> > +	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
> > +	struct selinux_ns *ns = fsi->ns;
> > +	char *page;
> > +	ssize_t length;
> > +	bool set;
> > +	int rc;
> > +
> > +	if (count >= PAGE_SIZE)
> > +		return -ENOMEM;
> > +
> > +	/* No partial writes. */
> > +	if (*ppos != 0)
> > +		return -EINVAL;
> > +
> > +	rc = avc_has_perm(current_selinux_ns, current_sid(),
> > +			  SECINITSID_SECURITY, SECCLASS_SECURITY,
> > +			  SECURITY__UNSHARE, NULL);
> > +	if (rc)
> > +		return rc;
> > +
> > +	page = memdup_user_nul(buf, count);
> > +	if (IS_ERR(page))
> > +		return PTR_ERR(page);
> > +
> > +	length = -EINVAL;
> > +	if (kstrtobool(page, &set))
> > +		goto out;
> > +
> > +	if (set) {
> > +		struct cred *cred = prepare_creds();
> > +		struct task_security_struct *tsec;
> > +
> > +		if (!cred) {
> > +			length = -ENOMEM;
> > +			goto out;
> > +		}
> > +		tsec = cred->security;
> > +		if (selinux_ns_create(ns, &tsec->ns)) {
> > +			abort_creds(cred);
> > +			length = -ENOMEM;
> > +			goto out;
> > +		}
> > +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
> > +		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
> > +			tsec->sockcreate_sid = SECSID_NULL;
> > +		tsec->parent_cred = get_current_cred();
> > +		commit_creds(cred);
> > +	}
> > +
> > +	length = count;
> > +out:
> > +	kfree(page);
> > +	return length;
> > +}
> > +
> > +static const struct file_operations sel_unshare_ops = {
> > +	.write		= sel_write_unshare,
> > +	.llseek		= generic_file_llseek,
> > +};
> > +
> >  static ssize_t sel_read_policyvers(struct file *filp, char __user
> > *buf,
> >  				   size_t count, loff_t *ppos)
> >  {
> > @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct super_block
> > *sb, void *data, int silent)
> >  		[SEL_POLICY] = {"policy", &sel_policy_ops,
> > S_IRUGO},
> >  		[SEL_VALIDATE_TRANS] = {"validatetrans",
> > &sel_transition_ops,
> >  					S_IWUGO},
> > +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops,
> > 0222},
> >  		/* last one */ {""}
> >  	};
> >  


* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-05 15:49     ` Stephen Smalley
@ 2017-10-05 17:04       ` Stephen Smalley
  0 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-05 17:04 UTC (permalink / raw)
  To: selinux

On Thu, 2017-10-05 at 11:49 -0400, Stephen Smalley wrote:
> On Thu, 2017-10-05 at 11:27 -0400, Stephen Smalley wrote:
> > On Mon, 2017-10-02 at 11:58 -0400, Stephen Smalley wrote:
> > > Provide a userspace API to unshare the selinux namespace.
> > > Currently implemented via a selinuxfs node. This could be
> > > coupled with unsharing of other namespaces (e.g.  mount
> > > namespace,
> > > network namespace) that will always be needed or left
> > > independent.
> > > Don't get hung up on the interface itself, it is just to allow
> > > experimentation and testing.
> > > 
> > > Sample usage:
> > > echo 1 > /sys/fs/selinux/unshare
> > > unshare -m -n
> > > umount /sys/fs/selinux
> > > mount -t selinuxfs none /sys/fs/selinux
> > > load_policy
> > > getenforce
> > > id
> > > echo $$
> > 
> > For added fun, you can do the following after unsharing and loading
> > a
> > policy into your namespace above:
> > # Transition from kernel context to an unconfined context.
> > runcon unconfined_u:unconfined_u:unconfined_t:s0:c0.c1023 /bin/bash

That should be:
runcon unconfined_u:unconfined_r:unconfined_t:s0:c0.c1023 /bin/bash


> > # Allow use of file descriptors inherited from the parent namespace, e.g. the pty.
> > cat <<EOF > allowunlabeledfd.cil
> > (allow domain unlabeled_t (fd (use)))
> > EOF
> > semodule -i allowunlabeledfd.cil
> 
> Also:
> restorecon -R /dev
> to fix up the /dev node contexts in your namespace.
> 
> > # Switch namespace to enforcing mode
> > setenforce 1
> > # Run the selinux testsuite
> > cd /path/to/selinux-testsuite
> > make test
> > 
> > inet_socket test failures are expected due to running in a non-init
> > network namespace; they don't work even without unsharing the
> > selinux
> > namespace.
> > 
> > > 
> > > The above will show that the process now views itself as running in
> > > the kernel domain in permissive mode, as would be the case at boot.
> > > From a different shell on the host system, running ps -eZ or
> > > cat /proc/<pid>/attr/current will show that the process that
> > > unshared its selinux namespace is still running in its original
> > > context in the initial namespace, and getenforce will show that
> > > the initial namespace remains enforcing.  Enforcing mode or policy
> > > changes in the child will not affect the parent.
> > > 
> > > This is not yet safe; do not use on production systems.
> > > Known issues include at least the following items:
> > > 
> > > * The policy loading code has not been thoroughly audited
> > > and hardened for use by unprivileged code, both with respect to
> > > ensuring that the policy is internally consistent and restricting
> > > the range of values used from the policy as loop bounds and
> > > memory
> > > allocation sizes to sane limits.
> > > 
> > > * The SELinux hook functions have not been modified to be
> > > namespace-aware, so the hooks only perform checking against the
> > > current namespace.  Thus, unsharing allows the process to escape
> > > confinement by the parent.  Fixing this requires updating each
> > > hook
> > > to
> > > perform its processing on the current namespace and all of its
> > > ancestors
> > > up to the init namespace.
> > > 
> > > * Some of the hook functions can be called outside of process
> > > context
> > > (e.g. task_kill, send_sigiotask, network input/forward) and
> > > should
> > > not use
> > > the current task's selinux namespace. These hooks need to be
> > > updated
> > > to
> > > obtain the proper selinux namespace to use instead from the
> > > caller
> > > or
> > > cached in a suitable data structure (e.g. the file or sock
> > > security
> > > structures).
> > > 
> > > * There are a number of issues with the inode and superblock
> > > security blob handling for multiple namespaces; see those commits
> > > for more details.
> > > 
> > > * Only a subset of object security blobs have been updated to
> > > be namespace-aware and support multiple namespaces.  The ones
> > > that
> > > have not yet been updated could end up performing permission
> > > checks
> > > or
> > > other operations on SIDs created in a different selinux
> > > namespace.
> > > 
> > > * The network SID caches (netif, netnode, netport) have not yet
> > > been instantiated per selinux namespace, unlike the AVC and SS.
> > > 
> > > * There is no way currently to restrict or bound nesting of
> > > namespaces; if you allow it to a domain in the init namespace,
> > > then that domain can in turn unshare to arbitrary depths and can
> > > grant the same to any domain in its own policy.  Related to this
> > > is the fact that there is no way to control resource usage due to
> > > selinux namespaces and they can be substantial (per-namespace
> > > policydb, sidtab, AVC, etc).
> > > 
> > > * SIDs may be cached by audit and networking code and in external
> > > kernel data structures and used later, potentially in a different
> > > selinux namespace than the one in which the SID was originally
> > > created.
> > > 
> > > * No doubt other things I'm forgetting or haven't thought of.
> > > Use at your own risk.
> > > 
> > > Not-signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
> > > ---
> > >  security/selinux/include/classmap.h |  3 +-
> > >  security/selinux/selinuxfs.c        | 66
> > > +++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 68 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/security/selinux/include/classmap.h
> > > b/security/selinux/include/classmap.h
> > > index 35ffb29..82c8f9c 100644
> > > --- a/security/selinux/include/classmap.h
> > > +++ b/security/selinux/include/classmap.h
> > > @@ -39,7 +39,8 @@ struct security_class_mapping secclass_map[] =
> > > {
> > >  	  { "compute_av", "compute_create", "compute_member",
> > >  	    "check_context", "load_policy", "compute_relabel",
> > >  	    "compute_user", "setenforce", "setbool",
> > > "setsecparam",
> > > -	    "setcheckreqprot", "read_policy", "validate_trans",
> > > NULL
> > > } },
> > > +	    "setcheckreqprot", "read_policy", "validate_trans",
> > > "unshare",
> > > +	    NULL } },
> > >  	{ "process",
> > >  	  { "fork", "transition", "sigchld", "sigkill",
> > >  	    "sigstop", "signull", "signal", "ptrace",
> > > "getsched",
> > > "setsched",
> > > diff --git a/security/selinux/selinuxfs.c
> > > b/security/selinux/selinuxfs.c
> > > index a7e6bdb..dedb3cc9 100644
> > > --- a/security/selinux/selinuxfs.c
> > > +++ b/security/selinux/selinuxfs.c
> > > @@ -63,6 +63,7 @@ enum sel_inos {
> > >  	SEL_STATUS,	/* export current status using mmap()
> > > */
> > >  	SEL_POLICY,	/* allow userspace to read the in
> > > kernel
> > > policy */
> > >  	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
> > > +	SEL_UNSHARE,	    /* unshare selinux namespace */
> > >  	SEL_INO_NEXT,	/* The next inode number to use */
> > >  };
> > >  
> > > @@ -321,6 +322,70 @@ static const struct file_operations
> > > sel_disable_ops = {
> > >  	.llseek		= generic_file_llseek,
> > >  };
> > >  
> > > +static ssize_t sel_write_unshare(struct file *file, const char
> > > __user *buf,
> > > +				 size_t count, loff_t *ppos)
> > > +
> > > +{
> > > +	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
> > > +	struct selinux_ns *ns = fsi->ns;
> > > +	char *page;
> > > +	ssize_t length;
> > > +	bool set;
> > > +	int rc;
> > > +
> > > +	if (count >= PAGE_SIZE)
> > > +		return -ENOMEM;
> > > +
> > > +	/* No partial writes. */
> > > +	if (*ppos != 0)
> > > +		return -EINVAL;
> > > +
> > > +	rc = avc_has_perm(current_selinux_ns, current_sid(),
> > > +			  SECINITSID_SECURITY,
> > > SECCLASS_SECURITY,
> > > +			  SECURITY__UNSHARE, NULL);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	page = memdup_user_nul(buf, count);
> > > +	if (IS_ERR(page))
> > > +		return PTR_ERR(page);
> > > +
> > > +	length = -EINVAL;
> > > +	if (kstrtobool(page, &set))
> > > +		goto out;
> > > +
> > > +	if (set) {
> > > +		struct cred *cred = prepare_creds();
> > > +		struct task_security_struct *tsec;
> > > +
> > > +		if (!cred) {
> > > +			length = -ENOMEM;
> > > +			goto out;
> > > +		}
> > > +		tsec = cred->security;
> > > +		if (selinux_ns_create(ns, &tsec->ns)) {
> > > +			abort_creds(cred);
> > > +			length = -ENOMEM;
> > > +			goto out;
> > > +		}
> > > +		tsec->osid = tsec->sid = SECINITSID_KERNEL;
> > > +		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
> > > +			tsec->sockcreate_sid = SECSID_NULL;
> > > +		tsec->parent_cred = get_current_cred();
> > > +		commit_creds(cred);
> > > +	}
> > > +
> > > +	length = count;
> > > +out:
> > > +	kfree(page);
> > > +	return length;
> > > +}
> > > +
> > > +static const struct file_operations sel_unshare_ops = {
> > > +	.write		= sel_write_unshare,
> > > +	.llseek		= generic_file_llseek,
> > > +};
> > > +
> > >  static ssize_t sel_read_policyvers(struct file *filp, char
> > > __user
> > > *buf,
> > >  				   size_t count, loff_t *ppos)
> > >  {
> > > @@ -1923,6 +1988,7 @@ static int sel_fill_super(struct
> > > super_block
> > > *sb, void *data, int silent)
> > >  		[SEL_POLICY] = {"policy", &sel_policy_ops,
> > > S_IRUGO},
> > >  		[SEL_VALIDATE_TRANS] = {"validatetrans",
> > > &sel_transition_ops,
> > >  					S_IWUGO},
> > > +		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops,
> > > 0222},
> > >  		/* last one */ {""}
> > >  	};
> > >  


* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-02 15:58 ` [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace Stephen Smalley
  2017-10-05  5:47   ` Serge E. Hallyn
@ 2017-10-06  1:07   ` James Morris
  2017-10-06 13:21     ` Stephen Smalley
  1 sibling, 1 reply; 39+ messages in thread
From: James Morris @ 2017-10-06  1:07 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, paul

On Mon, 2 Oct 2017, Stephen Smalley wrote:

> This change presumes that one will always unshare the network namespace
> when unsharing a new selinux namespace (the reverse is not required).
> Otherwise, the same inconsistencies could arise between the notifications
> and the relevant policy.  At present, nothing enforces this guarantee
> at the kernel level; it is left up to userspace (e.g. container runtimes).
> It is an open question as to whether this is a good idea or whether
> unsharing of the selinux namespace should automatically unshare the network
> namespace.  

What about logging a kernel warning if just SELinux is unshared?

I think we want to avoid surprising the user by unsharing things for them, 
and yes, it will be possible to mess your system up if you configure it 
badly.

> However, keeping them separate is consistent with the handling
> of the mount namespace currently, which also should be unshared so that
> a private selinuxfs mount can be created.

Right, and in practice this will always be automated and abstracted away
from the end user.


-- 
James Morris
<jmorris@namei.org>


* Re: [RFC 05/10] selinux: support per-task/cred selinux namespace
  2017-10-02 15:58 ` [RFC 05/10] selinux: support per-task/cred selinux namespace Stephen Smalley
@ 2017-10-06  1:14   ` James Morris
  2017-10-06 19:25     ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: James Morris @ 2017-10-06  1:14 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, paul

On Mon, 2 Oct 2017, Stephen Smalley wrote:

> An alternative would be to hang the selinux namespace off of the
> user namespace, which itself is associated with the cred.  This
> seems undesirable however since DAC and MAC are orthogonal, and
> there appear to be real use cases where one will want to use selinux
> namespaces without user namespaces and vice versa. 

Indeed, one Oracle use case is privileged containers, and for this MAC
must remain separate.



-- 
James Morris
<jmorris@namei.org>


* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-06  1:07   ` James Morris
@ 2017-10-06 13:21     ` Stephen Smalley
  2017-10-06 19:24       ` Serge E. Hallyn
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-06 13:21 UTC (permalink / raw)
  To: James Morris; +Cc: selinux

On Fri, 2017-10-06 at 12:07 +1100, James Morris wrote:
> On Mon, 2 Oct 2017, Stephen Smalley wrote:
> 
> > This change presumes that one will always unshare the network
> > namespace
> > when unsharing a new selinux namespace (the reverse is not
> > required).
> > Otherwise, the same inconsistencies could arise between the
> > notifications
> > and the relevant policy.  At present, nothing enforces this
> > guarantee
> > at the kernel level; it is left up to userspace (e.g. container
> > runtimes).
> > It is an open question as to whether this is a good idea or whether
> > unsharing of the selinux namespace should automatically unshare the
> > network
> > namespace.  
> 
> What about logging a kernel warning if just SELinux is unshared?

As with Serge's suggestion, the problem is that one can unshare them in
any order, and potentially with intervening steps to set up the
namespace or prepare for doing so, so there is no obvious point where
you could detect and issue such a warning.  Without an interface that
allows unsharing them both simultaneously (either unshare(2)-based or
selinuxfs-based), I don't think we can provide such a warning.

I don't think it will prove to be a problem in practice however;
container runtimes just need to do the right thing (and we can help
this by providing helpers in libselinux or the like).  The larger
concern is not that we'll forget to unshare the network namespace when
we unshare the selinux namespace, but that subsequent further unsharing
of the network namespace by itself could cause notifications to be
lost.  The two cases of concern are that a process unshares
its network namespace again (after the original unsharing of both
selinux namespace and network namespace for the container creation) and
subsequently:

1) Does not get any netlink notifications of setenforce or policy load
events for its selinux namespace. This is only an issue if a program
that uses the userspace AVC also unshares its network namespace or
otherwise is launched into its own network namespace separate from that
of its container.  And it isn't a regression, since before this change
notifications were only ever sent to the init network namespace, so
this change actually represents an improvement: processes can at
least get notifications when running in the container's network
namespace.

2) Sets enforcing mode or loads policy itself, in which case the
notification for its setenforce or load_policy will only go to its
network namespace and will not be received by other processes in the
same selinux namespace.  This is only an issue if a process running in
a separate network namespace from that of its container sets enforcing
mode or loads policy.  This seems unlikely to me, since such setting of
enforcing mode or loading of policy will conventionally be restricted
to a small set of privileged processes, such as the container init
process, administrator shells, and package installation/updates, and I
wouldn't expect them to run in a network namespace separate from that
of their container.
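
For reference, this is roughly how a userspace AVC style listener
subscribes to these notifications.  A minimal sketch, assuming the uapi
definitions in <linux/netlink.h> and <linux/selinux_netlink.h>;
selnl_open() and selnl_describe() are hypothetical names, not libselinux
API.  The socket belongs to whatever network namespace the caller is in
when it creates it, which is why the two cases above matter.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/selinux_netlink.h>

/*
 * Open a netlink socket subscribed to the SELinux AVC multicast group.
 * The socket is tied to the caller's current network namespace.
 * Returns the fd, or -1 with errno set.  (Hypothetical helper name.)
 */
int selnl_open(void)
{
	struct sockaddr_nl addr;
	int fd, saved;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_SELINUX);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = SELNL_GRP_AVC;

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/*
 * Render one received notification as text in `out`.
 * Returns 0 on success, -1 for a truncated or unknown message.
 * (Hypothetical helper name.)
 */
int selnl_describe(const struct nlmsghdr *nlh, char *out, size_t outlen)
{
	assert(nlh != NULL && out != NULL && outlen > 0);

	switch (nlh->nlmsg_type) {
	case SELNL_MSG_SETENFORCE: {
		const struct selnl_msg_setenforce *m = NLMSG_DATA(nlh);

		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*m)))
			return -1;
		snprintf(out, outlen, "setenforce %d", (int)m->val);
		return 0;
	}
	case SELNL_MSG_POLICYLOAD: {
		const struct selnl_msg_policyload *m = NLMSG_DATA(nlh);

		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*m)))
			return -1;
		snprintf(out, outlen, "policyload seqno %u",
			 (unsigned int)m->seqno);
		return 0;
	}
	default:
		return -1;
	}
}
```

A caller would recv() on the fd from selnl_open() and pass each
struct nlmsghdr in the buffer to selnl_describe().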

> 
> I think we want to avoid surprising the user by unsharing things for
> them, 
> and yes, it will be possible to mess your system up if you configure
> it 
> badly.
> 
> > However, keeping them separate is consistent with the handling
> > of the mount namespace currently, which also should be unshared so
> > that
> > a private selinuxfs mount can be created.
> 
> Right, and this will in practice always be automated and abstracted
> from 
> an end user pov.


* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-06 13:21     ` Stephen Smalley
@ 2017-10-06 19:24       ` Serge E. Hallyn
  2017-10-10 14:35         ` Stephen Smalley
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2017-10-06 19:24 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: James Morris, selinux

Quoting Stephen Smalley (sds@tycho.nsa.gov):
> On Fri, 2017-10-06 at 12:07 +1100, James Morris wrote:
> > On Mon, 2 Oct 2017, Stephen Smalley wrote:
> > 
> > > This change presumes that one will always unshare the network
> > > namespace
> > > when unsharing a new selinux namespace (the reverse is not
> > > required).
> > > Otherwise, the same inconsistencies could arise between the
> > > notifications
> > > and the relevant policy.  At present, nothing enforces this
> > > guarantee
> > > at the kernel level; it is left up to userspace (e.g. container
> > > runtimes).
> > > It is an open question as to whether this is a good idea or whether
> > > unsharing of the selinux namespace should automatically unshare the
> > > network
> > > namespace.  
> > 
> > What about logging a kernel warning if just SELinux is unshared?
> 
> As with Serge's suggestion, the problem is that one can unshare them in
> any order, and potentially with intervening steps to set up the

But is it going to stay that way?  I thought that was a temporary thing
while you test it out.

Because as it is it seems unacceptably loose.  A few questions:

(thinking as I type here - ok only the last one is the real question)

Assume I'm in my given netns.  I unshare just my selinuxns.  Does the
selinuxns have the authority to decide things about that netns?

If I unshare selinuxns+netns, then the selinuxns clearly "owns" the netns,
so the selinuxns has clear authority.  Likewise, when I unshare the
netns after the selinuxns, from that point on the selinuxns can be said
to have authority over the netns.

I think you want to keep it completely orthogonal, and I guess so long
as the parent selinuxns policy still applies it's ok.

So I unshare my selinuxns but not userns.  I don't have CAP_MAC_ADMIN
against the user_ns, so any selinux related changes will be denied.
Once I unshare userns, they will be allowed.

Ah, right.  Here's the real question.  Policy dictates file transition
rules.  How will those be namespaced?  Will only the init_selinux_ns be
allowed to specify a file context?  You can't let ns_capable(current_selinux_ns,
CAP_MAC_ADMIN) guide that safely, but capable(CAP_MAC_ADMIN) will be
very restrictive.  What's the plan there?  Sorry if it's spelled out
elsewhere.

-serge

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 05/10] selinux: support per-task/cred selinux namespace
  2017-10-06  1:14   ` James Morris
@ 2017-10-06 19:25     ` Serge E. Hallyn
  2017-10-08 22:08       ` James Morris
  0 siblings, 1 reply; 39+ messages in thread
From: Serge E. Hallyn @ 2017-10-06 19:25 UTC (permalink / raw)
  To: James Morris; +Cc: Stephen Smalley, selinux

Quoting James Morris (jmorris@namei.org):
> On Mon, 2 Oct 2017, Stephen Smalley wrote:
> 
> > An alternative would be to hang the selinux namespace off of the
> > user namespace, which itself is associated with the cred.  This
> > seems undesirable however since DAC and MAC are orthogonal, and
> > there appear to be real use cases where one will want to use selinux
> > namespaces without user namespaces and vice versa. 
> 
> Indeed, an Oracle use-case is for privileged containers and for this MAC 
> must remain separate.

Will that always be the case?  Is that to allow (selinux-confined) device
administration from containers?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 05/10] selinux: support per-task/cred selinux namespace
  2017-10-06 19:25     ` Serge E. Hallyn
@ 2017-10-08 22:08       ` James Morris
  0 siblings, 0 replies; 39+ messages in thread
From: James Morris @ 2017-10-08 22:08 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Stephen Smalley, selinux

On Fri, 6 Oct 2017, Serge E. Hallyn wrote:

> Quoting James Morris (jmorris@namei.org):
> > On Mon, 2 Oct 2017, Stephen Smalley wrote:
> > 
> > > An alternative would be to hang the selinux namespace off of the
> > > user namespace, which itself is associated with the cred.  This
> > > seems undesirable however since DAC and MAC are orthogonal, and
> > > there appear to be real use cases where one will want to use selinux
> > > namespaces without user namespaces and vice versa. 
> > 
> > Indeed, an Oracle use-case is for privileged containers and for this MAC 
> > must remain separate.
> 
> Will that always be the case?  Is that to allow (selinux-confined) device
> administration from containers?

It's to provide the user with a full OS experience generally.  It's not 
necessarily the only use-case, though.



-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-05 15:27   ` Stephen Smalley
  2017-10-05 15:49     ` Stephen Smalley
@ 2017-10-09  1:52     ` James Morris
       [not found]       ` <CAB9W1A2-PT8QU-md1s9fxhNg+Cv0C4Xu-i1w_q0XzQ+K9rsyAg@mail.gmail.com>
  1 sibling, 1 reply; 39+ messages in thread
From: James Morris @ 2017-10-09  1:52 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, Paul Moore

On Thu, 5 Oct 2017, Stephen Smalley wrote:

> inet_socket test failures are expected due to running in a non-init
> network namespace; they don't work even without unsharing the selinux
> namespace.

Do these results all look as expected?

Test Summary Report
-------------------
fdreceive/test         (Wstat: 0 Tests: 3 Failed: 1)
  Failed test:  3
inherit/test           (Wstat: 0 Tests: 3 Failed: 1)
  Failed test:  1
file/test              (Wstat: 0 Tests: 16 Failed: 1)
  Failed test:  8
bounds/test            (Wstat: 0 Tests: 24 Failed: 5)
  Failed tests:  3, 6, 12, 21, 23
mmap/test              (Wstat: 0 Tests: 46 Failed: 2)
  Failed tests:  9, 13
inet_socket/test       (Wstat: 3584 Tests: 33 Failed: 14)
  Failed tests:  1, 3, 5-6, 8, 16, 18, 20, 22, 25-26, 28
                30, 32
  Non-zero exit status: 14
overlay/test           (Wstat: 3072 Tests: 121 Failed: 12)
  Failed tests:  1, 25, 27, 39-40, 57, 63, 87, 89, 98-99
                116
  Non-zero exit status: 12
Files=51, Tests=485, 28 wallclock secs ( 0.60 usr  0.13 sys +  2.67 cusr  
3.76 csys =  7.16 CPU)
Result: FAIL
Failed 7/51 test programs. 36/485 subtests failed.


-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 03/10] selinux: move the AVC into the selinux namespace
  2017-10-02 15:58 ` [RFC 03/10] selinux: move the AVC into the selinux namespace Stephen Smalley
@ 2017-10-09  3:10   ` James Morris
  2017-10-10 14:35     ` Stephen Smalley
  0 siblings, 1 reply; 39+ messages in thread
From: James Morris @ 2017-10-09  3:10 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, paul

On Mon, 2 Oct 2017, Stephen Smalley wrote:

> Move the access vector cache (AVC) into the selinux namespace
> structure and pass it explicitly to all AVC functions.  The
> AVC private state is encapsulated in a selinux_avc structure
> that is allocated and freed by the AVC during selinux namespace
> creation and destruction.
> 
> This is necessary to support multiple selinux namespaces since
> the AVC caches state (e.g. SIDs, policy sequence number) that
> is maintained and provided by the security server on a per-namespace
> basis.

What about per-namespace AVC stats?

At the moment, it seems that the stats for all AVCs are combined in the 
existing percpu stats, which could be confusing for someone trying to tune 
the host or a guest, as the hash stats & config are per-namespace.  Also, 
a user likely wants to see only their own AVC stats generally.


-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
       [not found]       ` <CAB9W1A2-PT8QU-md1s9fxhNg+Cv0C4Xu-i1w_q0XzQ+K9rsyAg@mail.gmail.com>
@ 2017-10-09 13:53         ` Stephen Smalley
  2017-10-09 23:04           ` James Morris
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2017-10-09 13:53 UTC (permalink / raw)
  To: James Morris; +Cc: Stephen D. Smalley, selinux

On Oct 8, 2017 9:54 PM, "James Morris" <jmorris@namei.org> wrote:

> On Thu, 5 Oct 2017, Stephen Smalley wrote:
>
> > inet_socket test failures are expected due to running in a non-init
> > network namespace; they don't work even without unsharing the selinux
> > namespace.
>
> Do these results all look as expected?

No, that suggests that you either didn't insert the policy module allowing
access to unlabeled fds or you didn't run restorecon -R /dev before running
the tests. The only expected failures are the inet socket ones.

> Test Summary Report
> -------------------
> fdreceive/test         (Wstat: 0 Tests: 3 Failed: 1)
>   Failed test:  3
> inherit/test           (Wstat: 0 Tests: 3 Failed: 1)
>   Failed test:  1
> file/test              (Wstat: 0 Tests: 16 Failed: 1)
>   Failed test:  8
> bounds/test            (Wstat: 0 Tests: 24 Failed: 5)
>   Failed tests:  3, 6, 12, 21, 23
> mmap/test              (Wstat: 0 Tests: 46 Failed: 2)
>   Failed tests:  9, 13
> inet_socket/test       (Wstat: 3584 Tests: 33 Failed: 14)
>   Failed tests:  1, 3, 5-6, 8, 16, 18, 20, 22, 25-26, 28
>                 30, 32
>   Non-zero exit status: 14
> overlay/test           (Wstat: 3072 Tests: 121 Failed: 12)
>   Failed tests:  1, 25, 27, 39-40, 57, 63, 87, 89, 98-99
>                 116
>   Non-zero exit status: 12
> Files=51, Tests=485, 28 wallclock secs ( 0.60 usr  0.13 sys +  2.67 cusr
> 3.76 csys =  7.16 CPU)
> Result: FAIL
> Failed 7/51 test programs. 36/485 subtests failed.
>
> --
> James Morris
> <jmorris@namei.org>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2017-10-09 13:53         ` Stephen Smalley
@ 2017-10-09 23:04           ` James Morris
  0 siblings, 0 replies; 39+ messages in thread
From: James Morris @ 2017-10-09 23:04 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Stephen D. Smalley, selinux

On Mon, 9 Oct 2017, Stephen Smalley wrote:

> On Oct 8, 2017 9:54 PM, "James Morris" <jmorris@namei.org> wrote:
> 
> On Thu, 5 Oct 2017, Stephen Smalley wrote:
> 
> > inet_socket test failures are expected due to running in a non-init
> > network namespace; they don't work even without unsharing the selinux
> > namespace.
> 
> Do these results all look as expected?
> 
> 
> No, that suggests that you either didn't insert the policy module allowing
> access to unlabeled fds or you didn't run restorecon -R /dev before running
> the tests. The only expected failures are the inet socket ones.
> 

Looking better now -- I think it was the restorecon.


-- 
James Morris
<jmorris@namei.org>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-06 19:24       ` Serge E. Hallyn
@ 2017-10-10 14:35         ` Stephen Smalley
  0 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-10 14:35 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: selinux

On Fri, 2017-10-06 at 14:24 -0500, Serge E. Hallyn wrote:
> Quoting Stephen Smalley (sds@tycho.nsa.gov):
> > On Fri, 2017-10-06 at 12:07 +1100, James Morris wrote:
> > > On Mon, 2 Oct 2017, Stephen Smalley wrote:
> > > 
> > > > This change presumes that one will always unshare the network
> > > > namespace
> > > > when unsharing a new selinux namespace (the reverse is not
> > > > required).
> > > > Otherwise, the same inconsistencies could arise between the
> > > > notifications
> > > > and the relevant policy.  At present, nothing enforces this
> > > > guarantee
> > > > at the kernel level; it is left up to userspace (e.g. container
> > > > runtimes).
> > > > It is an open question as to whether this is a good idea or
> > > > whether
> > > > unsharing of the selinux namespace should automatically unshare
> > > > the
> > > > network
> > > > namespace.  
> > > 
> > > What about logging a kernel warning if just SELinux is unshared?
> > 
> > As with Serge's suggestion, the problem is that one can unshare
> > them in
> > any order, and potentially with intervening steps to set up the
> 
> But is it going to stay that way?  I thought that was a temporary
> thing
> while you test it out.
> 
> Because as it is it seems unacceptably loose.  A few questions:
> 
> (thinking as I type here - ok only the last one is the real question)
> 
> Assume I'm in my given netns.  I unshare just my selinuxns.  Does the
> selinuxns have the authority to decide things about that netns?
> 
> If I unshare selinuxns+netns, then the selinuxns clearly "owns" the
> netns,
> so the selinuxns has clear authority.  Likewise, when I unshare the
> netns after the selinuxns, from that point on the selinuxns can be
> said
> to have authority over the netns.
> 
> I think you want to keep it completely orthogonal, and I guess so
> long
> as the parent selinuxns policy still applies it's ok.
> 
> So I unshare my selinuxns but not userns.  I don't have CAP_MAC_ADMIN
> against the user_ns, so any selinux related changes will be denied.
> Once I unshare userns, they will be allowed.
> 
> Ah, right.  Here's the real question.  Policy dictates file
> transition
> rules.  How will those be namespaced?  Will only the init_selinux_ns
> be
> allowed to specify a file context?  You can't let
> ns_capable(current_selinux_ns,
> CAP_MAC_ADMIN) guide that safely, but capable(CAP_MAC_ADMIN) will be
> very restrictive.  What's the plan there?  Sorry if it's spelled out
> elsewhere.

I don't think we know yet.  The current patches support a separate and
distinct in-core inode security context for each namespace, which
supports scenarios such as using context mounts on the host OS to label
each container with a single context with MCS separation while using
per-file xattrs within a container to support a conventional targeted
policy, but we haven't yet resolved how to deal with the actual
persistent file security contexts, e.g. whether there will be only one
(possibly with a label mapping defined) or multiple.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 03/10] selinux: move the AVC into the selinux namespace
  2017-10-09  3:10   ` James Morris
@ 2017-10-10 14:35     ` Stephen Smalley
  0 siblings, 0 replies; 39+ messages in thread
From: Stephen Smalley @ 2017-10-10 14:35 UTC (permalink / raw)
  To: James Morris; +Cc: selinux

On Mon, 2017-10-09 at 14:10 +1100, James Morris wrote:
> On Mon, 2 Oct 2017, Stephen Smalley wrote:
> 
> > Move the access vector cache (AVC) into the selinux namespace
> > structure and pass it explicitly to all AVC functions.  The
> > AVC private state is encapsulated in a selinux_avc structure
> > that is allocated and freed by the AVC during selinux namespace
> > creation and destruction.
> > 
> > This is necessary to support multiple selinux namespaces since
> > the AVC caches state (e.g. SIDs, policy sequence number) that
> > is maintained and provided by the security server on a per-
> > namespace
> > basis.
> 
> What about per-namespace AVC stats?
> 
> At the moment, it seems that the stats for all AVCs are combined in
> the 
> existing percpu stats, which could be confusing for someone trying to
> tune 
> the host or a guest, as the hash stats & config are per-
> namespace.  Also, 
> a user likely wants to see only their own AVC stats generally.

Yes, we should likely split those too; something else to add to the
TODO list for this patch series.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace
  2017-10-05 14:06     ` Stephen Smalley
  2017-10-05 14:11       ` Stephen Smalley
@ 2017-10-29  3:16       ` Serge E. Hallyn
  1 sibling, 0 replies; 39+ messages in thread
From: Serge E. Hallyn @ 2017-10-29  3:16 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Serge E. Hallyn, selinux, James Morris, Paul Moore

On Thu, Oct 05, 2017 at 10:06:55AM -0400, Stephen Smalley wrote:
> On Thu, 2017-10-05 at 00:47 -0500, Serge E. Hallyn wrote:
> > On Mon, Oct 02, 2017 at 11:58:19AM -0400, Stephen Smalley wrote:
> > > The selinux netlink socket is used to notify userspace of changes
> > > to
> > > the enforcing mode and policy reloads.  At present, these
> > > notifications
> > > are always sent to the initial network namespace.  In order to
> > > support
> > > multiple selinux namespaces, each with its own enforcing mode and
> > > policy, we need to create and use a separate selinux netlink socket
> > > for each network namespace.
> > 
> > ...
> > 
> > > +static int __init selnl_init(void)
> > > +{
> > > +	if (register_pernet_subsys(&selnl_net_ops))
> > > +		panic("Could not register selinux netlink
> > > operations\n");
> > >  	return 0;
> > >  }
> > 
> > This doesn't seem right to me.  If the socket is only used to send
> > notifications to userspace, then every net_ns doesn't need a socket,
> > only the first netns that the selinux ns was associated, right?
> 
> What does "the first netns that the selinux ns was associated" mean?
> We could unshare them in any order; in the sample command sequence I
> gave in the patch description for "selinux: add a selinuxfs interface
> to unshare selinux namespace", I unshared the SELinux namespace first,
> then the network namespace, but we could just as easily do it in the
> reverse order (or at the same time if unshare(2) supported that).  So
> you can't assume that the network namespace in which you are running at
> the time you unshare selinux namespace is the right one, nor that the
> first unshare of the network namespace after unsharing the selinux
> namespace is the right one (not that we even have a way to catch that
> currently).
> 
> > So long as there is a way to find the netns to which an selinux ns
> > is tied, a userspace logger could even setns into that netns to
> > listen
> > for updates, if it wasn't certain to be in the right ns when it ran.
> > 
> > Otherwise (I haven't peeked ahead) you'll have to keep the *list* of
> > net_ns which live in a given selinuxfs and copy all messages to all
> > of
> > those namespaces?
> 
> No, we only deliver to the network namespace of the process that
> performed the setenforce or policy load (most commonly init, could also

(Oops, I see I never replied to this.)

I see - sounds good then, thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2017-10-02 15:58 ` [RFC 01/10] selinux: introduce a selinux namespace Stephen Smalley
@ 2018-02-06 22:18   ` Paul Moore
  2018-02-07 16:17     ` Paul Moore
  2018-02-07 17:48     ` Stephen Smalley
  0 siblings, 2 replies; 39+ messages in thread
From: Paul Moore @ 2018-02-06 22:18 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, James Morris

On Mon, Oct 2, 2017 at 11:58 AM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
> Define a selinux namespace structure (struct selinux_ns)
> for SELinux state and pass it explicitly to all security server
> functions.  The public portion of the structure contains state
> that is used throughout the SELinux code, such as the enforcing mode.
> The structure also contains a pointer to a selinux_ss structure whose
> definition is private to the security server and contains security
> server specific state such as the policy database and SID table.
>
> This change allocates a single selinux namespace, the init_selinux_ns.
> It defines and passes a symbol for the current selinux namespace
> (current_selinux_ns) as a placeholder for future changes where
> multiple selinux namespaces will be supported, but in this change
> the current selinux namespace is always the init selinux namespace.
> Note that passing the current selinux namespace is not correct for
> all hooks; some hooks will need to be adjusted to pass the selinux
> namespace associated with an open file, a network namespace or socket,
> etc, since not all hooks are invoked in process context and some
> hooks operate in the context of a cred that may differ from current's
> cred.  Fixing all of these cases is left to future changes, once
> we introduce the support for multiple selinux namespaces.
>
> This change by itself should have no effect on SELinux behavior or
> APIs (userspace or LSM).  It merely wraps SELinux state and passes it
> explicitly as needed.

To put things in context, I'm looking at this patch not really with
namespacing in mind, but rather with the idea of encapsulating the
SELinux global state.  Regardless of what we do with namespacing, I
believe the encapsulation work still has value.

My comments are inline below.  I'm going to try and trim out a lot of
this patch in my reply, as most of it is trivial changes needed to pass
around the new state pointers.

> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index f5d3047..9eb48a1 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -97,20 +97,24 @@
>  #include "audit.h"
>  #include "avc_ss.h"
>
> +struct selinux_ns *init_selinux_ns;

This comment applies to more than just the particular line above, it
really applies to the entire patch.

If we are going to merge this patch, and presumably a few others,
independent of the namespacing work, I would strongly prefer if we
used more general names instead of ones tied so closely to
namespacing.  For example, something along the lines of "struct
selinux_state" would be more desirable than "struct selinux_ns"; it
doesn't have to be "selinux_state", that was just the first thing I
could come up with.

> +static void selinux_ns_free(struct work_struct *work);
> +
> +int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
> +{
> +       struct selinux_ns *newns;
> +       int rc;
> +
> +       newns = kzalloc(sizeof(*newns), GFP_KERNEL);
> +       if (!newns)
> +               return -ENOMEM;
> +
> +       refcount_set(&newns->count, 1);
> +       INIT_WORK(&newns->work, selinux_ns_free);
> +
> +       rc = selinux_ss_create(&newns->ss);
> +       if (rc)
> +               goto err;
> +
> +       if (parent)
> +               newns->parent = get_selinux_ns(parent);
> +
> +       *ns = newns;
> +       return 0;
> +err:
> +       kfree(newns);
> +       return rc;
> +}
> +
> +static void selinux_ns_free(struct work_struct *work)
> +{
> +       struct selinux_ns *parent, *ns =
> +               container_of(work, struct selinux_ns, work);
> +
> +       do {
> +               parent = ns->parent;
> +               selinux_ss_free(ns->ss);
> +               kfree(ns);
> +               ns = parent;
> +       } while (ns && refcount_dec_and_test(&ns->count));
> +}
> +
> +void __put_selinux_ns(struct selinux_ns *ns)
> +{
> +       schedule_work(&ns->work);
> +}

While we would obviously need something like this for proper
namespacing, with only a single state struct we don't need all the
state/namespace management.

However, I would still be willing to accept all the changes to pass
around a reference to the global state struct, even if in this
particular case it was always going to be single/init struct.
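
For readers following the quoted selinux_ns_free() loop, the parent
chain handling can be sketched in userspace as below. This is a rough
model under stated assumptions, not the patch itself: struct ns_model,
ns_model_create(), ns_model_put() and the ns_model_freed counter are
hypothetical names, a plain int stands in for refcount_t, and the
release happens inline rather than from a work item.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct selinux_ns. */
struct ns_model {
	int count;			/* plain int in place of refcount_t */
	struct ns_model *parent;	/* holds one reference on the parent */
};

static int ns_model_freed;	/* number of structures released so far */

/* Allocate a namespace with one reference, taking a reference on parent. */
static struct ns_model *ns_model_create(struct ns_model *parent)
{
	struct ns_model *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->count = 1;
	if (parent) {
		parent->count++;
		ns->parent = parent;
	}
	return ns;
}

/* Drop one reference; on the last one, free ns and walk up the chain. */
static void ns_model_put(struct ns_model *ns)
{
	assert(ns && ns->count > 0);
	if (--ns->count)
		return;
	do {
		struct ns_model *parent = ns->parent;

		free(ns);
		ns_model_freed++;
		ns = parent;
		/* Each freed child gives back its reference on the parent. */
	} while (ns && --ns->count == 0);
}
```

The loop keeps releasing ancestors only while the reference dropped on
behalf of the freed child was the last one, which is why none of this
machinery is exercised when there is only a single init structure.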

> @@ -6487,6 +6585,11 @@ static __init int selinux_init(void)
>
>         printk(KERN_INFO "SELinux:  Initializing.\n");
>
> +       if (selinux_ns_create(NULL, &init_selinux_ns))
> +               panic("SELinux: Could not create initial namespace\n");

Not yet :)  If we stick with selinux_ns_create(), see my comments
above, at the very least we need to pick an error message that doesn't
hint at namespacing.

> @@ -6629,23 +6738,32 @@ static void selinux_nf_ip_exit(void)
>  #endif /* CONFIG_NETFILTER */
>
>  #ifdef CONFIG_SECURITY_SELINUX_DISABLE
> -static int selinux_disabled;
> -
> -int selinux_disable(void)
> +int selinux_disable(struct selinux_ns *ns)
>  {
> -       if (ss_initialized) {
> +       if (ns->initialized) {
>                 /* Not permitted after initial policy load. */
>                 return -EINVAL;
>         }
>
> -       if (selinux_disabled) {
> +       if (ns->disabled) {
>                 /* Only do this once. */
>                 return -EINVAL;
>         }
>
> +       ns->disabled = 1;
> +
> +       /*
> +        * Disable of a non-init ns does not disable SELinux in the host.
> +        * We simply let the disable succeed, and init will then
> +        * unmount its selinuxfs instance and subsequent userspace
> +        * within the ns will interpret the absence of a selinuxfs mount
> +        * as SELinux being disabled.
> +        */
> +       if (ns != init_selinux_ns)
> +               return 0;

Something else we don't really need at this point.

> diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
> index 28dfb2f..b70d1dd 100644
> --- a/security/selinux/include/security.h
> +++ b/security/selinux/include/security.h
> @@ -97,13 +92,80 @@ extern int selinux_policycap_nnp_nosuid_transition;
>  /* limitation of boundary depth  */
>  #define POLICYDB_BOUNDS_MAXDEPTH       4
>
> -int security_mls_enabled(void);
> +struct selinux_ss;
> +
> +struct selinux_ns {
> +       refcount_t count;
> +       struct work_struct work;
> +       bool disabled;
> +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
> +       bool enforcing;
> +#endif
> +       bool checkreqprot;
> +       bool initialized;
> +       bool policycap[__POLICYDB_CAPABILITY_MAX];
> +       struct selinux_ss *ss;

While I don't think we need to tackle this as part of the
encapsulation work, this is another reminder that we should look into
breaking the separation between the security server and the
Linux/hooks code.  I understand there were historical reasons for this
split, but I think all of those reasons are now gone, and further I
think enough Linux'isms have crept into the security server that the
separation is no longer as meaningful as it may have been in the past.

> +#define current_selinux_ns (init_selinux_ns)
>
> +#define ss_initialized (current_selinux_ns->initialized)
> +
> +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
> +#define selinux_enforcing (current_selinux_ns->enforcing)
> +#define ns_enforcing(ns) ((ns)->enforcing)
> +#define set_ns_enforcing(ns, value) ((ns)->enforcing = value)
> +#else
> +#define selinux_enforcing 1
> +#define ns_enforcing(ns) 1
> +#define set_ns_enforcing(ns, value)
> +#endif
> +
> +#define selinux_checkreqprot (current_selinux_ns->checkreqprot)
> +
> +#define selinux_policycap_netpeer \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NETPEER])
> +#define selinux_policycap_openperm \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_OPENPERM])
> +#define selinux_policycap_extsockclass \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_EXTSOCKCLASS])
> +#define selinux_policycap_alwaysnetwork \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_ALWAYSNETWORK])
> +#define selinux_policycap_cgroupseclabel \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_CGROUPSECLABEL])
> +#define selinux_policycap_nnp_nosuid_transition \
> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NNP_NOSUID_TRANSITION])

I realize this was likely the quickest solution, but I think I would
prefer to see stuff like the above written as small static inline
functions.
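
As a rough illustration of what I mean, something along these lines.
This is only a sketch with made-up names (struct state_model, the
CAP_*_MODEL indices and the state_*() helpers are all hypothetical and
trimmed down so it builds in userspace), not a proposal for the final
naming:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the policy capability indices. */
enum { CAP_NETPEER_MODEL, CAP_OPENPERM_MODEL, CAP_MODEL_MAX };

/* Hypothetical, trimmed-down stand-in for the state structure. */
struct state_model {
	bool enforcing;
	bool policycap[CAP_MODEL_MAX];
};

/* Type-checked accessors in place of the function-like macros. */
static inline bool state_enforcing(const struct state_model *state)
{
	return state->enforcing;
}

static inline void state_set_enforcing(struct state_model *state, bool value)
{
	state->enforcing = value;
}

/* One accessor taking the capability index, with a bounds check. */
static inline bool state_policycap(const struct state_model *state, int cap)
{
	assert(cap >= 0 && cap < CAP_MODEL_MAX);
	return state->policycap[cap];
}
```

Unlike the macros, these take the state pointer explicitly, get
argument type checking from the compiler, and evaluate their arguments
exactly once.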

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2018-02-06 22:18   ` Paul Moore
@ 2018-02-07 16:17     ` Paul Moore
  2018-02-07 17:48     ` Stephen Smalley
  1 sibling, 0 replies; 39+ messages in thread
From: Paul Moore @ 2018-02-07 16:17 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, James Morris

On Tue, Feb 6, 2018 at 5:18 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Mon, Oct 2, 2017 at 11:58 AM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
>> Define a selinux namespace structure (struct selinux_ns)
>> for SELinux state and pass it explicitly to all security server
>> functions.  The public portion of the structure contains state
>> that is used throughout the SELinux code, such as the enforcing mode.
>> The structure also contains a pointer to a selinux_ss structure whose
>> definition is private to the security server and contains security
>> server specific state such as the policy database and SID table.
>>
>> This change allocates a single selinux namespace, the init_selinux_ns.
>> It defines and passes a symbol for the current selinux namespace
>> (current_selinux_ns) as a placeholder for future changes where
>> multiple selinux namespaces will be supported, but in this change
>> the current selinux namespace is always the init selinux namespace.
>> Note that passing the current selinux namespace is not correct for
>> all hooks; some hooks will need to be adjusted to pass the selinux
>> namespace associated with an open file, a network namespace or socket,
>> etc, since not all hooks are invoked in process context and some
>> hooks operate in the context of a cred that may differ from current's
>> cred.  Fixing all of these cases is left to future changes, once
>> we introduce the support for multiple selinux namespaces.
>>
>> This change by itself should have no effect on SELinux behavior or
>> APIs (userspace or LSM).  It merely wraps SELinux state and passes it
>> explicitly as needed.
>
> To put things in context, I'm looking at this patch not really with
> namespacing in mind, but rather with the idea of encapsulating the
> SELinux global state.  Regardless of what we do with namespacing, I
> believe the encapsulation work still has value.

With the above comment in mind, I've started looking at the remaining
patches in the patchset to see what might be worth merging,
independent of the greater namespacing effort.

* 02/10
My comments are very similar to 01/10; I think the encapsulation is a
good thing, although perhaps not as important as the changes in
01/10, but there are some namespace-specific things that should
probably be dropped.

* 03/10
Once again, similar comments to the previous two patches, although
this patch is perhaps the least "namespace-y" of the three.  While I
agree with James' comments regarding namespaced stats, that isn't a
major concern at this point.

* {04..10}/10
Changes specific to namespacing, not generally useful in other contexts.

> My comments are inline below.  I'm going to try and trim out a lot of
> this patch in my reply, as most of it is trivial changes needed to pass
> around the new state pointers.
>
>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
>> index f5d3047..9eb48a1 100644
>> --- a/security/selinux/hooks.c
>> +++ b/security/selinux/hooks.c
>> @@ -97,20 +97,24 @@
>>  #include "audit.h"
>>  #include "avc_ss.h"
>>
>> +struct selinux_ns *init_selinux_ns;
>
> This comment applies to more than just the particular line above, it
> really applies to the entire patch.
>
> If we are going to merge this patch, and presumably a few others,
> independent of the namespacing work, I would strongly prefer if we
> used more general names instead of ones tied so closely to
> namespacing.  For example, something along the lines of "struct
> selinux_state" would be more desirable than "struct selinux_ns"; it
> doesn't have to be "selinux_state", that was just the first thing I
> could come up with.
>
>> +static void selinux_ns_free(struct work_struct *work);
>> +
>> +int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
>> +{
>> +       struct selinux_ns *newns;
>> +       int rc;
>> +
>> +       newns = kzalloc(sizeof(*newns), GFP_KERNEL);
>> +       if (!newns)
>> +               return -ENOMEM;
>> +
>> +       refcount_set(&newns->count, 1);
>> +       INIT_WORK(&newns->work, selinux_ns_free);
>> +
>> +       rc = selinux_ss_create(&newns->ss);
>> +       if (rc)
>> +               goto err;
>> +
>> +       if (parent)
>> +               newns->parent = get_selinux_ns(parent);
>> +
>> +       *ns = newns;
>> +       return 0;
>> +err:
>> +       kfree(newns);
>> +       return rc;
>> +}
>> +
>> +static void selinux_ns_free(struct work_struct *work)
>> +{
>> +       struct selinux_ns *parent, *ns =
>> +               container_of(work, struct selinux_ns, work);
>> +
>> +       do {
>> +               parent = ns->parent;
>> +               selinux_ss_free(ns->ss);
>> +               kfree(ns);
>> +               ns = parent;
>> +       } while (ns && refcount_dec_and_test(&ns->count));
>> +}
>> +
>> +void __put_selinux_ns(struct selinux_ns *ns)
>> +{
>> +       schedule_work(&ns->work);
>> +}
>
> While we would obviously need something like this for proper
> namespacing, with only a single state struct we don't need all the
> state/namespace management.
>
> However, I would still be willing to accept all the changes to pass
> around a reference to the global state struct, even if in this
> particular case it was always going to be single/init struct.
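A minimal userspace sketch of what "pass around a reference to the
global state struct" could look like while there is still only one
instance.  The struct is trimmed down, and state_check_access() and
hook_example() are hypothetical names made up for illustration; none
of this is taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the proposed state struct. */
struct selinux_state {
	bool enforcing;
};

/* The one and only instance for now; no allocation, no refcounting. */
static struct selinux_state selinux_state = {
	.enforcing = true,
};

/*
 * The callee receives the state explicitly instead of reading a global.
 * Returns 0 when access is allowed and -EACCES when it is denied.
 */
static int state_check_access(const struct selinux_state *state,
			      bool policy_allows)
{
	assert(state != NULL);

	if (policy_allows)
		return 0;

	/* In permissive mode a denial is not enforced. */
	return state->enforcing ? -EACCES : 0;
}

/* A hook-like caller always hands in the single global instance today. */
static int hook_example(bool policy_allows)
{
	return state_check_access(&selinux_state, policy_allows);
}
```

The point of the extra parameter is that only the callers would need
to change if a second state instance ever appeared; the callee already
works on whatever it is handed.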
>
>> @@ -6487,6 +6585,11 @@ static __init int selinux_init(void)
>>
>>         printk(KERN_INFO "SELinux:  Initializing.\n");
>>
>> +       if (selinux_ns_create(NULL, &init_selinux_ns))
>> +               panic("SELinux: Could not create initial namespace\n");
>
> Not yet :)  If we stick with selinux_ns_create(), see my comments
> above, at the very least we need to pick an error message that doesn't
> hint at namespacing.
>
>> @@ -6629,23 +6738,32 @@ static void selinux_nf_ip_exit(void)
>>  #endif /* CONFIG_NETFILTER */
>>
>>  #ifdef CONFIG_SECURITY_SELINUX_DISABLE
>> -static int selinux_disabled;
>> -
>> -int selinux_disable(void)
>> +int selinux_disable(struct selinux_ns *ns)
>>  {
>> -       if (ss_initialized) {
>> +       if (ns->initialized) {
>>                 /* Not permitted after initial policy load. */
>>                 return -EINVAL;
>>         }
>>
>> -       if (selinux_disabled) {
>> +       if (ns->disabled) {
>>                 /* Only do this once. */
>>                 return -EINVAL;
>>         }
>>
>> +       ns->disabled = 1;
>> +
>> +       /*
>> +        * Disable of a non-init ns does not disable SELinux in the host.
>> +        * We simply let the disable succeed, and init will then
>> +        * unmount its selinuxfs instance and subsequent userspace
>> +        * within the ns will interpret the absence of a selinuxfs mount
>> +        * as SELinux being disabled.
>> +        */
>> +       if (ns != init_selinux_ns)
>> +               return 0;
>
> Something else we don't really need at this point.
>
>> diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
>> index 28dfb2f..b70d1dd 100644
>> --- a/security/selinux/include/security.h
>> +++ b/security/selinux/include/security.h
>> @@ -97,13 +92,80 @@ extern int selinux_policycap_nnp_nosuid_transition;
>>  /* limitation of boundary depth  */
>>  #define POLICYDB_BOUNDS_MAXDEPTH       4
>>
>> -int security_mls_enabled(void);
>> +struct selinux_ss;
>> +
>> +struct selinux_ns {
>> +       refcount_t count;
>> +       struct work_struct work;
>> +       bool disabled;
>> +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
>> +       bool enforcing;
>> +#endif
>> +       bool checkreqprot;
>> +       bool initialized;
>> +       bool policycap[__POLICYDB_CAPABILITY_MAX];
>> +       struct selinux_ss *ss;
>
> While I don't think we need to tackle this as part of the
> encapsulation work, this is another reminder that we should look into
> breaking the separation between the security server and the
> Linux/hooks code.  I understand there were historical reasons for this
> split, but I think all of those reasons are now gone, and further I
> think enough Linux'isms have crept into the security server that the
> separation is no longer as meaningful as it may have been in the past.
>
>> +#define current_selinux_ns (init_selinux_ns)
>>
>> +#define ss_initialized (current_selinux_ns->initialized)
>> +
>> +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
>> +#define selinux_enforcing (current_selinux_ns->enforcing)
>> +#define ns_enforcing(ns) ((ns)->enforcing)
>> +#define set_ns_enforcing(ns, value) ((ns)->enforcing = value)
>> +#else
>> +#define selinux_enforcing 1
>> +#define ns_enforcing(ns) 1
>> +#define set_ns_enforcing(ns, value)
>> +#endif
>> +
>> +#define selinux_checkreqprot (current_selinux_ns->checkreqprot)
>> +
>> +#define selinux_policycap_netpeer \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NETPEER])
>> +#define selinux_policycap_openperm \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_OPENPERM])
>> +#define selinux_policycap_extsockclass \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_EXTSOCKCLASS])
>> +#define selinux_policycap_alwaysnetwork \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_ALWAYSNETWORK])
>> +#define selinux_policycap_cgroupseclabel \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_CGROUPSECLABEL])
>> +#define selinux_policycap_nnp_nosuid_transition \
>> +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NNP_NOSUID_TRANSITION])
>
> I realize this was likely the quickest solution, but I think I would
> prefer to see stuff like the above written as small static inline
> functions.
>
> --
> paul moore
> www.paul-moore.com



-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2018-02-06 22:18   ` Paul Moore
  2018-02-07 16:17     ` Paul Moore
@ 2018-02-07 17:48     ` Stephen Smalley
  2018-02-07 19:56       ` Paul Moore
  1 sibling, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2018-02-07 17:48 UTC (permalink / raw)
  To: Paul Moore; +Cc: selinux, James Morris

On Tue, 2018-02-06 at 17:18 -0500, Paul Moore wrote:
> On Mon, Oct 2, 2017 at 11:58 AM, Stephen Smalley <sds@tycho.nsa.gov>
> wrote:
> > Define a selinux namespace structure (struct selinux_ns)
> > for SELinux state and pass it explicitly to all security server
> > functions.  The public portion of the structure contains state
> > that is used throughout the SELinux code, such as the enforcing
> > mode.
> > The structure also contains a pointer to a selinux_ss structure
> > whose
> > definition is private to the security server and contains security
> > server specific state such as the policy database and SID table.
> > 
> > This change allocates a single selinux namespace, the
> > init_selinux_ns.
> > It defines and passes a symbol for the current selinux namespace
> > (current_selinux_ns) as a placeholder for future changes where
> > multiple selinux namespaces will be supported, but in this change
> > the current selinux namespace is always the init selinux namespace.
> > Note that passing the current selinux namespace is not correct for
> > all hooks; some hooks will need to be adjusted to pass the selinux
> > namespace associated with an open file, a network namespace or
> > socket,
> > etc, since not all hooks are invoked in process context and some
> > hooks operate in the context of a cred that may differ from
> > current's
> > cred.  Fixing all of these cases is left to future changes, once
> > we introduce the support for multiple selinux namespaces.
> > 
> > This change by itself should have no effect on SELinux behavior or
> > APIs (userspace or LSM).  It merely wraps SELinux state and passes
> > it
> > explicitly as needed.
> 
> To put things in context, I'm looking at this patch not really with
> namespacing in mind, but rather with the idea of encapsulating the
> SELinux global state.  Regardless of what we do with namespacing, I
> believe the encapsulation work still has value.
> 
> My comments are inline below.  I'm going to try and trim out a lot of
> this patch in my reply as most of it is trivial changes needed to pass
> around the new state pointers.
> 
> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index f5d3047..9eb48a1 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -97,20 +97,24 @@
> >  #include "audit.h"
> >  #include "avc_ss.h"
> > 
> > +struct selinux_ns *init_selinux_ns;
> 
> This comment applies to more than just the particular line above, it
> really applies to the entire patch.
> 
> If we are going to merge this patch, and presumably a few others,
> independent of the namespacing work, I would strongly prefer if we
> used more general names instead of ones tied so closely to
> namespacing.  For example, something along the lines of "struct
> selinux_state" would be more desirable than "struct selinux_ns"; it
> doesn't have to be "selinux_state", that was just the first thing I
> could come up with.
> 
> > +static void selinux_ns_free(struct work_struct *work);
> > +
> > +int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
> > +{
> > +       struct selinux_ns *newns;
> > +       int rc;
> > +
> > +       newns = kzalloc(sizeof(*newns), GFP_KERNEL);
> > +       if (!newns)
> > +               return -ENOMEM;
> > +
> > +       refcount_set(&newns->count, 1);
> > +       INIT_WORK(&newns->work, selinux_ns_free);
> > +
> > +       rc = selinux_ss_create(&newns->ss);
> > +       if (rc)
> > +               goto err;
> > +
> > +       if (parent)
> > +               newns->parent = get_selinux_ns(parent);
> > +
> > +       *ns = newns;
> > +       return 0;
> > +err:
> > +       kfree(newns);
> > +       return rc;
> > +}
> > +
> > +static void selinux_ns_free(struct work_struct *work)
> > +{
> > +       struct selinux_ns *parent, *ns =
> > +               container_of(work, struct selinux_ns, work);
> > +
> > +       do {
> > +               parent = ns->parent;
> > +               selinux_ss_free(ns->ss);
> > +               kfree(ns);
> > +               ns = parent;
> > +       } while (ns && refcount_dec_and_test(&ns->count));
> > +}
> > +
> > +void __put_selinux_ns(struct selinux_ns *ns)
> > +{
> > +       schedule_work(&ns->work);
> > +}
> 
> While we would obviously need something like this for proper
> namespacing, with only a single state struct we don't need all the
> state/namespace management.
>
> However, I would still be willing to accept all the changes to pass
> around a reference to the global state struct, even if in this
> particular case it was always going to be single/init struct.
> 
> > @@ -6487,6 +6585,11 @@ static __init int selinux_init(void)
> > 
> >         printk(KERN_INFO "SELinux:  Initializing.\n");
> > 
> > +       if (selinux_ns_create(NULL, &init_selinux_ns))
> > +               panic("SELinux: Could not create initial namespace\n");
> 
> Not yet :)  If we stick with selinux_ns_create(), see my comments
> above, at the very least we need to pick an error message that
> doesn't
> hint at namespacing.
> 
> > @@ -6629,23 +6738,32 @@ static void selinux_nf_ip_exit(void)
> >  #endif /* CONFIG_NETFILTER */
> > 
> >  #ifdef CONFIG_SECURITY_SELINUX_DISABLE
> > -static int selinux_disabled;
> > -
> > -int selinux_disable(void)
> > +int selinux_disable(struct selinux_ns *ns)
> >  {
> > -       if (ss_initialized) {
> > +       if (ns->initialized) {
> >                 /* Not permitted after initial policy load. */
> >                 return -EINVAL;
> >         }
> > 
> > -       if (selinux_disabled) {
> > +       if (ns->disabled) {
> >                 /* Only do this once. */
> >                 return -EINVAL;
> >         }
> > 
> > +       ns->disabled = 1;
> > +
> > +       /*
> > +        * Disable of a non-init ns does not disable SELinux in the host.
> > +        * We simply let the disable succeed, and init will then
> > +        * unmount its selinuxfs instance and subsequent userspace
> > +        * within the ns will interpret the absence of a selinuxfs mount
> > +        * as SELinux being disabled.
> > +        */
> > +       if (ns != init_selinux_ns)
> > +               return 0;
> 
> Something else we don't really need at this point.
> 
> > diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
> > index 28dfb2f..b70d1dd 100644
> > --- a/security/selinux/include/security.h
> > +++ b/security/selinux/include/security.h
> > @@ -97,13 +92,80 @@ extern int selinux_policycap_nnp_nosuid_transition;
> >  /* limitation of boundary depth  */
> >  #define POLICYDB_BOUNDS_MAXDEPTH       4
> > 
> > -int security_mls_enabled(void);
> > +struct selinux_ss;
> > +
> > +struct selinux_ns {
> > +       refcount_t count;
> > +       struct work_struct work;
> > +       bool disabled;
> > +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
> > +       bool enforcing;
> > +#endif
> > +       bool checkreqprot;
> > +       bool initialized;
> > +       bool policycap[__POLICYDB_CAPABILITY_MAX];
> > +       struct selinux_ss *ss;
> 
> While I don't think we need to tackle this as part of the
> encapsulation work, this is another reminder that we should look into
> breaking the separation between the security server and the
> Linux/hooks code.  I understand there were historical reasons for
> this
> split, but I think all of those reasons are now gone, and further I
> think enough Linux'isms have crept into the security server that the
> separation is no longer as meaningful as it may have been in the
> past.

I think we want to retain some degree of encapsulation of the policy
logic and data structures, which is what the security server is
supposed to provide.  That allows us to evolve that logic and
structures without impacting the object label management and permission
enforcement code.  Separation of policy from mechanism.  I don't mind
nativizing the security server for Linux, and where appropriate,
allowing some optimization of the interfaces between it and the rest of
the SELinux code, but I wouldn't want to e.g. directly expose the
policydb to the rest of the code.  There has been some leakage of
policy awareness outside the security server in the past but I view
that as a mistake that ought to be corrected over time.

> 
> > +#define current_selinux_ns (init_selinux_ns)
> > 
> > +#define ss_initialized (current_selinux_ns->initialized)
> > +
> > +#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
> > +#define selinux_enforcing (current_selinux_ns->enforcing)
> > +#define ns_enforcing(ns) ((ns)->enforcing)
> > +#define set_ns_enforcing(ns, value) ((ns)->enforcing = value)
> > +#else
> > +#define selinux_enforcing 1
> > +#define ns_enforcing(ns) 1
> > +#define set_ns_enforcing(ns, value)
> > +#endif
> > +
> > +#define selinux_checkreqprot (current_selinux_ns->checkreqprot)
> > +
> > +#define selinux_policycap_netpeer \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NETPEER])
> > +#define selinux_policycap_openperm \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_OPENPERM])
> > +#define selinux_policycap_extsockclass \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_EXTSOCKCLASS])
> > +#define selinux_policycap_alwaysnetwork \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_ALWAYSNETWORK])
> > +#define selinux_policycap_cgroupseclabel \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_CGROUPSECLABEL])
> > +#define selinux_policycap_nnp_nosuid_transition \
> > +       (current_selinux_ns->policycap[POLICYDB_CAPABILITY_NNP_NOSUID_TRANSITION])
> 
> I realize this was likely the quickest solution, but I think I would
> prefer to see stuff like the above written as small static inline
> functions.
> 
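A rough userspace sketch of the "small static inline functions"
suggestion, for comparison with the macros quoted above.  The
function names (enforcing_enabled, enforcing_set, policycap_enabled)
and the two-entry capability enum are assumptions chosen for
illustration, and the assert() stands in for whatever checking the
kernel code would actually want.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch only: pretend the development option is enabled. */
#define CONFIG_SECURITY_SELINUX_DEVELOP 1

/* Illustrative subset of the policy capability indices. */
enum {
	POLICYDB_CAPABILITY_NETPEER,
	POLICYDB_CAPABILITY_OPENPERM,
	__POLICYDB_CAPABILITY_MAX
};

struct selinux_state {
#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
	bool enforcing;
#endif
	bool policycap[__POLICYDB_CAPABILITY_MAX];
};

static struct selinux_state selinux_state;

#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
static inline bool enforcing_enabled(const struct selinux_state *state)
{
	return state->enforcing;
}

static inline void enforcing_set(struct selinux_state *state, bool value)
{
	state->enforcing = value;
}
#else
/* Without the option, enforcing is always on and cannot be changed. */
static inline bool enforcing_enabled(const struct selinux_state *state)
{
	(void)state;
	return true;
}

static inline void enforcing_set(struct selinux_state *state, bool value)
{
	(void)state;
	(void)value;
}
#endif

/* Bounds-checked lookup shared by the named accessors below. */
static inline bool policycap_enabled(const struct selinux_state *state,
				     int cap)
{
	assert(cap >= 0 && cap < __POLICYDB_CAPABILITY_MAX);
	return state->policycap[cap];
}

static inline bool selinux_policycap_netpeer(void)
{
	return policycap_enabled(&selinux_state, POLICYDB_CAPABILITY_NETPEER);
}

static inline bool selinux_policycap_openperm(void)
{
	return policycap_enabled(&selinux_state, POLICYDB_CAPABILITY_OPENPERM);
}
```

Unlike the empty set_ns_enforcing() macro in the #else branch of the
patch, the inline versions keep their arguments type-checked whether
or not the option is set, and callers cannot accidentally assign to an
accessor.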


* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2018-02-07 17:48     ` Stephen Smalley
@ 2018-02-07 19:56       ` Paul Moore
  2018-02-08 15:02         ` Stephen Smalley
  0 siblings, 1 reply; 39+ messages in thread
From: Paul Moore @ 2018-02-07 19:56 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, James Morris

On Wed, Feb 7, 2018 at 12:48 PM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On Tue, 2018-02-06 at 17:18 -0500, Paul Moore wrote:

...

>> While I don't think we need to tackle this as part of the
>> encapsulation work, this is another reminder that we should look into
>> breaking the separation between the security server and the
>> Linux/hooks code.  I understand there were historical reasons for
>> this
>> split, but I think all of those reasons are now gone, and further I
>> think enough Linux'isms have crept into the security server that the
>> separation is no longer as meaningful as it may have been in the
>> past.
>
> I think we want to retain some degree of encapsulation of the policy
> logic and data structures, which is what the security server is
> supposed to provide.  That allows us to evolve that logic and
> structures without impacting the object label management and permission
> enforcement code.  Separation of policy from mechanism.  I don't mind
> nativizing the security server for Linux, and where appropriate,
> allowing some optimization of the interfaces between it and the rest of
> the SELinux code, but I wouldn't want to e.g. directly expose the
> policydb to the rest of the code.  There has been some leakage of
> policy awareness outside the security server in the past but I view
> that as a mistake that ought to be corrected over time.

I agree that a level of abstraction between the policydb code and the
enforcement code is a good thing, but I think there are some
boundaries between the hook code and the security server that we could
do without.  Once again, not really part of this work, just popped up
in my head again while looking at these patches.

-- 
paul moore
www.paul-moore.com


* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2018-02-07 19:56       ` Paul Moore
@ 2018-02-08 15:02         ` Stephen Smalley
  2018-02-08 21:41           ` Paul Moore
  0 siblings, 1 reply; 39+ messages in thread
From: Stephen Smalley @ 2018-02-08 15:02 UTC (permalink / raw)
  To: Paul Moore; +Cc: selinux, James Morris

On Wed, 2018-02-07 at 14:56 -0500, Paul Moore wrote:
> On Wed, Feb 7, 2018 at 12:48 PM, Stephen Smalley <sds@tycho.nsa.gov>
> wrote:
> > On Tue, 2018-02-06 at 17:18 -0500, Paul Moore wrote:
> 
> ...
> 
> > > While I don't think we need to tackle this as part of the
> > > encapsulation work, this is another reminder that we should look
> > > into
> > > breaking the separation between the security server and the
> > > Linux/hooks code.  I understand there were historical reasons for
> > > this
> > > split, but I think all of those reasons are now gone, and further
> > > I
> > > think enough Linux'isms have crept into the security server that
> > > the
> > > separation is no longer as meaningful as it may have been in the
> > > past.
> > 
> > I think we want to retain some degree of encapsulation of the
> > policy
> > logic and data structures, which is what the security server is
> > supposed to provide.  That allows us to evolve that logic and
> > structures without impacting the object label management and
> > permission
> > enforcement code.  Separation of policy from mechanism.  I don't
> > mind
> > nativizing the security server for Linux, and where appropriate,
> > allowing some optimization of the interfaces between it and the
> > rest of
> > the SELinux code, but I wouldn't want to e.g. directly expose the
> > policydb to the rest of the code.  There has been some leakage of
> > policy awareness outside the security server in the past but I view
> > that as a mistake that ought to be corrected over time.
> 
> I agree that a level of abstraction between the policydb code and the
> enforcement code is a good thing, but I think there are some
> boundaries between the hook code and the security server that we
> could
> do without.  Once again, not really part of this work, just popped up
> in my head again while looking at these patches.

I wanted to clarify that point because it motivates keeping selinux_ss
as a separate struct from selinux_state (previously selinux_ns), and
not exposing the selinux_ss struct definition outside of the security
server.  So the selinux_state struct would still be something like:
struct selinux_state {
        bool disabled;
#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
        bool enforcing;
#endif
        bool checkreqprot;
        bool initialized;
        bool policycap[__POLICYDB_CAPABILITY_MAX];
        struct selinux_avc *avc;
        struct selinux_ss *ss;
};

(dropping the reference count, work struct, and parent pointers that
were part of selinux_ns as being specific to the namespace work).

hooks.c would declare struct selinux_state selinux_state;.
ss/services.c would declare static struct selinux_ss selinux_ss; and
provide a selinux_ss_init() function (instead of selinux_ss_create)
that would set the ->ss field to &selinux_ss.  Likewise for avc.c and
the selinux_avc.  hooks.c:selinux_init() would call
selinux_ss_init(&selinux_state.ss) and
selinux_avc_init(&selinux_state.avc) to set those pointers to the
appropriate structures private to the ss and avc code.  The _create and
_free functions would go away and none of the structures would be
dynamically allocated/freed.
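The layout described above can be sketched in a single userspace file
as follows.  The section comments mark which kernel file each part
would live in; the placeholder members and the selinux_init_sketch()
name are assumptions for illustration only, and the real structures
would of course hold the policydb, sidtab and AVC cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* --- shared header: the ss and avc types stay incomplete here --- */
struct selinux_ss;
struct selinux_avc;

struct selinux_state {
	bool disabled;
	bool initialized;
	struct selinux_avc *avc;
	struct selinux_ss *ss;
};

/* --- ss/services.c: full definition private to the security server --- */
struct selinux_ss {
	int placeholder;	/* stands in for the embedded policydb and sidtab */
};

static struct selinux_ss selinux_ss;

void selinux_ss_init(struct selinux_ss **ss)
{
	*ss = &selinux_ss;
}

/* --- avc.c: full definition private to the AVC --- */
struct selinux_avc {
	int placeholder;	/* stands in for the embedded avc_cache */
};

static struct selinux_avc selinux_avc;

void selinux_avc_init(struct selinux_avc **avc)
{
	*avc = &selinux_avc;
}

/* --- hooks.c: owns the single, statically allocated state --- */
struct selinux_state selinux_state;

void selinux_init_sketch(void)
{
	selinux_ss_init(&selinux_state.ss);
	selinux_avc_init(&selinux_state.avc);

	/* Nothing was allocated, so there is no failure path to handle. */
	assert(selinux_state.ss != NULL && selinux_state.avc != NULL);
}
```

Because every object has static storage, the init functions cannot
fail and there is nothing to free, which is why the _create and _free
functions can go away.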

This is in contrast to exposing the selinux_ss struct definition
outside the security server and directly embedding it in the
selinux_state, since that would require exposing the policydb and
sidtab structures as well (unless we were to make those opaque pointers
within the selinux_ss; currently the policydb and sidtab are directly
embedded within it).  Likewise for the selinux_avc and its embedded
avc_cache structure.
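For comparison, a sketch of the alternative mentioned in parentheses:
selinux_ss is exposed and embedded in selinux_state, but the policydb
and sidtab become opaque pointers inside it.  The
selinux_ss_init_embedded() name and the placeholder members are
hypothetical; this is the option not being proposed, shown only to
make the trade-off concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared header: only forward declarations of the policy structures. */
struct policydb;
struct sidtab;

struct selinux_ss {
	struct policydb *policydb;	/* opaque pointer, not embedded */
	struct sidtab *sidtab;		/* opaque pointer, not embedded */
};

struct selinux_state {
	bool initialized;
	struct selinux_ss ss;		/* embedded directly */
};

/* Security server side: the only place the full definitions are visible. */
struct policydb {
	int placeholder;
};

struct sidtab {
	int placeholder;
};

static struct policydb policydb_storage;
static struct sidtab sidtab_storage;

void selinux_ss_init_embedded(struct selinux_ss *ss)
{
	assert(ss != NULL);
	ss->policydb = &policydb_storage;
	ss->sidtab = &sidtab_storage;
}
```

This keeps the policy structures hidden, but it exposes the shape of
selinux_ss to the rest of the code and turns every policydb or sidtab
access inside the security server into a pointer dereference.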

Trying to make sure we're in agreement on the data structures before I
rewrite since I don't want to have to rewrite it twice ;)


* Re: [RFC 01/10] selinux: introduce a selinux namespace
  2018-02-08 15:02         ` Stephen Smalley
@ 2018-02-08 21:41           ` Paul Moore
  0 siblings, 0 replies; 39+ messages in thread
From: Paul Moore @ 2018-02-08 21:41 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: selinux, James Morris

On Thu, Feb 8, 2018 at 10:02 AM, Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On Wed, 2018-02-07 at 14:56 -0500, Paul Moore wrote:
>> On Wed, Feb 7, 2018 at 12:48 PM, Stephen Smalley <sds@tycho.nsa.gov>
>> wrote:
>> > On Tue, 2018-02-06 at 17:18 -0500, Paul Moore wrote:
>>
>> ...
>>
>> > > While I don't think we need to tackle this as part of the
>> > > encapsulation work, this is another reminder that we should look
>> > > into
>> > > breaking the separation between the security server and the
>> > > Linux/hooks code.  I understand there were historical reasons for
>> > > this
>> > > split, but I think all of those reasons are now gone, and further
>> > > I
>> > > think enough Linux'isms have crept into the security server that
>> > > the
>> > > separation is no longer as meaningful as it may have been in the
>> > > past.
>> >
>> > I think we want to retain some degree of encapsulation of the
>> > policy
>> > logic and data structures, which is what the security server is
>> > supposed to provide.  That allows us to evolve that logic and
>> > structures without impacting the object label management and
>> > permission
>> > enforcement code.  Separation of policy from mechanism.  I don't
>> > mind
>> > nativizing the security server for Linux, and where appropriate,
>> > allowing some optimization of the interfaces between it and the
>> > rest of
>> > the SELinux code, but I wouldn't want to e.g. directly expose the
>> > policydb to the rest of the code.  There has been some leakage of
>> > policy awareness outside the security server in the past but I view
>> > that as a mistake that ought to be corrected over time.
>>
>> I agree that a level of abstraction between the policydb code and the
>> enforcement code is a good thing, but I think there are some
>> boundaries between the hook code and the security server that we
>> could
>> do without.  Once again, not really part of this work, just popped up
>> in my head again while looking at these patches.
>
> I wanted to clarify that point because it motivates keeping selinux_ss
> as a separate struct from selinux_state (previously selinux_ns), and
> not exposing the selinux_ss struct definition outside of the security
> server.  So the selinux_state struct would still be something like:
> struct selinux_state {
>         bool disabled;
> #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
>         bool enforcing;
> #endif
>         bool checkreqprot;
>         bool initialized;
>         bool policycap[__POLICYDB_CAPABILITY_MAX];
>         struct selinux_avc *avc;
>         struct selinux_ss *ss;
> };
>
> (dropping the reference count, work struct, and parent pointers that
> were part of selinux_ns as being specific to the namespace work).

I think keeping the selinux_avc and selinux_ss structs separate is
fine right now.  As I mentioned earlier, I really don't want to
entangle the encapsulation work with any potential hooks/ss
consolidation work.

Directly embedding the selinux_avc and selinux_ss structures inside
selinux_state would be nice as we could just do one allocation, but
that isn't worth breaking the abstraction at this point.  Maybe we
will get there at some point, but I think it's important to stay
focused on the task at hand.

> hooks.c would declare struct selinux_state selinux_state;.
> ss/services.c would declare static struct selinux_ss selinux_ss; and
> provide a selinux_ss_init() function (instead of selinux_ss_create)
> that would set the ->ss field to &selinux_ss.  Likewise for avc.c and
> the selinux_avc.  hooks.c:selinux_init() would call
> selinux_ss_init(&selinux_state.ss) and
> selinux_avc_init(&selinux_state.avc) to set those pointers to the
> appropriate structures private to the ss and avc code.  The _create and
> _free functions would go away and none of the structures would be
> dynamically allocated/freed.
>
> This is in contrast to exposing the selinux_ss struct definition
> outside the security server and directly embedding it in the
> selinux_state, since that would require exposing the policydb and
> sidtab structures as well (unless we were to make those opaque pointers
> within the selinux_ss; currently the policydb and sidtab are directly
> embedded within it).  Likewise for the selinux_avc and its embedded
> avc_cache structure.
>
> Trying to make sure we're in agreement on the data structures before I
> rewrite since I don't want to have to rewrite it twice ;)

-- 
paul moore
www.paul-moore.com


end of thread, other threads:[~2018-02-08 21:41 UTC | newest]

Thread overview: 39+ messages
2017-10-02 15:58 [RFC 00/10] Introduce a SELinux namespace Stephen Smalley
2017-10-02 15:58 ` [RFC 01/10] selinux: introduce a selinux namespace Stephen Smalley
2018-02-06 22:18   ` Paul Moore
2018-02-07 16:17     ` Paul Moore
2018-02-07 17:48     ` Stephen Smalley
2018-02-07 19:56       ` Paul Moore
2018-02-08 15:02         ` Stephen Smalley
2018-02-08 21:41           ` Paul Moore
2017-10-02 15:58 ` [RFC 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
2017-10-02 15:58 ` [RFC 03/10] selinux: move the AVC into the selinux namespace Stephen Smalley
2017-10-09  3:10   ` James Morris
2017-10-10 14:35     ` Stephen Smalley
2017-10-02 15:58 ` [RFC 04/10] netns, selinux: create the selinux netlink socket per network namespace Stephen Smalley
2017-10-05  5:47   ` Serge E. Hallyn
2017-10-05 14:06     ` Stephen Smalley
2017-10-05 14:11       ` Stephen Smalley
2017-10-29  3:16       ` Serge E. Hallyn
2017-10-06  1:07   ` James Morris
2017-10-06 13:21     ` Stephen Smalley
2017-10-06 19:24       ` Serge E. Hallyn
2017-10-10 14:35         ` Stephen Smalley
2017-10-02 15:58 ` [RFC 05/10] selinux: support per-task/cred selinux namespace Stephen Smalley
2017-10-06  1:14   ` James Morris
2017-10-06 19:25     ` Serge E. Hallyn
2017-10-08 22:08       ` James Morris
2017-10-02 15:58 ` [RFC 06/10] selinux: introduce cred_selinux_ns() and use it Stephen Smalley
2017-10-02 15:58 ` [RFC 07/10] selinux: support per-namespace inode security structures Stephen Smalley
2017-10-02 15:58 ` [RFC 08/10] selinux: support per-namespace superblock " Stephen Smalley
2017-10-02 15:58 ` [RFC 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
2017-10-02 23:56   ` Casey Schaufler
2017-10-03 12:29     ` Stephen Smalley
2017-10-03 17:14       ` Casey Schaufler
2017-10-05 15:27   ` Stephen Smalley
2017-10-05 15:49     ` Stephen Smalley
2017-10-05 17:04       ` Stephen Smalley
2017-10-09  1:52     ` James Morris
     [not found]       ` <CAB9W1A2-PT8QU-md1s9fxhNg+Cv0C4Xu-i1w_q0XzQ+K9rsyAg@mail.gmail.com>
2017-10-09 13:53         ` Stephen Smalley
2017-10-09 23:04           ` James Morris
2017-10-02 15:58 ` [RFC 10/10] selinuxfs: restrict write operations to the same " Stephen Smalley
