selinux.vger.kernel.org archive mirror
* [RFC PATCH 00/10] SELinux namespace series, re-based
@ 2019-10-15 13:25 Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Stephen Smalley

After a long hiatus, I have re-based the SELinux namespace series
on top of selinux/next based on v5.4-rc1, available from
https://github.com/stephensmalley/selinux-kernel/tree/selinuxns-v5.4-rc1
and posted here.  Thanks to Paul Moore who had earlier ported the
series up through v4.19-rc1.

I chose to drop the per-namespace inode and superblock security blob
patches from the series for the time being.  In part, this is due to the
fact that the original patch for per-namespace inode security blobs
requires a major rewrite to deal with the LSM stacking changes.
However, even apart from this issue, those two patches had known major
problems that made them unlikely in my view to survive in the final
implementation.  This does leave the series in an even less functional
state than before.

This series also does not include James Morris' separate RFC patch for
per-namespace security extended attributes on files,
https://patchwork.kernel.org/patch/10067875/
which would ultimately be needed if we want to fully support per-namespace
file security labels (not merely mappings of a single file label).

As before, this is unsafe, experimental code.  Use at your own risk.
The patches should be harmless no-ops up until the one that introduces
the ability to unshare the selinux namespace, but YMMV.

James Morris (1):
  selinuxns: mark init_selinux_ns as __ro_after_init

Peter Enderborg (1):
  selinux: Annotate lockdep for services locks

Stephen Smalley (8):
  selinux: rename selinux state to ns (namespace)
  selinux: support multiple selinuxfs instances
  selinux: dynamically allocate selinux namespace
  netns,selinux: create the selinux netlink socket per network namespace
  selinux: support per-task/cred selinux namespace
  selinux: introduce cred_selinux_ns() and use it
  selinux: add a selinuxfs interface to unshare selinux namespace
  selinuxfs: restrict write operations to the same selinux namespace

 include/net/net_namespace.h            |   3 +
 security/selinux/avc.c                 |  94 +++--
 security/selinux/hooks.c               | 512 +++++++++++++----------
 security/selinux/ibpkey.c              |   2 +-
 security/selinux/include/avc.h         |  16 +-
 security/selinux/include/classmap.h    |   3 +-
 security/selinux/include/conditional.h |   6 +-
 security/selinux/include/objsec.h      |  23 --
 security/selinux/include/security.h    | 185 ++++++---
 security/selinux/netif.c               |   2 +-
 security/selinux/netlabel.c            |  12 +-
 security/selinux/netlink.c             |  31 +-
 security/selinux/netnode.c             |   4 +-
 security/selinux/netport.c             |   2 +-
 security/selinux/selinuxfs.c           | 266 ++++++++----
 security/selinux/ss/services.c         | 543 +++++++++++++------------
 security/selinux/ss/status.c           |  42 +-
 security/selinux/xfrm.c                |  18 +-
 18 files changed, 1026 insertions(+), 738 deletions(-)

-- 
2.21.0



* [RFC PATCH 02/10] selinux: support multiple selinuxfs instances
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 03/10] selinux: dynamically allocate selinux namespace Stephen Smalley
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Stephen Smalley

Support multiple selinuxfs instances, one per selinux namespace.

The expected usage would be to unshare the SELinux namespace and
the mount namespace, and then mount a new selinuxfs instance.  The
new instance would then provide an interface for viewing and manipulating
the state of the new SELinux namespace and would not affect the parent
namespace in any manner.

This change by itself should have no effect on SELinux behavior or
APIs (userspace or LSM).
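
As a rough illustration of the keyed lookup this relies on, here is a
userspace sketch, not kernel code.  It assumes get_tree_keyed() reuses
an existing superblock whose key matches and creates a new one
otherwise.  The names struct ns, struct instance and
get_instance_keyed() are hypothetical stand-ins for the selinux
namespace, a selinuxfs superblock and the VFS lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct ns { int id; };			/* models struct selinux_ns */

struct instance {			/* models one selinuxfs superblock */
	const struct ns *key;
	struct instance *next;
};

static struct instance *instances;	/* all live instances */

/*
 * Return the instance keyed by @key, creating it on first use.
 * This models "one superblock per key"; a single shared instance
 * would ignore the key and always return the same object.
 */
struct instance *get_instance_keyed(const struct ns *key)
{
	struct instance *inst;

	assert(key != NULL);

	/* Reuse an existing instance if one already carries this key. */
	for (inst = instances; inst; inst = inst->next)
		if (inst->key == key)
			return inst;

	inst = calloc(1, sizeof(*inst));
	if (!inst)
		return NULL;
	inst->key = key;
	inst->next = instances;
	instances = inst;
	return inst;
}
```

Two lookups with the same namespace pointer return the same instance,
while a different namespace gets its own.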

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/selinuxfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 3873946f4dd8..a69381f94d37 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -2006,7 +2006,7 @@ static int sel_fill_super(struct super_block *sb, struct fs_context *fc)
 
 static int sel_get_tree(struct fs_context *fc)
 {
-	return get_tree_single(fc, sel_fill_super);
+	return get_tree_keyed(fc, sel_fill_super, current_selinux_ns);
 }
 
 static const struct fs_context_operations sel_context_ops = {
-- 
2.21.0



* [RFC PATCH 03/10] selinux: dynamically allocate selinux namespace
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 04/10] selinuxns: mark init_selinux_ns as __ro_after_init Stephen Smalley
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Stephen Smalley

Move from static allocation of a single selinux namespace to
dynamic allocation.  Include necessary support for lifecycle management
of the selinux namespace, modeled after the user namespace support.
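
The lifecycle can be pictured with a small userspace model.  This is
only a sketch under simplifying assumptions: plain ints stand in for
refcount_t, the deferred work item is replaced by a direct call, and
ns_create(), ns_get(), ns_put() and ns_freed are hypothetical names
used for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the kernel implementation. */
struct ns {
	int count;		/* references held on this namespace */
	struct ns *parent;	/* pinned by the child while it lives */
};

static int ns_freed;		/* number of namespaces released so far */

struct ns *ns_get(struct ns *ns)
{
	ns->count++;
	return ns;
}

struct ns *ns_create(struct ns *parent)
{
	struct ns *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->count = 1;
	if (parent)
		ns->parent = ns_get(parent);
	return ns;
}

/* Free @ns, then drop the reference it held on each ancestor in turn. */
static void ns_free(struct ns *ns)
{
	struct ns *parent;

	do {
		parent = ns->parent;
		free(ns);
		ns_freed++;
		ns = parent;
	} while (ns && --ns->count == 0);
}

void ns_put(struct ns *ns)
{
	if (!ns)
		return;
	assert(ns->count > 0);
	if (--ns->count == 0)
		ns_free(ns);
}
```

In this model a parent stays alive for as long as any child holds a
reference to it, and releasing the last child releases the whole
chain of otherwise unreferenced ancestors.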

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/avc.c              | 32 ++++++++++----
 security/selinux/hooks.c            | 68 ++++++++++++++++++++++++++---
 security/selinux/include/security.h | 25 ++++++++++-
 security/selinux/selinuxfs.c        |  3 +-
 security/selinux/ss/services.c      | 25 ++++++++---
 5 files changed, 129 insertions(+), 24 deletions(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index c53582acd99f..d2cf749b5f19 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -88,20 +88,34 @@ struct selinux_avc {
 	struct avc_cache avc_cache;
 };
 
-static struct selinux_avc selinux_avc;
-
-void selinux_avc_init(struct selinux_avc **avc)
+int selinux_avc_create(struct selinux_avc **avc)
 {
+	struct selinux_avc *newavc;
 	int i;
 
-	selinux_avc.avc_cache_threshold = AVC_DEF_CACHE_THRESHOLD;
+	newavc = kzalloc(sizeof(*newavc), GFP_KERNEL);
+	if (!newavc)
+		return -ENOMEM;
+
+	newavc->avc_cache_threshold = AVC_DEF_CACHE_THRESHOLD;
+
 	for (i = 0; i < AVC_CACHE_SLOTS; i++) {
-		INIT_HLIST_HEAD(&selinux_avc.avc_cache.slots[i]);
-		spin_lock_init(&selinux_avc.avc_cache.slots_lock[i]);
+		INIT_HLIST_HEAD(&newavc->avc_cache.slots[i]);
+		spin_lock_init(&newavc->avc_cache.slots_lock[i]);
 	}
-	atomic_set(&selinux_avc.avc_cache.active_nodes, 0);
-	atomic_set(&selinux_avc.avc_cache.lru_hint, 0);
-	*avc = &selinux_avc;
+	atomic_set(&newavc->avc_cache.active_nodes, 0);
+	atomic_set(&newavc->avc_cache.lru_hint, 0);
+
+	*avc = newavc;
+	return 0;
+}
+
+static void avc_flush(struct selinux_avc *avc);
+
+void selinux_avc_free(struct selinux_avc *avc)
+{
+	avc_flush(avc);
+	kfree(avc);
 }
 
 unsigned int avc_get_cache_threshold(struct selinux_avc *avc)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 6292fdd2b01f..7a4ed553cec0 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -103,7 +103,6 @@
 #include "audit.h"
 #include "avc_ss.h"
 
-static struct selinux_ns init_selinux_ns;
 struct selinux_ns *current_selinux_ns;
 
 /* SECMARK reference count */
@@ -7045,15 +7044,70 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 #endif
 };
 
+static void selinux_ns_free(struct work_struct *work);
+
+int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns)
+{
+	struct selinux_ns *newns;
+	int rc;
+
+	newns = kzalloc(sizeof(*newns), GFP_KERNEL);
+	if (!newns)
+		return -ENOMEM;
+
+	refcount_set(&newns->count, 1);
+	INIT_WORK(&newns->work, selinux_ns_free);
+
+	rc = selinux_ss_create(&newns->ss);
+	if (rc)
+		goto err;
+
+	rc = selinux_avc_create(&newns->avc);
+	if (rc)
+		goto err;
+
+	if (parent)
+		newns->parent = get_selinux_ns(parent);
+
+	*ns = newns;
+	return 0;
+err:
+	selinux_ss_free(newns->ss);
+	kfree(newns);
+	return rc;
+}
+
+static void selinux_ns_free(struct work_struct *work)
+{
+	struct selinux_ns *parent, *ns =
+		container_of(work, struct selinux_ns, work);
+
+	do {
+		parent = ns->parent;
+		selinux_ss_free(ns->ss);
+		selinux_avc_free(ns->avc);
+		kfree(ns);
+		ns = parent;
+	} while (ns && refcount_dec_and_test(&ns->count));
+}
+
+void __put_selinux_ns(struct selinux_ns *ns)
+{
+	schedule_work(&ns->work);
+}
+
+static struct selinux_ns *init_selinux_ns;
+
 static __init int selinux_init(void)
 {
 	pr_info("SELinux:  Initializing.\n");
 
-	enforcing_set(&init_selinux_ns, selinux_enforcing_boot);
-	init_selinux_ns.checkreqprot = selinux_checkreqprot_boot;
-	selinux_ss_init(&init_selinux_ns.ss);
-	selinux_avc_init(&init_selinux_ns.avc);
-	current_selinux_ns = &init_selinux_ns;
+	if (selinux_ns_create(NULL, &init_selinux_ns))
+		panic("SELinux: Could not create initial namespace\n");
+
+	enforcing_set(init_selinux_ns, selinux_enforcing_boot);
+	init_selinux_ns->checkreqprot = selinux_checkreqprot_boot;
+	current_selinux_ns = init_selinux_ns;
 
 	/* Set the security state for the initial task. */
 	cred_init_security();
@@ -7226,7 +7280,7 @@ int selinux_disable(struct selinux_ns *ns)
 	 * within the ns will interpret the absence of a selinuxfs mount
 	 * as SELinux being disabled.
 	 */
-	if (ns != &init_selinux_ns)
+	if (ns != init_selinux_ns)
 		return 0;
 
 	pr_info("SELinux:  Disabled at runtime.\n");
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index e22e281992d8..971fd5f53b6e 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -99,6 +99,8 @@ struct selinux_avc;
 struct selinux_ss;
 
 struct selinux_ns {
+	refcount_t count;
+	struct work_struct work;
 	bool disabled;
 #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
 	bool enforcing;
@@ -108,10 +110,29 @@ struct selinux_ns {
 	bool policycap[__POLICYDB_CAPABILITY_MAX];
 	struct selinux_avc *avc;
 	struct selinux_ss *ss;
+	struct selinux_ns *parent;
 };
 
-void selinux_ss_init(struct selinux_ss **ss);
-void selinux_avc_init(struct selinux_avc **avc);
+int selinux_ns_create(struct selinux_ns *parent, struct selinux_ns **ns);
+void __put_selinux_ns(struct selinux_ns *ns);
+
+int selinux_ss_create(struct selinux_ss **ss);
+void selinux_ss_free(struct selinux_ss *ss);
+
+int selinux_avc_create(struct selinux_avc **avc);
+void selinux_avc_free(struct selinux_avc *avc);
+
+static inline void put_selinux_ns(struct selinux_ns *ns)
+{
+	if (ns && refcount_dec_and_test(&ns->count))
+		__put_selinux_ns(ns);
+}
+
+static inline struct selinux_ns *get_selinux_ns(struct selinux_ns *ns)
+{
+	refcount_inc(&ns->count);
+	return ns;
+}
 
 extern struct selinux_ns *current_selinux_ns;
 
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index a69381f94d37..41270a783cf5 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -90,7 +90,7 @@ static int selinux_fs_info_create(struct super_block *sb)
 
 	mutex_init(&fsi->mutex);
 	fsi->last_ino = SEL_INO_NEXT - 1;
-	fsi->ns = current_selinux_ns;
+	fsi->ns = get_selinux_ns(current_selinux_ns);
 	fsi->sb = sb;
 	sb->s_fs_info = fsi;
 	return 0;
@@ -102,6 +102,7 @@ static void selinux_fs_info_free(struct super_block *sb)
 	int i;
 
 	if (fsi) {
+		put_selinux_ns(fsi->ns);
 		for (i = 0; i < fsi->bool_num; i++)
 			kfree(fsi->bool_pending_names[i]);
 		kfree(fsi->bool_pending_names);
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index defc70ac9fd1..57d6b8dddc54 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -76,13 +76,28 @@ const char *selinux_policycap_names[__POLICYDB_CAPABILITY_MAX] = {
 	"nnp_nosuid_transition"
 };
 
-static struct selinux_ss selinux_ss;
+int selinux_ss_create(struct selinux_ss **ss)
+{
+	struct selinux_ss *newss;
+
+	newss = kzalloc(sizeof(*newss), GFP_KERNEL);
+	if (!newss)
+		return -ENOMEM;
+	rwlock_init(&newss->policy_rwlock);
+	mutex_init(&newss->status_lock);
+	*ss = newss;
+	return 0;
+}
 
-void selinux_ss_init(struct selinux_ss **ss)
+void selinux_ss_free(struct selinux_ss *ss)
 {
-	rwlock_init(&selinux_ss.policy_rwlock);
-	mutex_init(&selinux_ss.status_lock);
-	*ss = &selinux_ss;
+	if (ss->sidtab)
+		sidtab_destroy(ss->sidtab);
+	policydb_destroy(&ss->policydb);
+	kfree(ss->map.mapping);
+	if (ss->status_page)
+		__free_page(ss->status_page);
+	kfree(ss);
 }
 
 /* Forward declaration. */
-- 
2.21.0



* [RFC PATCH 04/10] selinuxns: mark init_selinux_ns as __ro_after_init
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 02/10] selinux: support multiple selinuxfs instances Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 03/10] selinux: dynamically allocate selinux namespace Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 05/10] selinux: Annotate lockdep for services locks Stephen Smalley
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, James Morris, Stephen Smalley

From: James Morris <james.l.morris@oracle.com>

This is a patch against the SELinux namespace work.

Mark the initial SELinux namespace pointer as __ro_after_init, to harden
against malicious overwrite by an attacker.

Signed-off-by: James Morris <james.l.morris@oracle.com>
[sds@tycho.nsa.gov: ported to v5.4-rc1]
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 7a4ed553cec0..dc0b143ffa55 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -7096,7 +7096,7 @@ void __put_selinux_ns(struct selinux_ns *ns)
 	schedule_work(&ns->work);
 }
 
-static struct selinux_ns *init_selinux_ns;
+static struct selinux_ns *init_selinux_ns  __ro_after_init;
 
 static __init int selinux_init(void)
 {
-- 
2.21.0



* [RFC PATCH 05/10] selinux: Annotate lockdep for services locks
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (2 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 04/10] selinuxns: mark init_selinux_ns as __ro_after_init Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 06/10] netns,selinux: create the selinux netlink socket per network namespace Stephen Smalley
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Peter Enderborg, Stephen Smalley

From: Peter Enderborg <peter.enderborg@sony.com>

Now that the locks are dynamically allocated, we need to
help lockdep classify them.
This adds lockdep annotations for the status page mutex and
for the ss policy rwlock.

Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
[sds@tycho.nsa.gov: ported to v5.4-rc1]
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/ss/services.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 57d6b8dddc54..9404a4494c7c 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -66,6 +66,9 @@
 #include "ebitmap.h"
 #include "audit.h"
 
+static struct lock_class_key selinux_ss_class_key;
+static struct lock_class_key selinux_status_class_key;
+
 /* Policy capability names */
 const char *selinux_policycap_names[__POLICYDB_CAPABILITY_MAX] = {
 	"network_peer_controls",
@@ -84,7 +87,9 @@ int selinux_ss_create(struct selinux_ss **ss)
 	if (!newss)
 		return -ENOMEM;
 	rwlock_init(&newss->policy_rwlock);
+	lockdep_set_class(&newss->policy_rwlock, &selinux_ss_class_key);
 	mutex_init(&newss->status_lock);
+	lockdep_set_class(&newss->status_lock, &selinux_status_class_key);
 	*ss = newss;
 	return 0;
 }
-- 
2.21.0



* [RFC PATCH 06/10] netns,selinux: create the selinux netlink socket per network namespace
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (3 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 05/10] selinux: Annotate lockdep for services locks Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 07/10] selinux: support per-task/cred selinux namespace Stephen Smalley
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Stephen Smalley

The selinux netlink socket is used to notify userspace of changes to
the enforcing mode and policy reloads.  At present, these notifications
are always sent to the initial network namespace.  In order to support
multiple selinux namespaces, each with its own enforcing mode and
policy, we need to create and use a separate selinux netlink socket
for each network namespace.

Without this change, a policy reload in a child selinux namespace
causes a notification to be sent to processes in the init namespace
with a sequence number that may be higher than the policy sequence
number for that namespace.  As a result, userspace AVC instances in
the init namespace will then end up rejecting any further access
vector results from their own security server instance due to the
policy sequence number appearing to regress, which in turn causes
all subsequent uncached access checks to fail.  Similarly,
without this change, changing enforcing mode in the child selinux
namespace triggers a notification to all userspace AVC instances
in the init namespace that will switch their enforcing modes.
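
The sequence number failure mode can be sketched with a simplified
model of a userspace AVC.  This is not libselinux code; struct uavc,
uavc_policyload() and uavc_accept() are hypothetical names, and the
only assumption is that a decision older than the latest policyload
notification is treated as stale.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a userspace AVC. */
struct uavc {
	uint32_t latest_notif;	/* highest policyload seqno seen via netlink */
};

/* Handle a policyload notification carrying @seqno. */
void uavc_policyload(struct uavc *avc, uint32_t seqno)
{
	if (seqno > avc->latest_notif)
		avc->latest_notif = seqno;
}

/*
 * Decide whether a decision computed under policy @seqno may be used.
 * A decision older than the latest notification looks stale, so it is
 * rejected with -EAGAIN.
 */
int uavc_accept(const struct uavc *avc, uint32_t seqno)
{
	assert(avc != NULL);
	if (seqno < avc->latest_notif)
		return -EAGAIN;
	return 0;
}
```

If the init namespace is at policy seqno 2 and a notification with
seqno 7 leaks in from a child namespace, every later decision from
the init namespace's own security server (still seqno 2) is rejected.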

This change does alter SELinux behavior, since previously reloading
policy or changing enforcing mode in a non-init network namespace would
trigger a notification to processes in the init network namespace.
However, this behavior is not being relied upon by existing userspace
AFAICT and is arguably wrong regardless.

This change presumes that one will always unshare the network namespace
when unsharing a new selinux namespace (the reverse is not required).
Otherwise, the same inconsistencies could arise between the notifications
and the relevant policy.  At present, nothing enforces this guarantee
at the kernel level; it is left up to userspace (e.g. container runtimes).
It is an open question as to whether this is a good idea or whether
unsharing of the selinux namespace should automatically unshare the network
namespace.  However, keeping them separate is consistent with the handling
of the mount namespace currently, which also should be unshared so that
a private selinuxfs mount can be created.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 include/net/net_namespace.h |  3 +++
 security/selinux/netlink.c  | 31 +++++++++++++++++++++++++------
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index f8712bbeb2e0..df0737725454 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -172,6 +172,9 @@ struct net {
 #endif
 	struct sock		*diag_nlsk;
 	atomic_t		fnhe_genid;
+#if IS_ENABLED(CONFIG_SECURITY_SELINUX)
+	struct sock		*selnl;
+#endif
 } __randomize_layout;
 
 #include <linux/seq_file_net.h>
diff --git a/security/selinux/netlink.c b/security/selinux/netlink.c
index 621e2e9cd6a1..03678a76f4bb 100644
--- a/security/selinux/netlink.c
+++ b/security/selinux/netlink.c
@@ -19,8 +19,6 @@
 
 #include "security.h"
 
-static struct sock *selnl;
-
 static int selnl_msglen(int msgtype)
 {
 	int ret = 0;
@@ -66,6 +64,7 @@ static void selnl_add_payload(struct nlmsghdr *nlh, int len, int msgtype, void *
 
 static void selnl_notify(int msgtype, void *data)
 {
+	struct sock *selnl = current->nsproxy->net_ns->selnl;
 	int len;
 	sk_buff_data_t tmp;
 	struct sk_buff *skb;
@@ -105,16 +104,36 @@ void selnl_notify_policyload(u32 seqno)
 	selnl_notify(SELNL_MSG_POLICYLOAD, &seqno);
 }
 
-static int __init selnl_init(void)
+static int __net_init selnl_net_init(struct net *net)
 {
+	struct sock *sk;
 	struct netlink_kernel_cfg cfg = {
 		.groups	= SELNLGRP_MAX,
 		.flags	= NL_CFG_F_NONROOT_RECV,
 	};
 
-	selnl = netlink_kernel_create(&init_net, NETLINK_SELINUX, &cfg);
-	if (selnl == NULL)
-		panic("SELinux:  Cannot create netlink socket.");
+	sk = netlink_kernel_create(net, NETLINK_SELINUX, &cfg);
+	if (!sk)
+		return -ENOMEM;
+	net->selnl = sk;
+	return 0;
+}
+
+static void __net_exit selnl_net_exit(struct net *net)
+{
+	netlink_kernel_release(net->selnl);
+	net->selnl = NULL;
+}
+
+static struct pernet_operations selnl_net_ops = {
+	.init = selnl_net_init,
+	.exit = selnl_net_exit,
+};
+
+static int __init selnl_init(void)
+{
+	if (register_pernet_subsys(&selnl_net_ops))
+		panic("Could not register selinux netlink operations\n");
 	return 0;
 }
 
-- 
2.21.0



* [RFC PATCH 07/10] selinux: support per-task/cred selinux namespace
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (4 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 06/10] netns,selinux: create the selinux netlink socket per network namespace Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 08/10] selinux: introduce cred_selinux_ns() and use it Stephen Smalley
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC
  To: selinux; +Cc: paul, Stephen Smalley

Extend the task security structure to include a reference to
the associated selinux namespace, and to also contain a
pointer to the cred in the parent namespace.  current_selinux_ns is
changed from a global pointer to the selinux namespace of the
current task's cred.

This change makes it possible to support per-cred selinux namespaces,
but does not yet introduce a mechanism for unsharing of the selinux
namespace.  Thus, by itself, this change does not alter the existing
situation with respect to all processes still using a single init
selinux namespace.

An alternative would be to hang the selinux namespace off of the
user namespace, which itself is associated with the cred.  This
seems undesirable however since DAC and MAC are orthogonal, and
there appear to be real use cases where one will want to use selinux
namespaces without user namespaces and vice versa. However, one
advantage of hanging off the user namespace would be that it is already
associated with other namespaces, such as the network namespace, thus
potentially facilitating looking up the relevant selinux namespace from
the network input/forward hooks.  In most cases however, it appears that
we could instead copy a reference to the creating task selinux namespace
to sock security structures and use that in those hooks.

Introduce a task_security() helper to obtain the correct task/cred
security structure from the hooks, and update the hooks to use it.
This returns a pointer to the security structure for the task in
the same selinux namespace as the caller, or if there is none, a
fake security structure with the well-defined unlabeled SIDs.  This
ensures that we return a valid result that can be used for permission
checks and for returning contexts from e.g. reading /proc/pid/attr files.
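
The lookup performed by task_security() can be modeled in userspace
as follows.  This is a sketch only: struct tsec, tsec_in_ns() and
SID_UNLABELED are hypothetical stand-ins, the viewer namespace is
passed explicitly instead of being taken from the current task, and
the SID values are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the lookup; values are illustrative. */
#define SID_UNLABELED 3u	/* stand-in for SECINITSID_UNLABELED */

struct ns { int id; };

struct tsec {
	unsigned int sid;
	const struct ns *ns;		/* namespace this label belongs to */
	const struct tsec *parent;	/* same task as seen by the parent ns */
};

static const struct tsec unlabeled = { SID_UNLABELED, NULL, NULL };

/*
 * Find the label of @t as seen from namespace @viewer: walk towards the
 * ancestors until a label in @viewer is found, else report unlabeled.
 */
const struct tsec *tsec_in_ns(const struct tsec *t, const struct ns *viewer)
{
	assert(t != NULL && viewer != NULL);
	while (t->ns != viewer && t->parent)
		t = t->parent;
	return t->ns == viewer ? t : &unlabeled;
}
```

A task in a child namespace is visible to its ancestors through the
parent chain, while a viewer in an unrelated or descendant namespace
only sees the unlabeled fallback.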

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c            | 51 +++++++++++++++++++++++++----
 security/selinux/include/objsec.h   | 23 -------------
 security/selinux/include/security.h | 32 +++++++++++++++++-
 3 files changed, 75 insertions(+), 31 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index dc0b143ffa55..28cc75e5361b 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -103,8 +103,6 @@
 #include "audit.h"
 #include "avc_ss.h"
 
-struct selinux_ns *current_selinux_ns;
-
 /* SECMARK reference count */
 static atomic_t selinux_secmark_refcount = ATOMIC_INIT(0);
 
@@ -202,6 +200,8 @@ static int selinux_lsm_notifier_avc_callback(u32 event)
 	return 0;
 }
 
+static struct selinux_ns *init_selinux_ns  __ro_after_init;
+
 /*
  * initialise the security for the init task
  */
@@ -212,6 +212,7 @@ static void cred_init_security(void)
 
 	tsec = selinux_cred(cred);
 	tsec->osid = tsec->sid = SECINITSID_KERNEL;
+	tsec->ns = get_selinux_ns(init_selinux_ns);
 }
 
 /*
@@ -225,15 +226,35 @@ static inline u32 cred_sid(const struct cred *cred)
 	return tsec->sid;
 }
 
+static struct task_security_struct unlabeled_task_security = {
+	.osid = SECINITSID_UNLABELED,
+	.sid = SECINITSID_UNLABELED,
+};
+
+static const struct task_security_struct *task_security(
+	const struct task_struct *p)
+{
+	const struct task_security_struct *tsec;
+
+	tsec = selinux_cred(__task_cred(p));
+	while (tsec->ns != current_selinux_ns && tsec->parent_cred)
+		tsec = selinux_cred(tsec->parent_cred);
+	if (tsec->ns != current_selinux_ns)
+		return &unlabeled_task_security;
+	return tsec;
+}
+
 /*
  * get the objective security ID of a task
  */
 static inline u32 task_sid(const struct task_struct *task)
 {
+	const struct task_security_struct *tsec;
 	u32 sid;
 
 	rcu_read_lock();
-	sid = cred_sid(__task_cred(task));
+	tsec = task_security(task);
+	sid = tsec->sid;
 	rcu_read_unlock();
 	return sid;
 }
@@ -3889,6 +3910,18 @@ static int selinux_task_alloc(struct task_struct *task,
 			    sid, sid, SECCLASS_PROCESS, PROCESS__FORK, NULL);
 }
 
+/*
+ * free/release any cred memory other than the blob itself
+ */
+static void selinux_cred_free(struct cred *cred)
+{
+	struct task_security_struct *tsec = selinux_cred(cred);
+
+	put_selinux_ns(tsec->ns);
+	if (tsec->parent_cred)
+		put_cred(tsec->parent_cred);
+}
+
 /*
  * prepare a new set of credentials for modification
  */
@@ -3899,6 +3932,9 @@ static int selinux_cred_prepare(struct cred *new, const struct cred *old,
 	struct task_security_struct *tsec = selinux_cred(new);
 
 	*tsec = *old_tsec;
+	tsec->ns = get_selinux_ns(old_tsec->ns);
+	if (old_tsec->parent_cred)
+		tsec->parent_cred = get_cred(old_tsec->parent_cred);
 	return 0;
 }
 
@@ -3911,6 +3947,9 @@ static void selinux_cred_transfer(struct cred *new, const struct cred *old)
 	struct task_security_struct *tsec = selinux_cred(new);
 
 	*tsec = *old_tsec;
+	tsec->ns = get_selinux_ns(old_tsec->ns);
+	if (old_tsec->parent_cred)
+		tsec->parent_cred = get_cred(old_tsec->parent_cred);
 }
 
 static void selinux_cred_getsecid(const struct cred *c, u32 *secid)
@@ -6280,7 +6319,7 @@ static int selinux_getprocattr(struct task_struct *p,
 	unsigned len;
 
 	rcu_read_lock();
-	__tsec = selinux_cred(__task_cred(p));
+	__tsec = task_security(p);
 
 	if (current != p) {
 		error = avc_has_perm(current_selinux_ns,
@@ -6895,6 +6934,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(file_open, selinux_file_open),
 
 	LSM_HOOK_INIT(task_alloc, selinux_task_alloc),
+	LSM_HOOK_INIT(cred_free, selinux_cred_free),
 	LSM_HOOK_INIT(cred_prepare, selinux_cred_prepare),
 	LSM_HOOK_INIT(cred_transfer, selinux_cred_transfer),
 	LSM_HOOK_INIT(cred_getsecid, selinux_cred_getsecid),
@@ -7096,8 +7136,6 @@ void __put_selinux_ns(struct selinux_ns *ns)
 	schedule_work(&ns->work);
 }
 
-static struct selinux_ns *init_selinux_ns  __ro_after_init;
-
 static __init int selinux_init(void)
 {
 	pr_info("SELinux:  Initializing.\n");
@@ -7107,7 +7145,6 @@ static __init int selinux_init(void)
 
 	enforcing_set(init_selinux_ns, selinux_enforcing_boot);
 	init_selinux_ns->checkreqprot = selinux_checkreqprot_boot;
-	current_selinux_ns = init_selinux_ns;
 
 	/* Set the security state for the initial task. */
 	cred_init_security();
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 586b7abd0aa7..23188a47474f 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -28,15 +28,6 @@
 #include "flask.h"
 #include "avc.h"
 
-struct task_security_struct {
-	u32 osid;		/* SID prior to last execve */
-	u32 sid;		/* current SID */
-	u32 exec_sid;		/* exec SID */
-	u32 create_sid;		/* fscreate SID */
-	u32 keycreate_sid;	/* keycreate SID */
-	u32 sockcreate_sid;	/* fscreate SID */
-};
-
 enum label_initialized {
 	LABEL_INVALID,		/* invalid or not initialized */
 	LABEL_INITIALIZED,	/* initialized */
@@ -145,10 +136,6 @@ struct bpf_security_struct {
 };
 
 extern struct lsm_blob_sizes selinux_blob_sizes;
-static inline struct task_security_struct *selinux_cred(const struct cred *cred)
-{
-	return cred->security + selinux_blob_sizes.lbs_cred;
-}
 
 static inline struct file_security_struct *selinux_file(const struct file *file)
 {
@@ -175,14 +162,4 @@ static inline struct ipc_security_struct *selinux_ipc(
 	return ipc->security + selinux_blob_sizes.lbs_ipc;
 }
 
-/*
- * get the subjective security ID of the current task
- */
-static inline u32 current_sid(void)
-{
-	const struct task_security_struct *tsec = selinux_cred(current_cred());
-
-	return tsec->sid;
-}
-
 #endif /* _SELINUX_OBJSEC_H_ */
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 971fd5f53b6e..380ef3ede216 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -15,6 +15,8 @@
 #include <linux/types.h>
 #include <linux/refcount.h>
 #include <linux/workqueue.h>
+#include <linux/cred.h>
+#include <linux/lsm_hooks.h>
 #include "flask.h"
 
 #define SECSID_NULL			0x00000000 /* unspecified SID */
@@ -134,7 +136,35 @@ static inline struct selinux_ns *get_selinux_ns(struct selinux_ns *ns)
 	return ns;
 }
 
-extern struct selinux_ns *current_selinux_ns;
+struct task_security_struct {
+	u32 osid;		/* SID prior to last execve */
+	u32 sid;		/* current SID */
+	u32 exec_sid;		/* exec SID */
+	u32 create_sid;		/* fscreate SID */
+	u32 keycreate_sid;	/* keycreate SID */
+	u32 sockcreate_sid;	/* sockcreate SID */
+	struct selinux_ns *ns;  /* selinux namespace */
+	const struct cred *parent_cred; /* cred in parent ns */
+};
+
+extern struct lsm_blob_sizes selinux_blob_sizes;
+
+static inline struct task_security_struct *selinux_cred(const struct cred *cred)
+{
+	return cred->security + selinux_blob_sizes.lbs_cred;
+}
+
+/*
+ * get the subjective security ID of the current task
+ */
+static inline u32 current_sid(void)
+{
+	const struct task_security_struct *tsec = selinux_cred(current_cred());
+
+	return tsec->sid;
+}
+
+#define current_selinux_ns (selinux_cred(current_cred())->ns)
 
 #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
 static inline bool enforcing_enabled(struct selinux_ns *ns)
-- 
2.21.0



* [RFC PATCH 08/10] selinux: introduce cred_selinux_ns() and use it
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (5 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 07/10] selinux: support per-task/cred selinux namespace Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC (permalink / raw)
  To: selinux; +Cc: paul, Stephen Smalley

When using the SID from a cred, we should pass the selinux
namespace associated with the cred on security server calls
rather than the current selinux namespace, since they could differ.
In some of these cases, the cred is always obtained from the current
task so there is no real change, but this is cleaner and hopefully
less fragile. In other cases, the cred could in fact differ.
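To illustrate the distinction, here is a minimal userspace sketch, not the
kernel code itself. The structures are simplified stand-ins, and
model_current_cred and ns_for_check() are hypothetical names introduced only
for this example; only the two macros mirror what the patch defines. It shows
that a check made on behalf of some other cred picks up that cred's namespace
rather than the caller's.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins; these are NOT the real kernel definitions. */
struct selinux_ns {
	int id;
};

struct task_security_struct {
	unsigned int sid;	/* current SID */
	struct selinux_ns *ns;	/* selinux namespace of this cred */
};

struct cred {
	struct task_security_struct security;
};

/* Hypothetical stand-in for the cred returned by current_cred(). */
static const struct cred *model_current_cred;

static const struct task_security_struct *selinux_cred(const struct cred *cred)
{
	return &cred->security;
}

/* Same shape as the macros in security.h after this patch. */
#define current_selinux_ns (selinux_cred(model_current_cred)->ns)
#define cred_selinux_ns(cred) (selinux_cred(cred)->ns)

/* Hypothetical helper: the namespace a check against `cred` should use. */
static struct selinux_ns *ns_for_check(const struct cred *cred)
{
	assert(cred != NULL);
	return cred_selinux_ns(cred);
}
```

When the cred passed in is the current task's cred, both macros yield the
same namespace, which is why many of the conversions in this patch are no-ops
in practice.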

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/hooks.c            | 40 ++++++++++++++---------------
 security/selinux/include/security.h |  2 ++
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 28cc75e5361b..227d5bec868e 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -484,13 +484,13 @@ static int may_context_mount_sb_relabel(u32 sid,
 	const struct task_security_struct *tsec = selinux_cred(cred);
 	int rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELTO, NULL);
 	return rc;
@@ -502,13 +502,13 @@ static int may_context_mount_inode_relabel(u32 sid,
 {
 	const struct task_security_struct *tsec = selinux_cred(cred);
 	int rc;
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  tsec->sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__RELABELFROM, NULL);
 	if (rc)
 		return rc;
 
-	rc = avc_has_perm(current_selinux_ns,
+	rc = avc_has_perm(cred_selinux_ns(cred),
 			  sid, sbsec->sid, SECCLASS_FILESYSTEM,
 			  FILESYSTEM__ASSOCIATE, NULL);
 	return rc;
@@ -1672,10 +1672,10 @@ static int cred_has_capability(const struct cred *cred,
 		return -EINVAL;
 	}
 
-	rc = avc_has_perm_noaudit(current_selinux_ns,
+	rc = avc_has_perm_noaudit(cred_selinux_ns(cred),
 				  sid, sid, sclass, av, 0, &avd);
 	if (!(opts & CAP_OPT_NOAUDIT)) {
-		int rc2 = avc_audit(current_selinux_ns,
+		int rc2 = avc_audit(cred_selinux_ns(cred),
 				    sid, sid, sclass, av, &avd, rc, &ad, 0);
 		if (rc2)
 			return rc2;
@@ -1702,7 +1702,7 @@ static int inode_has_perm(const struct cred *cred,
 	sid = cred_sid(cred);
 	isec = selinux_inode(inode);
 
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, isec->sid, isec->sclass, perms, adp);
 }
 
@@ -1776,7 +1776,7 @@ static int file_has_perm(const struct cred *cred,
 	ad.u.file = file;
 
 	if (sid != fsec->sid) {
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  sid, fsec->sid,
 				  SECCLASS_FD,
 				  FD__USE,
@@ -1990,7 +1990,7 @@ static int superblock_has_perm(const struct cred *cred,
 	u32 sid = cred_sid(cred);
 
 	sbsec = sb->s_security;
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
 }
 
@@ -2178,7 +2178,7 @@ static int selinux_capset(struct cred *new, const struct cred *old,
 			  const kernel_cap_t *inheritable,
 			  const kernel_cap_t *permitted)
 {
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(old),
 			    cred_sid(old), cred_sid(new), SECCLASS_PROCESS,
 			    PROCESS__SETCAP, NULL);
 }
@@ -3029,7 +3029,7 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct inode *inode,
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, isec->sid, isec->sclass, FILE__READ, &ad);
 }
 
@@ -3084,7 +3084,7 @@ static int selinux_inode_permission(struct inode *inode, int mask)
 	if (IS_ERR(isec))
 		return PTR_ERR(isec);
 
-	rc = avc_has_perm_noaudit(current_selinux_ns,
+	rc = avc_has_perm_noaudit(cred_selinux_ns(cred),
 				  sid, isec->sid, isec->sclass, perms,
 				  (flags & MAY_NOT_BLOCK) ? AVC_NONBLOCKING : 0,
 				  &avd);
@@ -3601,7 +3601,7 @@ static int ioctl_has_perm(const struct cred *cred, struct file *file,
 	ad.u.op->path = file->f_path;
 
 	if (ssid != fsec->sid) {
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  ssid, fsec->sid,
 				SECCLASS_FD,
 				FD__USE,
@@ -3684,7 +3684,7 @@ static int file_map_prot_check(struct file *file, unsigned long prot, int shared
 		 * private file mapping that will also be writable.
 		 * This has an additional check.
 		 */
-		rc = avc_has_perm(current_selinux_ns,
+		rc = avc_has_perm(cred_selinux_ns(cred),
 				  sid, sid, SECCLASS_PROCESS,
 				  PROCESS__EXECMEM, NULL);
 		if (rc)
@@ -3760,14 +3760,14 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 		int rc = 0;
 		if (vma->vm_start >= vma->vm_mm->start_brk &&
 		    vma->vm_end <= vma->vm_mm->brk) {
-			rc = avc_has_perm(current_selinux_ns,
+			rc = avc_has_perm(cred_selinux_ns(cred),
 					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECHEAP, NULL);
 		} else if (!vma->vm_file &&
 			   ((vma->vm_start <= vma->vm_mm->start_stack &&
 			     vma->vm_end >= vma->vm_mm->start_stack) ||
 			    vma_is_stack_for_current(vma))) {
-			rc = avc_has_perm(current_selinux_ns,
+			rc = avc_has_perm(cred_selinux_ns(cred),
 					  sid, sid, SECCLASS_PROCESS,
 					  PROCESS__EXECSTACK, NULL);
 		} else if (vma->vm_file && vma->anon_vma) {
@@ -3967,7 +3967,7 @@ static int selinux_kernel_act_as(struct cred *new, u32 secid)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(current_selinux_ns,
+	ret = avc_has_perm(tsec->ns,
 			   sid, secid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__USE_AS_OVERRIDE,
@@ -3992,7 +3992,7 @@ static int selinux_kernel_create_files_as(struct cred *new, struct inode *inode)
 	u32 sid = current_sid();
 	int ret;
 
-	ret = avc_has_perm(current_selinux_ns,
+	ret = avc_has_perm(tsec->ns,
 			   sid, isec->sid,
 			   SECCLASS_KERNEL_SERVICE,
 			   KERNEL_SERVICE__CREATE_FILES_AS,
@@ -4136,7 +4136,7 @@ static int selinux_task_prlimit(const struct cred *cred, const struct cred *tcre
 		av |= PROCESS__SETRLIMIT;
 	if (flags & LSM_PRLIMIT_READ)
 		av |= PROCESS__GETRLIMIT;
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    cred_sid(cred), cred_sid(tcred),
 			    SECCLASS_PROCESS, av, NULL);
 }
@@ -6612,7 +6612,7 @@ static int selinux_key_permission(key_ref_t key_ref,
 	key = key_ref_to_ptr(key_ref);
 	ksec = key->security;
 
-	return avc_has_perm(current_selinux_ns,
+	return avc_has_perm(cred_selinux_ns(cred),
 			    sid, ksec->sid, SECCLASS_KEY, perm, NULL);
 }
 
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index 380ef3ede216..802644ce1381 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -166,6 +166,8 @@ static inline u32 current_sid(void)
 
 #define current_selinux_ns (selinux_cred(current_cred())->ns)
 
+#define cred_selinux_ns(cred) (selinux_cred(cred)->ns)
+
 #ifdef CONFIG_SECURITY_SELINUX_DEVELOP
 static inline bool enforcing_enabled(struct selinux_ns *ns)
 {
-- 
2.21.0



* [RFC PATCH 09/10] selinux: add a selinuxfs interface to unshare selinux namespace
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (6 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 08/10] selinux: introduce cred_selinux_ns() and use it Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-15 13:25 ` [RFC PATCH 10/10] selinuxfs: restrict write operations to the same " Stephen Smalley
  2019-10-18  2:32 ` [RFC PATCH 00/10] SELinux namespace series, re-based Paul Moore
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC (permalink / raw)
  To: selinux; +Cc: paul, Stephen Smalley

DO NOT MERGE - experimental, unsafe code.  You have been warned.

Provide a userspace API to unshare the selinux namespace.
Currently implemented via a selinuxfs node. This could either be
coupled with unsharing of the other namespaces that will always be
needed (e.g. mount namespace, network namespace) or left independent.
Don't get hung up on the interface itself; it is just to allow
experimentation and testing.

Sample usage:
echo 1 > /sys/fs/selinux/unshare
unshare -m -n
umount /sys/fs/selinux
mount -t selinuxfs none /sys/fs/selinux
load_policy
getenforce
id
echo $$

The above will show that the process now views itself as running in the
kernel domain in permissive mode, as would be the case at boot.
From a different shell on the host system, running ps -eZ or
cat /proc/<pid>/attr/current will show that the process that
unshared its selinux namespace is still running in its original
context in the initial namespace, and getenforce will show that
the initial namespace remains enforcing.  Enforcing mode or policy
changes in the child will not affect the parent.
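For testing from a program rather than a shell, the first step of the sample
usage could be sketched roughly as below. This is only a sketch:
request_selinux_unshare() is a hypothetical helper name, the node
/sys/fs/selinux/unshare exists only on a kernel carrying this experimental
series, and the remaining steps (unsharing the mount namespace, remounting
selinuxfs, loading a policy) are not shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "1" to the given node in a single write() call.
 * Returns 0 on success or a negative errno value on failure.
 */
static int request_selinux_unshare(const char *path)
{
	ssize_t n;
	int err = 0;
	int fd;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* The kernel side rejects writes at a non-zero offset, so write once. */
	n = write(fd, "1", 1);
	if (n < 0)
		err = -errno;
	else if (n != 1)
		err = -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

A caller would pass "/sys/fs/selinux/unshare" and treat any non-zero return
as a failure, for example a permission denial or a kernel without this patch.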

This is not yet safe; do not use on production systems.
Known issues include at least the following items:

* The policy loading code has not been thoroughly audited
and hardened for use by unprivileged code, both with respect to
ensuring that the policy is internally consistent and restricting
the range of values used from the policy as loop bounds and memory
allocation sizes to sane limits.

* The SELinux hook functions have not been modified to be
namespace-aware, so the hooks only perform checking against the
current namespace.  Thus, unsharing allows the process to escape
confinement by the parent.  Fixing this requires updating each hook to
perform its processing on the current namespace and all of its ancestors
up to the init namespace.

* Some of the hook functions can be called outside of process context
(e.g. task_kill, send_sigiotask, network input/forward) and should not use
the current task's selinux namespace. These hooks need to be updated to
obtain the proper selinux namespace either from the caller or from a
reference cached in a suitable data structure (e.g. the file or sock
security structures).

* The support for per-namespace inode and superblock security blobs has
been dropped from this series pending a rewrite to address blob
lifecycle management by the security framework and a possible change in
approach.  Hence, they also now fall under the proviso below for other
objects.

* Object security blobs have not been updated to be namespace-aware and
support multiple namespaces.  Hence, the hooks could end up performing
permission checks or other operations on SIDs created in a different
selinux namespace, yielding denials on unlabeled contexts or completely
random contexts that happen to be mapped to that SID.

* The network SID caches (netif, netnode, netport) have not yet
been instantiated per selinux namespace, unlike the AVC and SS.

* There is no way currently to restrict or bound nesting of
namespaces; if you allow it to a domain in the init namespace,
then that domain can in turn unshare to arbitrary depths and can
grant the same to any domain in its own policy.  Related to this
is the fact that there is no way to control resource usage due to
selinux namespaces and they can be substantial (per-namespace
policydb, sidtab, AVC, etc).

* SIDs may be cached by audit and networking code and in external
kernel data structures and used later, potentially in a different
selinux namespace than the one in which the SID was originally created.

* No doubt other things I'm forgetting or haven't thought of.
Use at your own risk.
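To make the second item above concrete, the "current namespace and all of
its ancestors" check could look roughly like the following userspace model.
It is a sketch under loud assumptions: model_ns, model_ns_check() and
model_check_with_ancestors() are hypothetical names, the bitmask stands in
for a real policy decision, and the sketch ignores the fact that a task has a
different SID in each ancestor namespace (which is what parent_cred is meant
to help with).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a namespace: a parent link plus a toy policy. */
struct model_ns {
	const struct model_ns *parent;	/* NULL for the init namespace */
	unsigned int allowed;		/* bitmask of permissions granted here */
};

/* Toy stand-in for a per-namespace access check such as avc_has_perm(). */
static int model_ns_check(const struct model_ns *ns, unsigned int requested)
{
	assert(ns != NULL);
	return (ns->allowed & requested) == requested ? 0 : -EACCES;
}

/*
 * Check the request in `ns` and then in every ancestor up to the init
 * namespace. The first denial wins, so a child can never grant more than
 * its ancestors allow.
 */
static int model_check_with_ancestors(const struct model_ns *ns,
				      unsigned int requested)
{
	for (; ns; ns = ns->parent) {
		int rc = model_ns_check(ns, requested);

		if (rc)
			return rc;
	}
	return 0;
}
```

With only the child consulted, as the hooks do today, a permission granted by
the child's policy succeeds even if the parent denies it; walking the chain
closes that gap in the model.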

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/include/classmap.h |  3 +-
 security/selinux/selinuxfs.c        | 66 +++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 32e9b03be3dd..9e911e5931bf 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -42,7 +42,8 @@ struct security_class_mapping secclass_map[] = {
 	  { "compute_av", "compute_create", "compute_member",
 	    "check_context", "load_policy", "compute_relabel",
 	    "compute_user", "setenforce", "setbool", "setsecparam",
-	    "setcheckreqprot", "read_policy", "validate_trans", NULL } },
+	    "setcheckreqprot", "read_policy", "validate_trans", "unshare",
+	    NULL } },
 	{ "process",
 	  { "fork", "transition", "sigchld", "sigkill",
 	    "sigstop", "signull", "signal", "ptrace", "getsched", "setsched",
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 41270a783cf5..48afdc3a4aa4 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -62,6 +62,7 @@ enum sel_inos {
 	SEL_STATUS,	/* export current status using mmap() */
 	SEL_POLICY,	/* allow userspace to read the in kernel policy */
 	SEL_VALIDATE_TRANS, /* compute validatetrans decision */
+	SEL_UNSHARE,	    /* unshare selinux namespace */
 	SEL_INO_NEXT,	/* The next inode number to use */
 };
 
@@ -325,6 +326,70 @@ static const struct file_operations sel_disable_ops = {
 	.llseek		= generic_file_llseek,
 };
 
+static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
+				 size_t count, loff_t *ppos)
+
+{
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
+	struct selinux_ns *ns = fsi->ns;
+	char *page;
+	ssize_t length;
+	bool set;
+	int rc;
+
+	if (count >= PAGE_SIZE)
+		return -ENOMEM;
+
+	/* No partial writes. */
+	if (*ppos != 0)
+		return -EINVAL;
+
+	rc = avc_has_perm(current_selinux_ns, current_sid(),
+			  SECINITSID_SECURITY, SECCLASS_SECURITY,
+			  SECURITY__UNSHARE, NULL);
+	if (rc)
+		return rc;
+
+	page = memdup_user_nul(buf, count);
+	if (IS_ERR(page))
+		return PTR_ERR(page);
+
+	length = -EINVAL;
+	if (kstrtobool(page, &set))
+		goto out;
+
+	if (set) {
+		struct cred *cred = prepare_creds();
+		struct task_security_struct *tsec;
+
+		if (!cred) {
+			length = -ENOMEM;
+			goto out;
+		}
+		tsec = selinux_cred(cred);
+		if (selinux_ns_create(ns, &tsec->ns)) {
+			abort_creds(cred);
+			length = -ENOMEM;
+			goto out;
+		}
+		tsec->osid = tsec->sid = SECINITSID_KERNEL;
+		tsec->exec_sid = tsec->create_sid = tsec->keycreate_sid =
+			tsec->sockcreate_sid = SECSID_NULL;
+		tsec->parent_cred = get_current_cred();
+		commit_creds(cred);
+	}
+
+	length = count;
+out:
+	kfree(page);
+	return length;
+}
+
+static const struct file_operations sel_unshare_ops = {
+	.write		= sel_write_unshare,
+	.llseek		= generic_file_llseek,
+};
+
 static ssize_t sel_read_policyvers(struct file *filp, char __user *buf,
 				   size_t count, loff_t *ppos)
 {
@@ -1917,6 +1982,7 @@ static int sel_fill_super(struct super_block *sb, struct fs_context *fc)
 		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUGO},
 		[SEL_VALIDATE_TRANS] = {"validatetrans", &sel_transition_ops,
 					S_IWUGO},
+		[SEL_UNSHARE] = {"unshare", &sel_unshare_ops, 0200},
 		/* last one */ {""}
 	};
 
-- 
2.21.0



* [RFC PATCH 10/10] selinuxfs: restrict write operations to the same selinux namespace
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (7 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 09/10] selinux: add a selinuxfs interface to unshare selinux namespace Stephen Smalley
@ 2019-10-15 13:25 ` Stephen Smalley
  2019-10-18  2:32 ` [RFC PATCH 00/10] SELinux namespace series, re-based Paul Moore
  9 siblings, 0 replies; 11+ messages in thread
From: Stephen Smalley @ 2019-10-15 13:25 UTC (permalink / raw)
  To: selinux; +Cc: paul, Stephen Smalley

This ensures that once a process unshares its selinux namespace,
it can no longer act on the parent namespace's selinuxfs instance,
irrespective of policy.  This is a safety measure so that even if
an otherwise unconfined process unshares its selinux namespace, it
won't be able to subsequently affect the enforcing mode or policy of the
parent.  This also helps avoid common mistakes like failing to create
a mount namespace and mount a new selinuxfs instance in order to act
on one's own selinux namespace after unsharing.
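The guard added to each write handler amounts to a pointer comparison between
the namespace that owns the selinuxfs instance and the writer's namespace. A
minimal userspace model, with guard_ns, guard_fs_info and model_write_guard()
as hypothetical names standing in for the kernel structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct selinux_ns. */
struct guard_ns {
	int id;
};

/* Hypothetical stand-in for the per-superblock selinuxfs info. */
struct guard_fs_info {
	const struct guard_ns *ns;	/* namespace this selinuxfs mount belongs to */
};

/*
 * Returns 0 if the writer's namespace owns this selinuxfs instance and
 * -EPERM otherwise, before any policy-based permission check is made.
 */
static int model_write_guard(const struct guard_fs_info *fsi,
			     const struct guard_ns *writer_ns)
{
	assert(fsi != NULL);
	if (fsi->ns != writer_ns)
		return -EPERM;
	return 0;
}
```

Because the comparison happens before the avc_has_perm() call, a process that
has unshared gets -EPERM on the parent's selinuxfs even when its policy would
otherwise allow the write.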

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
 security/selinux/selinuxfs.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 48afdc3a4aa4..1ba4d874fc86 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -143,6 +143,9 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 	ssize_t length;
 	int old_value, new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -284,6 +287,9 @@ static ssize_t sel_write_disable(struct file *file, const char __user *buf,
 	int new_value;
 	int enforcing;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -337,6 +343,9 @@ static ssize_t sel_write_unshare(struct file *file, const char __user *buf,
 	bool set;
 	int rc;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -601,6 +610,9 @@ static ssize_t sel_write_load(struct file *file, const char __user *buf,
 	ssize_t length;
 	void *data = NULL;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	mutex_lock(&fsi->mutex);
 
 	length = avc_has_perm(current_selinux_ns,
@@ -706,6 +718,9 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	ssize_t length;
 	unsigned int new_value;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	length = avc_has_perm(current_selinux_ns,
 			      current_sid(), SECINITSID_SECURITY,
 			      SECCLASS_SECURITY, SECURITY__SETCHECKREQPROT,
@@ -752,6 +767,9 @@ static ssize_t sel_write_validatetrans(struct file *file,
 	u16 tclass;
 	int rc;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	rc = avc_has_perm(current_selinux_ns,
 			  current_sid(), SECINITSID_SECURITY,
 			  SECCLASS_SECURITY, SECURITY__VALIDATE_TRANS, NULL);
@@ -839,10 +857,14 @@ static ssize_t (*const write_op[])(struct file *, char *, size_t) = {
 
 static ssize_t selinux_transaction_write(struct file *file, const char __user *buf, size_t size, loff_t *pos)
 {
+	struct selinux_fs_info *fsi = file_inode(file)->i_sb->s_fs_info;
 	ino_t ino = file_inode(file)->i_ino;
 	char *data;
 	ssize_t rv;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	if (ino >= ARRAY_SIZE(write_op) || !write_op[ino])
 		return -EINVAL;
 
@@ -1278,6 +1300,9 @@ static ssize_t sel_write_bool(struct file *filep, const char __user *buf,
 	unsigned index = file_inode(filep)->i_ino & SEL_INO_MASK;
 	const char *name = filep->f_path.dentry->d_name.name;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -1334,6 +1359,9 @@ static ssize_t sel_commit_bools_write(struct file *filep,
 	ssize_t length;
 	int new_value;
 
+	if (fsi->ns != current_selinux_ns)
+		return -EPERM;
+
 	if (count >= PAGE_SIZE)
 		return -ENOMEM;
 
@@ -1498,6 +1526,9 @@ static ssize_t sel_write_avc_cache_threshold(struct file *file,
 	ssize_t ret;
 	unsigned int new_value;
 
+	if (ns != current_selinux_ns)
+		return -EPERM;
+
 	ret = avc_has_perm(current_selinux_ns,
 			   current_sid(), SECINITSID_SECURITY,
 			   SECCLASS_SECURITY, SECURITY__SETSECPARAM,
-- 
2.21.0



* Re: [RFC PATCH 00/10] SELinux namespace series, re-based
  2019-10-15 13:25 [RFC PATCH 00/10] SELinux namespace series, re-based Stephen Smalley
                   ` (8 preceding siblings ...)
  2019-10-15 13:25 ` [RFC PATCH 10/10] selinuxfs: restrict write operations to the same " Stephen Smalley
@ 2019-10-18  2:32 ` Paul Moore
  9 siblings, 0 replies; 11+ messages in thread
From: Paul Moore @ 2019-10-18  2:32 UTC (permalink / raw)
  To: selinux; +Cc: Stephen Smalley

On Tue, Oct 15, 2019 at 9:25 AM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> After a long hiatus, I have re-based the SELinux namespace series
> on top of selinux/next based on v5.4-rc1 ...

Thanks Stephen.

As mentioned previously at LSS-NA, in an effort to get this moving
again I've pulled this into the 'working-selinuxns' branch of the main
SELinux kernel repository and I'll keep that branch actively rebased
against current kernel releases* until we get to a point where we can
merge it all into selinux/next.  If you are reading this email and are
interested in helping with the SELinux namespacing work, patches are
welcome (and encouraged!) :)

I would ask that those of you who are interested in participating send
your patches with a "selinuxns" tag in the subject prefix of the patch
posting so we all have the correct context in which to review your work,
e.g. "[PATCH selinuxns] selinux: make it work".

* My initial thought is to rebase the selinux/working-selinuxns branch at
the end of each merge window against the vX.Y-rc1 tag, but if we need to
do it more often we can.

-- 
paul moore
www.paul-moore.com


