selinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Introduce security_create_user_ns()
@ 2022-07-21 17:28 Frederick Lawler
  2022-07-21 17:28 ` [PATCH v3 1/4] security, lsm: " Frederick Lawler
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-07-21 17:28 UTC (permalink / raw)
  To: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest
  Cc: linux-kernel, netdev, kernel-team, cgzones, karl, Frederick Lawler

While creating a LSM BPF MAC policy to block user namespace creation, we
used the LSM cred_prepare hook because that is the closest hook to prevent
a call to create_user_ns().

The calls look something like this:

    cred = prepare_creds()
        security_prepare_creds()
            call_int_hook(cred_prepare, ...
    if (cred)
        create_user_ns(cred)

We noticed that error codes were not propagated from this hook and
introduced a patch [1] to propagate those errors.

The discussion notes that security_prepare_creds()
is not appropriate for MAC policies, and instead the hook is
meant for LSM authors to prepare credentials for mutation. [2]

Ultimately, we concluded that a better course of action is to introduce
a new security hook for LSM authors. [3]

This patch set first introduces a new security_create_user_ns() function
and userns_create LSM hook, then marks the hook as sleepable in BPF.

Links:
1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/

Past discussions:
V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/

Changes since v2:
- Rename create_user_ns hook to userns_create
- Use user_namespace as an object opposed to a generic namespace object
- s/domB_t/domA_t in commit message
Changes since v1:
- Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
- Add selinux: Implement create_user_ns hook patch
- Change function signature of security_create_user_ns() to only take
  struct cred
- Move security_create_user_ns() call after id mapping check in
  create_user_ns()
- Update documentation to reflect changes

Frederick Lawler (4):
  security, lsm: Introduce security_create_user_ns()
  bpf-lsm: Make bpf_lsm_userns_create() sleepable
  selftests/bpf: Add tests verifying bpf lsm userns_create hook
  selinux: Implement userns_create hook

 include/linux/lsm_hook_defs.h                 |  1 +
 include/linux/lsm_hooks.h                     |  4 +
 include/linux/security.h                      |  6 ++
 kernel/bpf/bpf_lsm.c                          |  1 +
 kernel/user_namespace.c                       |  5 ++
 security/security.c                           |  5 ++
 security/selinux/hooks.c                      |  9 ++
 security/selinux/include/classmap.h           |  2 +
 .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
 .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
 10 files changed, 160 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c

--
2.30.2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/4] security, lsm: Introduce security_create_user_ns()
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
@ 2022-07-21 17:28 ` Frederick Lawler
  2022-07-22  8:21   ` Christian Brauner
  2022-07-21 17:28 ` [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable Frederick Lawler
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Frederick Lawler @ 2022-07-21 17:28 UTC (permalink / raw)
  To: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest
  Cc: linux-kernel, netdev, kernel-team, cgzones, karl, Frederick Lawler

Preventing user namespace (privileged or otherwise) creation comes in a
few of forms in order of granularity:

        1. /proc/sys/user/max_user_namespaces sysctl
        2. OS specific patch(es)
        3. CONFIG_USER_NS

To block a task based on its attributes, the LSM hook cred_prepare is a
good candidate for use because it provides more granular control, and
it is called before create_user_ns():

        cred = prepare_creds()
                security_prepare_creds()
                        call_int_hook(cred_prepare, ...
        if (cred)
                create_user_ns(cred)

Since security_prepare_creds() is meant for LSMs to copy and prepare
credentials, access control is an unintended use of the hook. Therefore
introduce a new function security_create_user_ns() with an accompanying
userns_create LSM hook.

This hook takes the prepared creds for LSM authors to write policy
against. On success, the new namespace is applied to credentials,
otherwise an error is returned.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>

---
Changes since v2:
- Rename create_user_ns hook to userns_create
Changes since v1:
- Changed commit wording
- Moved execution to be after id mapping check
- Changed signature to only accept a const struct cred *
---
 include/linux/lsm_hook_defs.h | 1 +
 include/linux/lsm_hooks.h     | 4 ++++
 include/linux/security.h      | 6 ++++++
 kernel/user_namespace.c       | 5 +++++
 security/security.c           | 5 +++++
 5 files changed, 21 insertions(+)

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index eafa1d2489fd..7ff93cb8ca8d 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -223,6 +223,7 @@ LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
 	 unsigned long arg3, unsigned long arg4, unsigned long arg5)
 LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
 	 struct inode *inode)
+LSM_HOOK(int, 0, userns_create, const struct cred *cred)
 LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
 LSM_HOOK(void, LSM_RET_VOID, ipc_getsecid, struct kern_ipc_perm *ipcp,
 	 u32 *secid)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 91c8146649f5..54fe534d0e01 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -799,6 +799,10 @@
  *	security attributes, e.g. for /proc/pid inodes.
  *	@p contains the task_struct for the task.
  *	@inode contains the inode structure for the inode.
+ * @userns_create:
+ *	Check permission prior to creating a new user namespace.
+ *	@cred points to prepared creds.
+ *	Return 0 if successful, otherwise < 0 error code.
  *
  * Security hooks for Netlink messaging.
  *
diff --git a/include/linux/security.h b/include/linux/security.h
index 7fc4e9f49f54..a195bf33246a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -435,6 +435,7 @@ int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
 int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			unsigned long arg4, unsigned long arg5);
 void security_task_to_inode(struct task_struct *p, struct inode *inode);
+int security_create_user_ns(const struct cred *cred);
 int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
 void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid);
 int security_msg_msg_alloc(struct msg_msg *msg);
@@ -1185,6 +1186,11 @@ static inline int security_task_prctl(int option, unsigned long arg2,
 static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
 { }
 
+static inline int security_create_user_ns(const struct cred *cred)
+{
+	return 0;
+}
+
 static inline int security_ipc_permission(struct kern_ipc_perm *ipcp,
 					  short flag)
 {
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 5481ba44a8d6..3f464bbda0e9 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -9,6 +9,7 @@
 #include <linux/highuid.h>
 #include <linux/cred.h>
 #include <linux/securebits.h>
+#include <linux/security.h>
 #include <linux/keyctl.h>
 #include <linux/key-type.h>
 #include <keys/user-type.h>
@@ -113,6 +114,10 @@ int create_user_ns(struct cred *new)
 	    !kgid_has_mapping(parent_ns, group))
 		goto fail_dec;
 
+	ret = security_create_user_ns(new);
+	if (ret < 0)
+		goto fail_dec;
+
 	ret = -ENOMEM;
 	ns = kmem_cache_zalloc(user_ns_cachep, GFP_KERNEL);
 	if (!ns)
diff --git a/security/security.c b/security/security.c
index 188b8f782220..ec9b4696e86c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1903,6 +1903,11 @@ void security_task_to_inode(struct task_struct *p, struct inode *inode)
 	call_void_hook(task_to_inode, p, inode);
 }
 
+int security_create_user_ns(const struct cred *cred)
+{
+	return call_int_hook(userns_create, 0, cred);
+}
+
 int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
 {
 	return call_int_hook(ipc_permission, 0, ipcp, flag);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
  2022-07-21 17:28 ` [PATCH v3 1/4] security, lsm: " Frederick Lawler
@ 2022-07-21 17:28 ` Frederick Lawler
  2022-07-22  8:18   ` Christian Brauner
  2022-07-21 17:28 ` [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook Frederick Lawler
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Frederick Lawler @ 2022-07-21 17:28 UTC (permalink / raw)
  To: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest
  Cc: linux-kernel, netdev, kernel-team, cgzones, karl, Frederick Lawler

Users may want to audit calls to security_create_user_ns() and access
user space memory. Also create_user_ns() runs without
pagefault_disabled(). Therefore, make bpf_lsm_userns_create() sleepable
for mandatory access control policies.

Signed-off-by: Frederick Lawler <fred@cloudflare.com>

---
Changes since v2:
- Rename create_user_ns hook to userns_create
Changes since v1:
- None
---
 kernel/bpf/bpf_lsm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index c1351df9f7ee..4593437809cc 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -250,6 +250,7 @@ BTF_ID(func, bpf_lsm_task_getsecid_obj)
 BTF_ID(func, bpf_lsm_task_prctl)
 BTF_ID(func, bpf_lsm_task_setscheduler)
 BTF_ID(func, bpf_lsm_task_to_inode)
+BTF_ID(func, bpf_lsm_userns_create)
 BTF_SET_END(sleepable_lsm_hooks)
 
 bool bpf_lsm_is_sleepable_hook(u32 btf_id)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
  2022-07-21 17:28 ` [PATCH v3 1/4] security, lsm: " Frederick Lawler
  2022-07-21 17:28 ` [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable Frederick Lawler
@ 2022-07-21 17:28 ` Frederick Lawler
  2022-07-22  6:07   ` Martin KaFai Lau
  2022-07-22  8:15   ` Christian Brauner
  2022-07-21 17:28 ` [PATCH v3 4/4] selinux: Implement " Frederick Lawler
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-07-21 17:28 UTC (permalink / raw)
  To: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest
  Cc: linux-kernel, netdev, kernel-team, cgzones, karl, Frederick Lawler

The LSM hook userns_create was introduced to provide LSM's an
opportunity to block or allow unprivileged user namespace creation. This
test serves two purposes: it provides a test eBPF implementation, and
tests the hook successfully blocks or allows user namespace creation.

This tests 4 cases:

        1. Unattached bpf program does not block unpriv user namespace
           creation.
        2. Attached bpf program allows user namespace creation given
           CAP_SYS_ADMIN privileges.
        3. Attached bpf program denies user namespace creation for a
           user without CAP_SYS_ADMIN.
        4. The sleepable implementation loads

Signed-off-by: Frederick Lawler <fred@cloudflare.com>

---
The generic deny_namespace file name is used for future namespace
expansion. I didn't want to limit these files to just the create_user_ns
hook.
Changes since v2:
- Rename create_user_ns hook to userns_create
Changes since v1:
- Introduce this patch
---
 .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
 .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
 2 files changed, 127 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c

diff --git a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
new file mode 100644
index 000000000000..9e4714295008
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include "test_deny_namespace.skel.h"
+#include <sched.h>
+#include "cap_helpers.h"
+
+#define STACK_SIZE (1024 * 1024)
+static char child_stack[STACK_SIZE];
+
+int clone_callback(void *arg)
+{
+	return 0;
+}
+
+static int create_new_user_ns(void)
+{
+	int status;
+	pid_t cpid;
+
+	cpid = clone(clone_callback, child_stack + STACK_SIZE,
+		     CLONE_NEWUSER | SIGCHLD, NULL);
+
+	if (cpid == -1)
+		return errno;
+
+	if (cpid == 0)
+		return 0;
+
+	waitpid(cpid, &status, 0);
+	if (WIFEXITED(status))
+		return WEXITSTATUS(status);
+
+	return -1;
+}
+
+static void test_userns_create_bpf(void)
+{
+	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
+	__u64 old_caps = 0;
+
+	ASSERT_OK(create_new_user_ns(), "priv new user ns");
+
+	cap_disable_effective(cap_mask, &old_caps);
+
+	ASSERT_EQ(create_new_user_ns(), EPERM, "unpriv new user ns");
+
+	if (cap_mask & old_caps)
+		cap_enable_effective(cap_mask, NULL);
+}
+
+static void test_unpriv_userns_create_no_bpf(void)
+{
+	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
+	__u64 old_caps = 0;
+
+	cap_disable_effective(cap_mask, &old_caps);
+
+	ASSERT_OK(create_new_user_ns(), "no-bpf unpriv new user ns");
+
+	if (cap_mask & old_caps)
+		cap_enable_effective(cap_mask, NULL);
+}
+
+void test_deny_namespace(void)
+{
+	struct test_deny_namespace *skel = NULL;
+	int err;
+
+	if (test__start_subtest("unpriv_userns_create_no_bpf"))
+		test_unpriv_userns_create_no_bpf();
+
+	skel = test_deny_namespace__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel load"))
+		goto close_prog;
+
+	err = test_deny_namespace__attach(skel);
+	if (!ASSERT_OK(err, "attach"))
+		goto close_prog;
+
+	if (test__start_subtest("userns_create_bpf"))
+		test_userns_create_bpf();
+
+	test_deny_namespace__detach(skel);
+
+close_prog:
+	test_deny_namespace__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
new file mode 100644
index 000000000000..9ec9dabc8372
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <errno.h>
+#include <linux/capability.h>
+
+struct kernel_cap_struct {
+	__u32 cap[_LINUX_CAPABILITY_U32S_3];
+} __attribute__((preserve_access_index));
+
+struct cred {
+	struct kernel_cap_struct cap_effective;
+} __attribute__((preserve_access_index));
+
+char _license[] SEC("license") = "GPL";
+
+SEC("lsm/userns_create")
+int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
+{
+	struct kernel_cap_struct caps = cred->cap_effective;
+	int cap_index = CAP_TO_INDEX(CAP_SYS_ADMIN);
+	__u32 cap_mask = CAP_TO_MASK(CAP_SYS_ADMIN);
+
+	if (ret)
+		return 0;
+
+	ret = -EPERM;
+	if (caps.cap[cap_index] & cap_mask)
+		return 0;
+
+	return -EPERM;
+}
+
+SEC("lsm.s/userns_create")
+int BPF_PROG(test_sleepable_userns_create, const struct cred *cred, int ret)
+{
+	return 0;
+}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 4/4] selinux: Implement userns_create hook
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
                   ` (2 preceding siblings ...)
  2022-07-21 17:28 ` [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook Frederick Lawler
@ 2022-07-21 17:28 ` Frederick Lawler
  2022-07-22  6:11 ` [PATCH v3 0/4] Introduce security_create_user_ns() Martin KaFai Lau
  2022-07-22 17:05 ` Eric W. Biederman
  5 siblings, 0 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-07-21 17:28 UTC (permalink / raw)
  To: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest
  Cc: linux-kernel, netdev, kernel-team, cgzones, karl, Frederick Lawler

Unprivileged user namespace creation is an intended feature to enable
sandboxing, however this feature is often used to as an initial step to
perform a privilege escalation attack.

This patch implements a new user_namespace { create } access control
permission to restrict which domains allow or deny user namespace
creation. This is necessary for system administrators to quickly protect
their systems while waiting for vulnerability patches to be applied.

This permission can be used in the following way:

        allow domA_t domA_t : user_namespace { create };

Signed-off-by: Frederick Lawler <fred@cloudflare.com>

---
Changes since v2:
- Rename create_user_ns hook to userns_create
- Use user_namespace as an object opposed to a generic namespace object
- s/domB_t/domA_t in commit message
Changes since v1:
- Introduce this patch
---
 security/selinux/hooks.c            | 9 +++++++++
 security/selinux/include/classmap.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index beceb89f68d9..afc9da0249e7 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4227,6 +4227,14 @@ static void selinux_task_to_inode(struct task_struct *p,
 	spin_unlock(&isec->lock);
 }
 
+static int selinux_userns_create(const struct cred *cred)
+{
+	u32 sid = current_sid();
+
+	return avc_has_perm(&selinux_state, sid, sid, SECCLASS_USER_NAMESPACE,
+						USER_NAMESPACE__CREATE, NULL);
+}
+
 /* Returns error only if unable to parse addresses */
 static int selinux_parse_skb_ipv4(struct sk_buff *skb,
 			struct common_audit_data *ad, u8 *proto)
@@ -7117,6 +7125,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(task_movememory, selinux_task_movememory),
 	LSM_HOOK_INIT(task_kill, selinux_task_kill),
 	LSM_HOOK_INIT(task_to_inode, selinux_task_to_inode),
+	LSM_HOOK_INIT(userns_create, selinux_userns_create),
 
 	LSM_HOOK_INIT(ipc_permission, selinux_ipc_permission),
 	LSM_HOOK_INIT(ipc_getsecid, selinux_ipc_getsecid),
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index ff757ae5f253..0bff55bb9cde 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -254,6 +254,8 @@ const struct security_class_mapping secclass_map[] = {
 	  { COMMON_FILE_PERMS, NULL } },
 	{ "io_uring",
 	  { "override_creds", "sqpoll", NULL } },
+	{ "user_namespace",
+	  { "create", NULL } },
 	{ NULL }
   };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook
  2022-07-21 17:28 ` [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook Frederick Lawler
@ 2022-07-22  6:07   ` Martin KaFai Lau
  2022-07-22 13:41     ` Frederick Lawler
  2022-07-22  8:15   ` Christian Brauner
  1 sibling, 1 reply; 25+ messages in thread
From: Martin KaFai Lau @ 2022-07-22  6:07 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, paul, stephen.smalley.work,
	eparis, shuah, brauner, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Thu, Jul 21, 2022 at 12:28:07PM -0500, Frederick Lawler wrote:
> The LSM hook userns_create was introduced to provide LSM's an
> opportunity to block or allow unprivileged user namespace creation. This
> test serves two purposes: it provides a test eBPF implementation, and
> tests the hook successfully blocks or allows user namespace creation.
> 
> This tests 4 cases:
> 
>         1. Unattached bpf program does not block unpriv user namespace
>            creation.
>         2. Attached bpf program allows user namespace creation given
>            CAP_SYS_ADMIN privileges.
>         3. Attached bpf program denies user namespace creation for a
>            user without CAP_SYS_ADMIN.
>         4. The sleepable implementation loads
> 
> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
> 
> ---
> The generic deny_namespace file name is used for future namespace
> expansion. I didn't want to limit these files to just the create_user_ns
> hook.
> Changes since v2:
> - Rename create_user_ns hook to userns_create
> Changes since v1:
> - Introduce this patch
> ---
>  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>  2 files changed, 127 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> new file mode 100644
> index 000000000000..9e4714295008
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> @@ -0,0 +1,88 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +#include <test_progs.h>
> +#include "test_deny_namespace.skel.h"
> +#include <sched.h>
> +#include "cap_helpers.h"
> +
> +#define STACK_SIZE (1024 * 1024)
Does the child need 1M stack space ?

> +static char child_stack[STACK_SIZE];
> +
> +int clone_callback(void *arg)
static

> +{
> +	return 0;
> +}
> +
> +static int create_new_user_ns(void)
> +{
> +	int status;
> +	pid_t cpid;
> +
> +	cpid = clone(clone_callback, child_stack + STACK_SIZE,
> +		     CLONE_NEWUSER | SIGCHLD, NULL);
> +
> +	if (cpid == -1)
> +		return errno;
> +
> +	if (cpid == 0)
Not an expert in clone() call and it is not clear what 0
return value mean from the man page.  Could you explain ?

> +		return 0;
> +
> +	waitpid(cpid, &status, 0);
> +	if (WIFEXITED(status))
> +		return WEXITSTATUS(status);
> +
> +	return -1;
> +}
> +
> +static void test_userns_create_bpf(void)
> +{
> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
> +	__u64 old_caps = 0;
> +
> +	ASSERT_OK(create_new_user_ns(), "priv new user ns");
Does it need to enable CAP_SYS_ADMIN first ?

> +
> +	cap_disable_effective(cap_mask, &old_caps);
> +
> +	ASSERT_EQ(create_new_user_ns(), EPERM, "unpriv new user ns");
> +
> +	if (cap_mask & old_caps)
> +		cap_enable_effective(cap_mask, NULL);
> +}
> +
> +static void test_unpriv_userns_create_no_bpf(void)
> +{
> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
> +	__u64 old_caps = 0;
> +
> +	cap_disable_effective(cap_mask, &old_caps);
> +
> +	ASSERT_OK(create_new_user_ns(), "no-bpf unpriv new user ns");
> +
> +	if (cap_mask & old_caps)
> +		cap_enable_effective(cap_mask, NULL);
> +}
> +
> +void test_deny_namespace(void)
> +{
> +	struct test_deny_namespace *skel = NULL;
> +	int err;
> +
> +	if (test__start_subtest("unpriv_userns_create_no_bpf"))
> +		test_unpriv_userns_create_no_bpf();
> +
> +	skel = test_deny_namespace__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel load"))
> +		goto close_prog;
> +
> +	err = test_deny_namespace__attach(skel);
> +	if (!ASSERT_OK(err, "attach"))
> +		goto close_prog;
> +
> +	if (test__start_subtest("userns_create_bpf"))
> +		test_userns_create_bpf();
> +
> +	test_deny_namespace__detach(skel);
> +
> +close_prog:
> +	test_deny_namespace__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
> new file mode 100644
> index 000000000000..9ec9dabc8372
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <errno.h>
> +#include <linux/capability.h>
> +
> +struct kernel_cap_struct {
> +	__u32 cap[_LINUX_CAPABILITY_U32S_3];
> +} __attribute__((preserve_access_index));
> +
> +struct cred {
> +	struct kernel_cap_struct cap_effective;
> +} __attribute__((preserve_access_index));
> +
> +char _license[] SEC("license") = "GPL";
> +
> +SEC("lsm/userns_create")
> +int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
> +{
> +	struct kernel_cap_struct caps = cred->cap_effective;
> +	int cap_index = CAP_TO_INDEX(CAP_SYS_ADMIN);
> +	__u32 cap_mask = CAP_TO_MASK(CAP_SYS_ADMIN);
> +
> +	if (ret)
> +		return 0;
> +
> +	ret = -EPERM;
> +	if (caps.cap[cap_index] & cap_mask)
> +		return 0;
> +
> +	return -EPERM;
> +}
> +
> +SEC("lsm.s/userns_create")
> +int BPF_PROG(test_sleepable_userns_create, const struct cred *cred, int ret)
> +{
An empty program is weird.  If the intention is
to ensure a sleepable program can attach to userns_create,
move the test logic here and remove the non-sleepable
program above.

> +	return 0;
> +}
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
                   ` (3 preceding siblings ...)
  2022-07-21 17:28 ` [PATCH v3 4/4] selinux: Implement " Frederick Lawler
@ 2022-07-22  6:11 ` Martin KaFai Lau
  2022-07-22 12:20   ` Paul Moore
  2022-07-22 17:05 ` Eric W. Biederman
  5 siblings, 1 reply; 25+ messages in thread
From: Martin KaFai Lau @ 2022-07-22  6:11 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, paul, stephen.smalley.work,
	eparis, shuah, brauner, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> While creating a LSM BPF MAC policy to block user namespace creation, we
> used the LSM cred_prepare hook because that is the closest hook to prevent
> a call to create_user_ns().
> 
> The calls look something like this:
> 
>     cred = prepare_creds()
>         security_prepare_creds()
>             call_int_hook(cred_prepare, ...
>     if (cred)
>         create_user_ns(cred)
> 
> We noticed that error codes were not propagated from this hook and
> introduced a patch [1] to propagate those errors.
> 
> The discussion notes that security_prepare_creds()
> is not appropriate for MAC policies, and instead the hook is
> meant for LSM authors to prepare credentials for mutation. [2]
> 
> Ultimately, we concluded that a better course of action is to introduce
> a new security hook for LSM authors. [3]
> 
> This patch set first introduces a new security_create_user_ns() function
> and userns_create LSM hook, then marks the hook as sleepable in BPF.
Patch 1 and 4 still need review from the lsm/security side.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook
  2022-07-21 17:28 ` [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook Frederick Lawler
  2022-07-22  6:07   ` Martin KaFai Lau
@ 2022-07-22  8:15   ` Christian Brauner
  1 sibling, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2022-07-22  8:15 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Thu, Jul 21, 2022 at 12:28:07PM -0500, Frederick Lawler wrote:
> The LSM hook userns_create was introduced to provide LSM's an
> opportunity to block or allow unprivileged user namespace creation. This
> test serves two purposes: it provides a test eBPF implementation, and
> tests the hook successfully blocks or allows user namespace creation.
> 
> This tests 4 cases:
> 
>         1. Unattached bpf program does not block unpriv user namespace
>            creation.
>         2. Attached bpf program allows user namespace creation given
>            CAP_SYS_ADMIN privileges.
>         3. Attached bpf program denies user namespace creation for a
>            user without CAP_SYS_ADMIN.
>         4. The sleepable implementation loads

Sounds good!

> 
> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
> 
> ---
> The generic deny_namespace file name is used for future namespace
> expansion. I didn't want to limit these files to just the create_user_ns
> hook.
> Changes since v2:
> - Rename create_user_ns hook to userns_create
> Changes since v1:
> - Introduce this patch
> ---
>  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>  2 files changed, 127 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> new file mode 100644
> index 000000000000..9e4714295008
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> @@ -0,0 +1,88 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +#include <test_progs.h>
> +#include "test_deny_namespace.skel.h"
> +#include <sched.h>
> +#include "cap_helpers.h"
> +
> +#define STACK_SIZE (1024 * 1024)
> +static char child_stack[STACK_SIZE];
> +
> +int clone_callback(void *arg)
> +{
> +	return 0;
> +}
> +
> +static int create_new_user_ns(void)
> +{
> +	int status;
> +	pid_t cpid;
> +
> +	cpid = clone(clone_callback, child_stack + STACK_SIZE,
> +		     CLONE_NEWUSER | SIGCHLD, NULL);
> +
> +	if (cpid == -1)
> +		return errno;
> +
> +	if (cpid == 0)
> +		return 0;

Martin asked about this already but fwiw, this cannot happen with
clone(). The clone() function doesn't return twice. It always returns
the PID of the child process or an error.

> +
> +	waitpid(cpid, &status, 0);
> +	if (WIFEXITED(status))
> +		return WEXITSTATUS(status);
> +
> +	return -1;
> +}

You can also just avoid the clone() dance and simply do sm like:

static int wait_for_pid(pid_t pid)
{
	int status, ret;

again:
	ret = waitpid(pid, &status, 0);
	if (ret == -1) {
		if (errno == EINTR)
			goto again;

		return -1;
	}

	if (!WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}

/* negative return value -> some internal error
 * positive return value -> userns creation failed
 * 0                     -> userns creation succeeded
 */
static int create_user_ns(void)
{
	pid_t pid;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		if (unshare(CLONE_NEWUSER))
			_exit(EXIT_FAILURE);
		_exit(EXIT_SUCCESS);
	}

	return wait_for_pid(pid);
}

Same difference since both codepaths hit the right spot in the kernel.

> +
> +static void test_userns_create_bpf(void)
> +{
> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
> +	__u64 old_caps = 0;
> +
> +	ASSERT_OK(create_new_user_ns(), "priv new user ns");
> +
> +	cap_disable_effective(cap_mask, &old_caps);
> +
> +	ASSERT_EQ(create_new_user_ns(), EPERM, "unpriv new user ns");
> +
> +	if (cap_mask & old_caps)
> +		cap_enable_effective(cap_mask, NULL);
> +}
> +
> +static void test_unpriv_userns_create_no_bpf(void)
> +{
> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
> +	__u64 old_caps = 0;
> +
> +	cap_disable_effective(cap_mask, &old_caps);
> +
> +	ASSERT_OK(create_new_user_ns(), "no-bpf unpriv new user ns");
> +
> +	if (cap_mask & old_caps)
> +		cap_enable_effective(cap_mask, NULL);
> +}
> +
> +void test_deny_namespace(void)
> +{
> +	struct test_deny_namespace *skel = NULL;
> +	int err;
> +
> +	if (test__start_subtest("unpriv_userns_create_no_bpf"))
> +		test_unpriv_userns_create_no_bpf();
> +
> +	skel = test_deny_namespace__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel load"))
> +		goto close_prog;
> +
> +	err = test_deny_namespace__attach(skel);
> +	if (!ASSERT_OK(err, "attach"))
> +		goto close_prog;
> +
> +	if (test__start_subtest("userns_create_bpf"))
> +		test_userns_create_bpf();
> +
> +	test_deny_namespace__detach(skel);
> +
> +close_prog:
> +	test_deny_namespace__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
> new file mode 100644
> index 000000000000..9ec9dabc8372
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <errno.h>
> +#include <linux/capability.h>
> +
> +struct kernel_cap_struct {
> +	__u32 cap[_LINUX_CAPABILITY_U32S_3];
> +} __attribute__((preserve_access_index));
> +
> +struct cred {
> +	struct kernel_cap_struct cap_effective;
> +} __attribute__((preserve_access_index));
> +
> +char _license[] SEC("license") = "GPL";
> +
> +SEC("lsm/userns_create")
> +int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
> +{
> +	struct kernel_cap_struct caps = cred->cap_effective;
> +	int cap_index = CAP_TO_INDEX(CAP_SYS_ADMIN);
> +	__u32 cap_mask = CAP_TO_MASK(CAP_SYS_ADMIN);
> +
> +	if (ret)
> +		return 0;
> +
> +	ret = -EPERM;
> +	if (caps.cap[cap_index] & cap_mask)
> +		return 0;
> +
> +	return -EPERM;
> +}

Looks nice and simple.
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable
  2022-07-21 17:28 ` [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable Frederick Lawler
@ 2022-07-22  8:18   ` Christian Brauner
  0 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2022-07-22  8:18 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Thu, Jul 21, 2022 at 12:28:06PM -0500, Frederick Lawler wrote:
> Users may want to audit calls to security_create_user_ns() and access
> user space memory. Also create_user_ns() runs without
> pagefault_disabled(). Therefore, make bpf_lsm_userns_create() sleepable
> for mandatory access control policies.
> 
> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
> 
> ---

Seems reasonable,
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 1/4] security, lsm: Introduce security_create_user_ns()
  2022-07-21 17:28 ` [PATCH v3 1/4] security, lsm: " Frederick Lawler
@ 2022-07-22  8:21   ` Christian Brauner
  0 siblings, 0 replies; 25+ messages in thread
From: Christian Brauner @ 2022-07-22  8:21 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Thu, Jul 21, 2022 at 12:28:05PM -0500, Frederick Lawler wrote:
> Preventing user namespace (privileged or otherwise) creation comes in a
> few of forms in order of granularity:
> 
>         1. /proc/sys/user/max_user_namespaces sysctl
>         2. OS specific patch(es)
>         3. CONFIG_USER_NS
> 
> To block a task based on its attributes, the LSM hook cred_prepare is a
> good candidate for use because it provides more granular control, and
> it is called before create_user_ns():
> 
>         cred = prepare_creds()
>                 security_prepare_creds()
>                         call_int_hook(cred_prepare, ...
>         if (cred)
>                 create_user_ns(cred)
> 
> Since security_prepare_creds() is meant for LSMs to copy and prepare
> credentials, access control is an unintended use of the hook. Therefore
> introduce a new function security_create_user_ns() with an accompanying
> userns_create LSM hook.
> 
> This hook takes the prepared creds for LSM authors to write policy
> against. On success, the new namespace is applied to credentials,
> otherwise an error is returned.
> 
> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
> 
> ---

Nice and straightforward,
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22  6:11 ` [PATCH v3 0/4] Introduce security_create_user_ns() Martin KaFai Lau
@ 2022-07-22 12:20   ` Paul Moore
  2022-08-01 13:13     ` Frederick Lawler
  2022-08-02 21:33     ` Eric W. Biederman
  0 siblings, 2 replies; 25+ messages in thread
From: Paul Moore @ 2022-07-22 12:20 UTC (permalink / raw)
  To: Martin KaFai Lau, Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, stephen.smalley.work,
	eparis, shuah, brauner, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:

> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>> While creating a LSM BPF MAC policy to block user namespace creation, we
>> used the LSM cred_prepare hook because that is the closest hook to prevent
>> a call to create_user_ns().
>>
>> The calls look something like this:
>>
>> cred = prepare_creds()
>> security_prepare_creds()
>> call_int_hook(cred_prepare, ...
>> if (cred)
>> create_user_ns(cred)
>>
>> We noticed that error codes were not propagated from this hook and
>> introduced a patch [1] to propagate those errors.
>>
>> The discussion notes that security_prepare_creds()
>> is not appropriate for MAC policies, and instead the hook is
>> meant for LSM authors to prepare credentials for mutation. [2]
>>
>> Ultimately, we concluded that a better course of action is to introduce
>> a new security hook for LSM authors. [3]
>>
>> This patch set first introduces a new security_create_user_ns() function
>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> Patch 1 and 4 still need review from the lsm/security side.


This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.

I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.

--
paul-moore.com



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook
  2022-07-22  6:07   ` Martin KaFai Lau
@ 2022-07-22 13:41     ` Frederick Lawler
  0 siblings, 0 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-07-22 13:41 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, paul, stephen.smalley.work,
	eparis, shuah, brauner, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On 7/22/22 1:07 AM, Martin KaFai Lau wrote:
> On Thu, Jul 21, 2022 at 12:28:07PM -0500, Frederick Lawler wrote:
>> The LSM hook userns_create was introduced to provide LSM's an
>> opportunity to block or allow unprivileged user namespace creation. This
>> test serves two purposes: it provides a test eBPF implementation, and
>> tests the hook successfully blocks or allows user namespace creation.
>>
>> This tests 4 cases:
>>
>>          1. Unattached bpf program does not block unpriv user namespace
>>             creation.
>>          2. Attached bpf program allows user namespace creation given
>>             CAP_SYS_ADMIN privileges.
>>          3. Attached bpf program denies user namespace creation for a
>>             user without CAP_SYS_ADMIN.
>>          4. The sleepable implementation loads
>>
>> Signed-off-by: Frederick Lawler <fred@cloudflare.com>
>>
>> ---
>> The generic deny_namespace file name is used for future namespace
>> expansion. I didn't want to limit these files to just the create_user_ns
>> hook.
>> Changes since v2:
>> - Rename create_user_ns hook to userns_create
>> Changes since v1:
>> - Introduce this patch
>> ---
>>   .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>>   .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>>   2 files changed, 127 insertions(+)
>>   create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>>   create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>> new file mode 100644
>> index 000000000000..9e4714295008
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>> @@ -0,0 +1,88 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#define _GNU_SOURCE
>> +#include <test_progs.h>
>> +#include "test_deny_namespace.skel.h"
>> +#include <sched.h>
>> +#include "cap_helpers.h"
>> +
>> +#define STACK_SIZE (1024 * 1024)
> Does the child need 1M stack space ?

No, I can reduce that.

> 
>> +static char child_stack[STACK_SIZE];
>> +
>> +int clone_callback(void *arg)
> static
> 
>> +{
>> +	return 0;
>> +}
>> +
>> +static int create_new_user_ns(void)
>> +{
>> +	int status;
>> +	pid_t cpid;
>> +
>> +	cpid = clone(clone_callback, child_stack + STACK_SIZE,
>> +		     CLONE_NEWUSER | SIGCHLD, NULL);
>> +
>> +	if (cpid == -1)
>> +		return errno;
>> +
>> +	if (cpid == 0)
> Not an expert in clone() call and it is not clear what 0
> return value mean from the man page.  Could you explain ?
> 

Good catch. This is using the libc clone().

>> +		return 0;
>> +
>> +	waitpid(cpid, &status, 0);
>> +	if (WIFEXITED(status))
>> +		return WEXITSTATUS(status);
>> +
>> +	return -1;
>> +}
>> +
>> +static void test_userns_create_bpf(void)
>> +{
>> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
>> +	__u64 old_caps = 0;
>> +
>> +	ASSERT_OK(create_new_user_ns(), "priv new user ns");
> Does it need to enable CAP_SYS_ADMIN first ?
> 

You're right, this should be more explicitly set. I ran tests with the 
vmtest.sh script supplied with sefltests/bpf which run under root. I 
should always set CAP_SYS_ADMIN here to be consistent.

>> +
>> +	cap_disable_effective(cap_mask, &old_caps);
>> +
>> +	ASSERT_EQ(create_new_user_ns(), EPERM, "unpriv new user ns");
>> +
>> +	if (cap_mask & old_caps)
>> +		cap_enable_effective(cap_mask, NULL);
>> +}
>> +
>> +static void test_unpriv_userns_create_no_bpf(void)
>> +{
>> +	__u32 cap_mask = 1ULL << CAP_SYS_ADMIN;
>> +	__u64 old_caps = 0;
>> +
>> +	cap_disable_effective(cap_mask, &old_caps);
>> +
>> +	ASSERT_OK(create_new_user_ns(), "no-bpf unpriv new user ns");
>> +
>> +	if (cap_mask & old_caps)
>> +		cap_enable_effective(cap_mask, NULL);
>> +}
>> +
>> +void test_deny_namespace(void)
>> +{
>> +	struct test_deny_namespace *skel = NULL;
>> +	int err;
>> +
>> +	if (test__start_subtest("unpriv_userns_create_no_bpf"))
>> +		test_unpriv_userns_create_no_bpf();
>> +
>> +	skel = test_deny_namespace__open_and_load();
>> +	if (!ASSERT_OK_PTR(skel, "skel load"))
>> +		goto close_prog;
>> +
>> +	err = test_deny_namespace__attach(skel);
>> +	if (!ASSERT_OK(err, "attach"))
>> +		goto close_prog;
>> +
>> +	if (test__start_subtest("userns_create_bpf"))
>> +		test_userns_create_bpf();
>> +
>> +	test_deny_namespace__detach(skel);
>> +
>> +close_prog:
>> +	test_deny_namespace__destroy(skel);
>> +}
>> diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
>> new file mode 100644
>> index 000000000000..9ec9dabc8372
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
>> @@ -0,0 +1,39 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#include <linux/bpf.h>
>> +#include <bpf/bpf_helpers.h>
>> +#include <bpf/bpf_tracing.h>
>> +#include <errno.h>
>> +#include <linux/capability.h>
>> +
>> +struct kernel_cap_struct {
>> +	__u32 cap[_LINUX_CAPABILITY_U32S_3];
>> +} __attribute__((preserve_access_index));
>> +
>> +struct cred {
>> +	struct kernel_cap_struct cap_effective;
>> +} __attribute__((preserve_access_index));
>> +
>> +char _license[] SEC("license") = "GPL";
>> +
>> +SEC("lsm/userns_create")
>> +int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
>> +{
>> +	struct kernel_cap_struct caps = cred->cap_effective;
>> +	int cap_index = CAP_TO_INDEX(CAP_SYS_ADMIN);
>> +	__u32 cap_mask = CAP_TO_MASK(CAP_SYS_ADMIN);
>> +
>> +	if (ret)
>> +		return 0;
>> +
>> +	ret = -EPERM;
>> +	if (caps.cap[cap_index] & cap_mask)
>> +		return 0;
>> +
>> +	return -EPERM;
>> +}
>> +
>> +SEC("lsm.s/userns_create")
>> +int BPF_PROG(test_sleepable_userns_create, const struct cred *cred, int ret)
>> +{
> An empty program is weird.  If the intention is
> to ensure a sleepable program can attach to userns_create,
> move the test logic here and remove the non-sleepable
> program above.
> 

Sure, I can do that.

>> +	return 0;
>> +}
>> -- 
>> 2.30.2
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
                   ` (4 preceding siblings ...)
  2022-07-22  6:11 ` [PATCH v3 0/4] Introduce security_create_user_ns() Martin KaFai Lau
@ 2022-07-22 17:05 ` Eric W. Biederman
  2022-07-25 22:53   ` Paul Moore
                     ` (2 more replies)
  5 siblings, 3 replies; 25+ messages in thread
From: Eric W. Biederman @ 2022-07-22 17:05 UTC (permalink / raw)
  To: Frederick Lawler
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, kafai,
	songliubraving, yhs, john.fastabend, jmorris, serge, paul,
	stephen.smalley.work, eparis, shuah, brauner, casey, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

Frederick Lawler <fred@cloudflare.com> writes:

> While creating a LSM BPF MAC policy to block user namespace creation, we
> used the LSM cred_prepare hook because that is the closest hook to prevent
> a call to create_user_ns().

That description is wrong.  Your goal his is not to limit access to
the user namespace.  Your goal is to reduce the attack surface of the
kernel by not allowing some processes access to a user namespace.

You have already said that you don't have concerns about the
fundamentals of the user namespace, and what it enables only that
it allows access to exploitable code.

Achieving the protection you seek requires talking and thinking clearly
about the goal.




I have a couple of deep and fundamental problems with this approach,
to limiting access to potentially exploitable code.

1) The first is that unless there is a high probability (say 90%) that at
   any time the only exploitable code in the kernel can only be accessed
   by an unprivileged user with the help of user namespaces, attackers
   will just route around this restriction and so it will achieve
   nothing in practice, while at the same time incur an extra
   maintenance burden.

2) The second is that there is a long standing problem with code that
   gets added to the kernel.  Many times new kernel code because it has
   the potential to confuse suid root executables that code has been
   made root only.  Over time that results in more and more code running
   as root to be able to make use of the useful features of the linux
   kernel.

   One of the goals of the user namespace is to avoid more and more code
   migrating to running as root.  To achieve that goal ordinary
   application developers need to be able to assume that typically user
   namespaces will be available on linux.

   An assumption that ordinary applications like chromium make today.

   Your intentions seem to be to place a capability check so that only
   root can use user namespaces or something of the sort.  Thus breaking
   the general availability of user namespaces for ordinary applications
   on your systems.
   

My apologies if this has been addressed somewhere in the conversation
already.  I don't see these issues addressed in the descriptions of your
patches.

Until these issues are firmly addressed and you are not proposing a
patch that can only cause regressions in userspace applications.

Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

>
> The calls look something like this:
>
>     cred = prepare_creds()
>         security_prepare_creds()
>             call_int_hook(cred_prepare, ...
>     if (cred)
>         create_user_ns(cred)
>
> We noticed that error codes were not propagated from this hook and
> introduced a patch [1] to propagate those errors.
>
> The discussion notes that security_prepare_creds()
> is not appropriate for MAC policies, and instead the hook is
> meant for LSM authors to prepare credentials for mutation. [2]
>
> Ultimately, we concluded that a better course of action is to introduce
> a new security hook for LSM authors. [3]
>
> This patch set first introduces a new security_create_user_ns() function
> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>
> Links:
> 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>
> Past discussions:
> V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
>
> Changes since v2:
> - Rename create_user_ns hook to userns_create
> - Use user_namespace as an object opposed to a generic namespace object
> - s/domB_t/domA_t in commit message
> Changes since v1:
> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> - Add selinux: Implement create_user_ns hook patch
> - Change function signature of security_create_user_ns() to only take
>   struct cred
> - Move security_create_user_ns() call after id mapping check in
>   create_user_ns()
> - Update documentation to reflect changes
>
> Frederick Lawler (4):
>   security, lsm: Introduce security_create_user_ns()
>   bpf-lsm: Make bpf_lsm_userns_create() sleepable
>   selftests/bpf: Add tests verifying bpf lsm userns_create hook
>   selinux: Implement userns_create hook
>
>  include/linux/lsm_hook_defs.h                 |  1 +
>  include/linux/lsm_hooks.h                     |  4 +
>  include/linux/security.h                      |  6 ++
>  kernel/bpf/bpf_lsm.c                          |  1 +
>  kernel/user_namespace.c                       |  5 ++
>  security/security.c                           |  5 ++
>  security/selinux/hooks.c                      |  9 ++
>  security/selinux/include/classmap.h           |  2 +
>  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>  10 files changed, 160 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
>
> --
> 2.30.2

Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22 17:05 ` Eric W. Biederman
@ 2022-07-25 22:53   ` Paul Moore
  2022-07-26 12:02   ` Djalal Harouni
  2022-07-26 14:41   ` Ignat Korchagin
  2 siblings, 0 replies; 25+ messages in thread
From: Paul Moore @ 2022-07-25 22:53 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Frederick Lawler, kpsingh, revest, jackmanb, ast, daniel, andrii,
	kafai, songliubraving, yhs, john.fastabend, jmorris, serge,
	stephen.smalley.work, eparis, shuah, brauner, casey, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On Fri, Jul 22, 2022 at 1:05 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.

Providing a single concrete goal for a LSM hook is always going to be
a challenge due to the nature of the LSM layer and the great unknown
of all the different LSMs that are implemented underneath the LSM
abstraction.  However, we can make some very general statements such
that up to this point the LSMs that have been merged into mainline
generally provide some level of access control, observability, or
both.  While that may change in the future (the LSM layer does not
attempt to restrict LSMs to just these two ideas), I think they are
"good enough" goals for this discussion.

In addition to thinking about these goals, I think it also important
to take a step back and think about the original motivation for the
LSM and why it, and Linux itself, has proven to be popular with
everything from the phone in your hand to the datacenter servers
powering ... pretty much everything :)  Arguably Linux has seen such
profound success because of its malleability; the open nature of the
kernel development process has allowed the Linux Kernel to adopt
capabilities well beyond what any one dev team could produce, and as
Linux continues to grow in adoption, its ability to flex into new use
cases only increases.  The kernel namespace concept is an excellent
example of this: virtualizing core kernel ideas, such as user
credentials, to provide better, safer solutions.  It is my belief that
the LSM layer is very much built around this same idea of abstracting
and extending core kernel concepts, in this case security controls, to
provide better solutions.  Integrating the LSM into the kernel's
namespaces is a natural fit, and one that is long overdue.

If we can't find a way to make everyone happy here, let's at least try
to find a way to make everyone "okay" with adding a LSM hook to the
user namespace.  If you want to NACK this approach Eric, that's okay,
but please provide some alternative paths forward that we can discuss.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22 17:05 ` Eric W. Biederman
  2022-07-25 22:53   ` Paul Moore
@ 2022-07-26 12:02   ` Djalal Harouni
  2022-07-26 14:41   ` Ignat Korchagin
  2 siblings, 0 replies; 25+ messages in thread
From: Djalal Harouni @ 2022-07-26 12:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Frederick Lawler, KP Singh, revest, jackmanb, Alexei Starovoitov,
	Daniel Borkmann, andrii, kafai, songliubraving, yhs,
	john fastabend, James Morris, Serge E. Hallyn, Paul Moore,
	stephen.smalley.work, Eric Paris, Shuah Khan, brauner,
	Casey Schaufler, bpf, LSM List, selinux, linux-kselftest,
	linux-kernel, Network Development, kernel-team, cgzones, karl

Hi Eric,

On Fri, Jul 22, 2022 at 7:07 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.
>

We have valid use cases not specifically related to the attack surface,
but go into the middle from bpf observability to enforcement. As we want
to track namespace creation, changes, nesting and per task creds context
depending on the nature of the workload.

Obvious example is nesting as we want to track namespace creations
not necessarily user namespace but all to report hierarchies to dashboards,
then from kubernetes namespace view, we would like some applications to
setup namespaces privileged or not, but deny other apps creation of nested
pidns, userns, etc, it depends on users how they setup their kubernetes
namespaces and labels...

...
>
> 2) The second is that there is a long standing problem with code that
>    gets added to the kernel.  Many times new kernel code because it has
>    the potential to confuse suid root executables that code has been
>    made root only.  Over time that results in more and more code running
>    as root to be able to make use of the useful features of the linux
>    kernel.
>
>    One of the goals of the user namespace is to avoid more and more code
>    migrating to running as root.  To achieve that goal ordinary
>    application developers need to be able to assume that typically user
>    namespaces will be available on linux.
>
>    An assumption that ordinary applications like chromium make today.

I don't necessarily disagree with statement 2. and in a perfect world yes.
But practically as noted by Paul in his email, Linux is flexible and
speaking about kubernetes world we have multiple workload per
namespaces, and we would like a solution that we can support in the
next months.

Also these are features that some user space may use, some may not, we
will never be able to dictate to all user space applications how to do things.

From bpf side observability or bpf-lsm enforcement it allows to escalate how
to respond to the task and *make lsm and bpf (bpf-lsm) have a consistent
design* where they both follow the same path.

It is unfortunate that the security_task_alloc() [1] hook is  _late_ and can't
be used for context initialization as the credentials and even user namespace
have already been created. Strictly speaking we have a context that has
been already created and applied and we can't properly catch it !

There is no way to do that from user space as most bpf based tools
(observability and enforcement) do not and should not mess up at the
user space level with the namespace configuration of tasks (/proc...), they
are external programs to the running tasks, they do not set up the
environment. Having the hook before the namespaces and creds copying
allows to properly track this and construct the _right_ context. From lsm
and bpf-lsm this will definitely offer a better interface that is not prone to
errors.

We would like an answer here or an alternative hook that is placed before
the creation/setting of any namespace, credentials or creating new keyring.
So we can provide bpf-based transparent solutions that work.

[1] https://elixir.bootlin.com/linux/v5.18.13/source/kernel/fork.c#L2216



> My apologies if this has been addressed somewhere in the conversation
> already.  I don't see these issues addressed in the descriptions of your
> patches.
>
> Until these issues are firmly addressed and you are not proposing a
> patch that can only cause regressions in userspace applications.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> >
> > The calls look something like this:
> >
> >     cred = prepare_creds()
> >         security_prepare_creds()
> >             call_int_hook(cred_prepare, ...
> >     if (cred)
> >         create_user_ns(cred)
> >
> > We noticed that error codes were not propagated from this hook and
> > introduced a patch [1] to propagate those errors.
> >
> > The discussion notes that security_prepare_creds()
> > is not appropriate for MAC policies, and instead the hook is
> > meant for LSM authors to prepare credentials for mutation. [2]
> >
> > Ultimately, we concluded that a better course of action is to introduce
> > a new security hook for LSM authors. [3]
> >
> > This patch set first introduces a new security_create_user_ns() function
> > and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >
> > Links:
> > 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> > 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> > 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
> >
> > Past discussions:
> > V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> > V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
> >
> > Changes since v2:
> > - Rename create_user_ns hook to userns_create
> > - Use user_namespace as an object opposed to a generic namespace object
> > - s/domB_t/domA_t in commit message
> > Changes since v1:
> > - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> > - Add selinux: Implement create_user_ns hook patch
> > - Change function signature of security_create_user_ns() to only take
> >   struct cred
> > - Move security_create_user_ns() call after id mapping check in
> >   create_user_ns()
> > - Update documentation to reflect changes
> >
> > Frederick Lawler (4):
> >   security, lsm: Introduce security_create_user_ns()
> >   bpf-lsm: Make bpf_lsm_userns_create() sleepable
> >   selftests/bpf: Add tests verifying bpf lsm userns_create hook
> >   selinux: Implement userns_create hook
> >
> >  include/linux/lsm_hook_defs.h                 |  1 +
> >  include/linux/lsm_hooks.h                     |  4 +
> >  include/linux/security.h                      |  6 ++
> >  kernel/bpf/bpf_lsm.c                          |  1 +
> >  kernel/user_namespace.c                       |  5 ++
> >  security/security.c                           |  5 ++
> >  security/selinux/hooks.c                      |  9 ++
> >  security/selinux/include/classmap.h           |  2 +
> >  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
> >  10 files changed, 160 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> >
> > --
> > 2.30.2
>
> Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22 17:05 ` Eric W. Biederman
  2022-07-25 22:53   ` Paul Moore
  2022-07-26 12:02   ` Djalal Harouni
@ 2022-07-26 14:41   ` Ignat Korchagin
  2 siblings, 0 replies; 25+ messages in thread
From: Ignat Korchagin @ 2022-07-26 14:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Frederick Lawler, kpsingh, revest, jackmanb, ast, daniel, andrii,
	kafai, songliubraving, yhs, john.fastabend, jmorris, serge,
	Paul Moore, stephen.smalley.work, eparis, shuah,
	Christian Brauner, casey, bpf, linux-security-module, selinux,
	linux-kselftest, linux-kernel, netdev, kernel-team, cgzones,
	karl

On Fri, Jul 22, 2022 at 6:05 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.
>
>
>
>
> I have a couple of deep and fundamental problems with this approach,
> to limiting access to potentially exploitable code.
>
> 1) The first is that unless there is a high probability (say 90%) that at
>    any time the only exploitable code in the kernel can only be accessed
>    by an unprivileged user with the help of user namespaces, attackers
>    will just route around this restriction and so it will achieve
>    nothing in practice, while at the same time incur an extra
>    maintenance burden.
>
> 2) The second is that there is a long standing problem with code that
>    gets added to the kernel.  Many times new kernel code because it has
>    the potential to confuse suid root executables that code has been
>    made root only.  Over time that results in more and more code running
>    as root to be able to make use of the useful features of the linux
>    kernel.
>
>    One of the goals of the user namespace is to avoid more and more code
>    migrating to running as root.  To achieve that goal ordinary
>    application developers need to be able to assume that typically user
>    namespaces will be available on linux.
>
>    An assumption that ordinary applications like chromium make today.
>
>    Your intentions seem to be to place a capability check so that only
>    root can use user namespaces or something of the sort.  Thus breaking
>    the general availability of user namespaces for ordinary applications
>    on your systems.

I would like to comment here that our intention with the hook is quite
the opposite:
we do want to embrace user namespaces in our code and some of our workloads
already depend on it. Hence we didn't agree to Debian's approach of just
having a global sysctl. But there is "our code" and there is "third
party" code, which
might not even be open source due to various reasons. And while the path exists
for that code to do something bad - we want to block it.

So in a way, I think this hook allows better adoption of user
namespaces in the first
place and gives distros and other system maintainers a reasonable
alternative than
just providing a global "kill" sysctl (which is de-facto is used by
many, thus actually
limiting userspace applications accessing the user namespace functionality)

>
> My apologies if this has been addressed somewhere in the conversation
> already.  I don't see these issues addressed in the descriptions of your
> patches.
>
> Until these issues are firmly addressed and you are not proposing a
> patch that can only cause regressions in userspace applications.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> >
> > The calls look something like this:
> >
> >     cred = prepare_creds()
> >         security_prepare_creds()
> >             call_int_hook(cred_prepare, ...
> >     if (cred)
> >         create_user_ns(cred)
> >
> > We noticed that error codes were not propagated from this hook and
> > introduced a patch [1] to propagate those errors.
> >
> > The discussion notes that security_prepare_creds()
> > is not appropriate for MAC policies, and instead the hook is
> > meant for LSM authors to prepare credentials for mutation. [2]
> >
> > Ultimately, we concluded that a better course of action is to introduce
> > a new security hook for LSM authors. [3]
> >
> > This patch set first introduces a new security_create_user_ns() function
> > and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >
> > Links:
> > 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> > 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> > 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
> >
> > Past discussions:
> > V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> > V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
> >
> > Changes since v2:
> > - Rename create_user_ns hook to userns_create
> > - Use user_namespace as an object opposed to a generic namespace object
> > - s/domB_t/domA_t in commit message
> > Changes since v1:
> > - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> > - Add selinux: Implement create_user_ns hook patch
> > - Change function signature of security_create_user_ns() to only take
> >   struct cred
> > - Move security_create_user_ns() call after id mapping check in
> >   create_user_ns()
> > - Update documentation to reflect changes
> >
> > Frederick Lawler (4):
> >   security, lsm: Introduce security_create_user_ns()
> >   bpf-lsm: Make bpf_lsm_userns_create() sleepable
> >   selftests/bpf: Add tests verifying bpf lsm userns_create hook
> >   selinux: Implement userns_create hook
> >
> >  include/linux/lsm_hook_defs.h                 |  1 +
> >  include/linux/lsm_hooks.h                     |  4 +
> >  include/linux/security.h                      |  6 ++
> >  kernel/bpf/bpf_lsm.c                          |  1 +
> >  kernel/user_namespace.c                       |  5 ++
> >  security/security.c                           |  5 ++
> >  security/selinux/hooks.c                      |  9 ++
> >  security/selinux/include/classmap.h           |  2 +
> >  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
> >  10 files changed, 160 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> >
> > --
> > 2.30.2
>
> Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22 12:20   ` Paul Moore
@ 2022-08-01 13:13     ` Frederick Lawler
  2022-08-01 15:19       ` Paul Moore
  2022-08-01 15:25       ` Casey Schaufler
  2022-08-02 21:33     ` Eric W. Biederman
  1 sibling, 2 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-08-01 13:13 UTC (permalink / raw)
  To: Paul Moore, Martin KaFai Lau
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, stephen.smalley.work,
	eparis, shuah, brauner, casey, ebiederm, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl

On 7/22/22 7:20 AM, Paul Moore wrote:
> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> 
>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>> a call to create_user_ns().
>>>
>>> The calls look something like this:
>>>
>>> cred = prepare_creds()
>>> security_prepare_creds()
>>> call_int_hook(cred_prepare, ...
>>> if (cred)
>>> create_user_ns(cred)
>>>
>>> We noticed that error codes were not propagated from this hook and
>>> introduced a patch [1] to propagate those errors.
>>>
>>> The discussion notes that security_prepare_creds()
>>> is not appropriate for MAC policies, and instead the hook is
>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>
>>> Ultimately, we concluded that a better course of action is to introduce
>>> a new security hook for LSM authors. [3]
>>>
>>> This patch set first introduces a new security_create_user_ns() function
>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>> Patch 1 and 4 still need review from the lsm/security side.
> 
> 
> This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> 
> I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> 

Based on last weeks comments, should I go ahead and put up v4 for 
5.20-rc1 when that drops, or do I need to wait for more feedback?

> --
> paul-moore.com
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-01 13:13     ` Frederick Lawler
@ 2022-08-01 15:19       ` Paul Moore
  2022-08-02 21:24         ` KP Singh
  2022-08-01 15:25       ` Casey Schaufler
  1 sibling, 1 reply; 25+ messages in thread
From: Paul Moore @ 2022-08-01 15:19 UTC (permalink / raw)
  To: Frederick Lawler, kpsingh
  Cc: Martin KaFai Lau, revest, jackmanb, ast, daniel, andrii,
	songliubraving, yhs, john.fastabend, jmorris, serge,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest,
	linux-kernel, netdev, kernel-team, cgzones, karl

On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> On 7/22/22 7:20 AM, Paul Moore wrote:
> > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> >>> a call to create_user_ns().
> >>>
> >>> The calls look something like this:
> >>>
> >>> cred = prepare_creds()
> >>> security_prepare_creds()
> >>> call_int_hook(cred_prepare, ...
> >>> if (cred)
> >>> create_user_ns(cred)
> >>>
> >>> We noticed that error codes were not propagated from this hook and
> >>> introduced a patch [1] to propagate those errors.
> >>>
> >>> The discussion notes that security_prepare_creds()
> >>> is not appropriate for MAC policies, and instead the hook is
> >>> meant for LSM authors to prepare credentials for mutation. [2]
> >>>
> >>> Ultimately, we concluded that a better course of action is to introduce
> >>> a new security hook for LSM authors. [3]
> >>>
> >>> This patch set first introduces a new security_create_user_ns() function
> >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >> Patch 1 and 4 still need review from the lsm/security side.
> >
> > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> >
> > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
>
> Based on last weeks comments, should I go ahead and put up v4 for
> 5.20-rc1 when that drops, or do I need to wait for more feedback?

In general it rarely hurts to make another revision, and I think
you've gotten some decent feedback on this draft, especially around
the BPF LSM tests; I think rebasing on Linus tree after the upcoming
io_uring changes are merged would be a good idea.  Although as a
reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
need an ACK from you guys before I merge the BPF related patches
(patches {2,3}/4).  For the record, I think the SELinux portion of
this patchset (path 4/4) is fine.

There is the issue of Eric's NACK, but I believe the responses that
followed his comment sufficiently addressed those concerns and it has
now been a week with no further comment from Eric; we should continue
to move forward with this.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-01 13:13     ` Frederick Lawler
  2022-08-01 15:19       ` Paul Moore
@ 2022-08-01 15:25       ` Casey Schaufler
  2022-08-01 16:34         ` Paul Moore
  1 sibling, 1 reply; 25+ messages in thread
From: Casey Schaufler @ 2022-08-01 15:25 UTC (permalink / raw)
  To: Frederick Lawler, Paul Moore, Martin KaFai Lau
  Cc: kpsingh, revest, jackmanb, ast, daniel, andrii, songliubraving,
	yhs, john.fastabend, jmorris, serge, stephen.smalley.work,
	eparis, shuah, brauner, ebiederm, bpf, linux-security-module,
	selinux, linux-kselftest, linux-kernel, netdev, kernel-team,
	cgzones, karl, casey

On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> On 7/22/22 7:20 AM, Paul Moore wrote:
>> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>>
>>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>>> While creating a LSM BPF MAC policy to block user namespace
>>>> creation, we
>>>> used the LSM cred_prepare hook because that is the closest hook to
>>>> prevent
>>>> a call to create_user_ns().
>>>>
>>>> The calls look something like this:
>>>>
>>>> cred = prepare_creds()
>>>> security_prepare_creds()
>>>> call_int_hook(cred_prepare, ...
>>>> if (cred)
>>>> create_user_ns(cred)
>>>>
>>>> We noticed that error codes were not propagated from this hook and
>>>> introduced a patch [1] to propagate those errors.
>>>>
>>>> The discussion notes that security_prepare_creds()
>>>> is not appropriate for MAC policies, and instead the hook is
>>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>>
>>>> Ultimately, we concluded that a better course of action is to
>>>> introduce
>>>> a new security hook for LSM authors. [3]
>>>>
>>>> This patch set first introduces a new security_create_user_ns()
>>>> function
>>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>>> Patch 1 and 4 still need review from the lsm/security side.
>>
>>
>> This patchset is in my review queue and assuming everything checks
>> out, I expect to merge it after the upcoming merge window closes.
>>
>> I would also need an ACK from the BPF LSM folks, but they're CC'd on
>> this patchset.
>>
>
> Based on last weeks comments, should I go ahead and put up v4 for
> 5.20-rc1 when that drops, or do I need to wait for more feedback?

As the primary consumer of this hook is BPF I would really expect their
reviewed-by before accepting this. 

>
>> -- 
>> paul-moore.com
>>
>>
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-01 15:25       ` Casey Schaufler
@ 2022-08-01 16:34         ` Paul Moore
  2022-08-02 21:27           ` KP Singh
  0 siblings, 1 reply; 25+ messages in thread
From: Paul Moore @ 2022-08-01 16:34 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Frederick Lawler, Martin KaFai Lau, kpsingh, revest, jackmanb,
	ast, daniel, andrii, songliubraving, yhs, john.fastabend,
	jmorris, serge, stephen.smalley.work, eparis, shuah, brauner,
	ebiederm, bpf, linux-security-module, selinux, linux-kselftest,
	linux-kernel, netdev, kernel-team, cgzones, karl

On Mon, Aug 1, 2022 at 11:25 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> > On 7/22/22 7:20 AM, Paul Moore wrote:
> >> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >>
> >>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> >>>> While creating a LSM BPF MAC policy to block user namespace
> >>>> creation, we
> >>>> used the LSM cred_prepare hook because that is the closest hook to
> >>>> prevent
> >>>> a call to create_user_ns().
> >>>>
> >>>> The calls look something like this:
> >>>>
> >>>> cred = prepare_creds()
> >>>> security_prepare_creds()
> >>>> call_int_hook(cred_prepare, ...
> >>>> if (cred)
> >>>> create_user_ns(cred)
> >>>>
> >>>> We noticed that error codes were not propagated from this hook and
> >>>> introduced a patch [1] to propagate those errors.
> >>>>
> >>>> The discussion notes that security_prepare_creds()
> >>>> is not appropriate for MAC policies, and instead the hook is
> >>>> meant for LSM authors to prepare credentials for mutation. [2]
> >>>>
> >>>> Ultimately, we concluded that a better course of action is to
> >>>> introduce
> >>>> a new security hook for LSM authors. [3]
> >>>>
> >>>> This patch set first introduces a new security_create_user_ns()
> >>>> function
> >>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >>> Patch 1 and 4 still need review from the lsm/security side.
> >>
> >>
> >> This patchset is in my review queue and assuming everything checks
> >> out, I expect to merge it after the upcoming merge window closes.
> >>
> >> I would also need an ACK from the BPF LSM folks, but they're CC'd on
> >> this patchset.
> >
> > Based on last weeks comments, should I go ahead and put up v4 for
> > 5.20-rc1 when that drops, or do I need to wait for more feedback?
>
> As the primary consumer of this hook is BPF I would really expect their
> reviewed-by before accepting this.

We love all our in-tree LSMs equally.  As long as there is at least
one LSM which provides an implementation and has ACK'd the hook, and
no other LSMs have NACK'd the hook, then I have no problem merging it.
I doubt it will be necessary in this case, but if we need to tweak the
hook in the future we can definitely do that; we've done this in the
past when it has made sense.

As a reminder, the LSM hooks are *not* part of the "don't break
userspace" promise.  I know it gets a little muddy with the way the
BPF LSM works, but just as we don't want to allow one LSM to impact
the runtime controls on another, we don't want to allow one LSM to
freeze the hooks for everyone.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-01 15:19       ` Paul Moore
@ 2022-08-02 21:24         ` KP Singh
  2022-08-03  1:49           ` Paul Moore
  0 siblings, 1 reply; 25+ messages in thread
From: KP Singh @ 2022-08-02 21:24 UTC (permalink / raw)
  To: Paul Moore
  Cc: Frederick Lawler, Martin KaFai Lau, revest, jackmanb, ast,
	daniel, andrii, songliubraving, yhs, john.fastabend, jmorris,
	serge, stephen.smalley.work, eparis, shuah, brauner, casey,
	ebiederm, bpf, linux-security-module, selinux, linux-kselftest,
	linux-kernel, netdev, kernel-team, cgzones, karl

On Mon, Aug 1, 2022 at 5:19 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> > On 7/22/22 7:20 AM, Paul Moore wrote:
> > > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> > >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> > >>> a call to create_user_ns().
> > >>>
> > >>> The calls look something like this:
> > >>>
> > >>> cred = prepare_creds()
> > >>> security_prepare_creds()
> > >>> call_int_hook(cred_prepare, ...
> > >>> if (cred)
> > >>> create_user_ns(cred)
> > >>>
> > >>> We noticed that error codes were not propagated from this hook and
> > >>> introduced a patch [1] to propagate those errors.
> > >>>
> > >>> The discussion notes that security_prepare_creds()
> > >>> is not appropriate for MAC policies, and instead the hook is
> > >>> meant for LSM authors to prepare credentials for mutation. [2]
> > >>>
> > >>> Ultimately, we concluded that a better course of action is to introduce
> > >>> a new security hook for LSM authors. [3]
> > >>>
> > >>> This patch set first introduces a new security_create_user_ns() function
> > >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > >> Patch 1 and 4 still need review from the lsm/security side.
> > >
> > > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> > >
> > > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> >
> > Based on last weeks comments, should I go ahead and put up v4 for
> > 5.20-rc1 when that drops, or do I need to wait for more feedback?
>
> In general it rarely hurts to make another revision, and I think
> you've gotten some decent feedback on this draft, especially around
> the BPF LSM tests; I think rebasing on Linus tree after the upcoming
> io_uring changes are merged would be a good idea.  Although as a
> reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
> need an ACK from you guys before I merge the BPF related patches

Apologies, I was on vacation. I am looking at the patches now.
Reviews and acks coming soon :)

- KP

> (patches {2,3}/4).  For the record, I think the SELinux portion of
> this patchset (path 4/4) is fine.
>

[...]

>
> --
> paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-01 16:34         ` Paul Moore
@ 2022-08-02 21:27           ` KP Singh
  0 siblings, 0 replies; 25+ messages in thread
From: KP Singh @ 2022-08-02 21:27 UTC (permalink / raw)
  To: Paul Moore
  Cc: Casey Schaufler, Frederick Lawler, Martin KaFai Lau, revest,
	jackmanb, ast, daniel, andrii, songliubraving, yhs,
	john.fastabend, jmorris, serge, stephen.smalley.work, eparis,
	shuah, brauner, ebiederm, bpf, linux-security-module, selinux,
	linux-kselftest, linux-kernel, netdev, kernel-team, cgzones,
	karl

On Mon, Aug 1, 2022 at 6:35 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 1, 2022 at 11:25 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> > > On 7/22/22 7:20 AM, Paul Moore wrote:
> > >> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >>
> > >>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > >>>> While creating a LSM BPF MAC policy to block user namespace
> > >>>> creation, we
> > >>>> used the LSM cred_prepare hook because that is the closest hook to
> > >>>> prevent
> > >>>> a call to create_user_ns().
> > >>>>
> > >>>> The calls look something like this:
> > >>>>
> > >>>> cred = prepare_creds()
> > >>>> security_prepare_creds()
> > >>>> call_int_hook(cred_prepare, ...
> > >>>> if (cred)
> > >>>> create_user_ns(cred)
> > >>>>
> > >>>> We noticed that error codes were not propagated from this hook and
> > >>>> introduced a patch [1] to propagate those errors.
> > >>>>
> > >>>> The discussion notes that security_prepare_creds()
> > >>>> is not appropriate for MAC policies, and instead the hook is
> > >>>> meant for LSM authors to prepare credentials for mutation. [2]
> > >>>>
> > >>>> Ultimately, we concluded that a better course of action is to
> > >>>> introduce
> > >>>> a new security hook for LSM authors. [3]
> > >>>>
> > >>>> This patch set first introduces a new security_create_user_ns()
> > >>>> function
> > >>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > >>> Patch 1 and 4 still need review from the lsm/security side.
> > >>
> > >>
> > >> This patchset is in my review queue and assuming everything checks
> > >> out, I expect to merge it after the upcoming merge window closes.
> > >>
> > >> I would also need an ACK from the BPF LSM folks, but they're CC'd on
> > >> this patchset.
> > >
> > > Based on last weeks comments, should I go ahead and put up v4 for
> > > 5.20-rc1 when that drops, or do I need to wait for more feedback?
> >
> > As the primary consumer of this hook is BPF I would really expect their
> > reviewed-by before accepting this.
>
> We love all our in-tree LSMs equally.  As long as there is at least
> one LSM which provides an implementation and has ACK'd the hook, and
> no other LSMs have NACK'd the hook, then I have no problem merging it.
> I doubt it will be necessary in this case, but if we need to tweak the
> hook in the future we can definitely do that; we've done this in the
> past when it has made sense.
>
> As a reminder, the LSM hooks are *not* part of the "don't break
> userspace" promise.  I know it gets a little muddy with the way the

That's correct. Also, with BPF LSM, we encourage users to
build the application / bpf program logic to be resilient to changes
in the LSM hooks.

I am happy to share how we've done it, if folks are interested.

- KP

> BPF LSM works, but just as we don't want to allow one LSM to impact
> the runtime controls on another, we don't want to allow one LSM to
> freeze the hooks for everyone.
>
> --
> paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-07-22 12:20   ` Paul Moore
  2022-08-01 13:13     ` Frederick Lawler
@ 2022-08-02 21:33     ` Eric W. Biederman
  2022-08-03 15:20       ` Frederick Lawler
  1 sibling, 1 reply; 25+ messages in thread
From: Eric W. Biederman @ 2022-08-02 21:33 UTC (permalink / raw)
  To: Paul Moore
  Cc: Martin KaFai Lau, Frederick Lawler, kpsingh, revest, jackmanb,
	ast, daniel, andrii, songliubraving, yhs, john.fastabend,
	jmorris, serge, stephen.smalley.work, eparis, shuah, brauner,
	casey, bpf, linux-security-module, selinux, linux-kselftest,
	linux-kernel, netdev, kernel-team, cgzones, karl

Paul Moore <paul@paul-moore.com> writes:

> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>> a call to create_user_ns().
>>>
>>> The calls look something like this:
>>>
>>> cred = prepare_creds()
>>> security_prepare_creds()
>>> call_int_hook(cred_prepare, ...
>>> if (cred)
>>> create_user_ns(cred)
>>>
>>> We noticed that error codes were not propagated from this hook and
>>> introduced a patch [1] to propagate those errors.
>>>
>>> The discussion notes that security_prepare_creds()
>>> is not appropriate for MAC policies, and instead the hook is
>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>
>>> Ultimately, we concluded that a better course of action is to introduce
>>> a new security hook for LSM authors. [3]
>>>
>>> This patch set first introduces a new security_create_user_ns() function
>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>> Patch 1 and 4 still need review from the lsm/security side.
>
>
> This patchset is in my review queue and assuming everything checks
> out, I expect to merge it after the upcoming merge window closes.

It doesn't even address my issues with the last patchset.

So it has my NACK.

Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-02 21:24         ` KP Singh
@ 2022-08-03  1:49           ` Paul Moore
  0 siblings, 0 replies; 25+ messages in thread
From: Paul Moore @ 2022-08-03  1:49 UTC (permalink / raw)
  To: KP Singh, Frederick Lawler
  Cc: Martin KaFai Lau, revest, jackmanb, ast, daniel, andrii,
	songliubraving, yhs, john.fastabend, jmorris, serge,
	stephen.smalley.work, eparis, shuah, brauner, casey, ebiederm,
	bpf, linux-security-module, selinux, linux-kselftest,
	linux-kernel, netdev, kernel-team, cgzones, karl

On Tue, Aug 2, 2022 at 5:25 PM KP Singh <kpsingh@kernel.org> wrote:
> On Mon, Aug 1, 2022 at 5:19 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> > > On 7/22/22 7:20 AM, Paul Moore wrote:
> > > > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > > >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> > > >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> > > >>> a call to create_user_ns().
> > > >>>
> > > >>> The calls look something like this:
> > > >>>
> > > >>> cred = prepare_creds()
> > > >>> security_prepare_creds()
> > > >>> call_int_hook(cred_prepare, ...
> > > >>> if (cred)
> > > >>> create_user_ns(cred)
> > > >>>
> > > >>> We noticed that error codes were not propagated from this hook and
> > > >>> introduced a patch [1] to propagate those errors.
> > > >>>
> > > >>> The discussion notes that security_prepare_creds()
> > > >>> is not appropriate for MAC policies, and instead the hook is
> > > >>> meant for LSM authors to prepare credentials for mutation. [2]
> > > >>>
> > > >>> Ultimately, we concluded that a better course of action is to introduce
> > > >>> a new security hook for LSM authors. [3]
> > > >>>
> > > >>> This patch set first introduces a new security_create_user_ns() function
> > > >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > > >> Patch 1 and 4 still need review from the lsm/security side.
> > > >
> > > > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> > > >
> > > > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> > >
> > > Based on last weeks comments, should I go ahead and put up v4 for
> > > 5.20-rc1 when that drops, or do I need to wait for more feedback?
> >
> > In general it rarely hurts to make another revision, and I think
> > you've gotten some decent feedback on this draft, especially around
> > the BPF LSM tests; I think rebasing on Linus tree after the upcoming
> > io_uring changes are merged would be a good idea.

As I was typing up my reply I realized I mistakenly mentioned the
io_uring changes that Linus just merged today - oops!  If you haven't
figured it out already, you can disregard that comment, that's a
completely different problem and a completely different set of patches
:)

> > Although as a
> > reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
> > need an ACK from you guys before I merge the BPF related patches
>
> Apologies, I was on vacation. I am looking at the patches now.
> Reviews and acks coming soon :)

No worries, we've still got the two weeks of the merge window before I
can do anything into linux-next - thanks KP!

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 0/4] Introduce security_create_user_ns()
  2022-08-02 21:33     ` Eric W. Biederman
@ 2022-08-03 15:20       ` Frederick Lawler
  0 siblings, 0 replies; 25+ messages in thread
From: Frederick Lawler @ 2022-08-03 15:20 UTC (permalink / raw)
  To: Eric W. Biederman, Paul Moore
  Cc: Martin KaFai Lau, kpsingh, revest, jackmanb, ast, daniel, andrii,
	songliubraving, yhs, john.fastabend, jmorris, serge,
	stephen.smalley.work, eparis, shuah, brauner, casey, bpf,
	linux-security-module, selinux, linux-kselftest, linux-kernel,
	netdev, kernel-team, cgzones, karl, tixxdz

On 8/2/22 4:33 PM, Eric W. Biederman wrote:
> Paul Moore <paul@paul-moore.com> writes:
> 
>> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>>
>>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>>> a call to create_user_ns().
>>>>
>>>> The calls look something like this:
>>>>
>>>> cred = prepare_creds()
>>>> security_prepare_creds()
>>>> call_int_hook(cred_prepare, ...
>>>> if (cred)
>>>> create_user_ns(cred)
>>>>
>>>> We noticed that error codes were not propagated from this hook and
>>>> introduced a patch [1] to propagate those errors.
>>>>
>>>> The discussion notes that security_prepare_creds()
>>>> is not appropriate for MAC policies, and instead the hook is
>>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>>
>>>> Ultimately, we concluded that a better course of action is to introduce
>>>> a new security hook for LSM authors. [3]
>>>>
>>>> This patch set first introduces a new security_create_user_ns() function
>>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>>> Patch 1 and 4 still need review from the lsm/security side.
>>
>>
>> This patchset is in my review queue and assuming everything checks
>> out, I expect to merge it after the upcoming merge window closes.
> 
> It doesn't even address my issues with the last patchset.

Are you referring to [1], and with regards to [2], is the issue that the 
wording could be improved for both the cover letter and patch 1/4?

Ultimately, the goal of CF is to leverage and use user namespaces and 
block tasks whose meta information do not align with our allow list 
criteria. Yes, there is a higher goal of restricting our attack surface. 
Yes, people will find ways around security. The point is to have 
multiple levels of security, and this patch series allows people to add 
another level.

Calling this hook a regression is not true since there's no actual 
regression in the code. What would constitute a perceived regression is 
an admin imposing such a SELinux or BPF restriction within their 
company, but developers in that company ideally would try to work with 
the admin to enable user namespaces for certain use cases, or 
alternatively do what you don't want given current tooling: always run 
code as root. That's where this hook comes in: let people observe and 
enforce how they see fit. The average enthusiasts would see no impact.

I was requested to add _some_ test to BPF and to add a SELinux 
implementation. The low hanging fruit for a test to prove that the hook 
is capable of doing _something_ was to simply just block outright, and 
provide _some example_ of use. It doesn't make sense for us to write a 
test that outlines specifically what CF or others are doing because that 
would put too much emphasis on an implementation detail that doesn't 
matter to prove that the hook works.

Without Djalal's comment, I can't defend an observability use case that 
we're not currently leveraging. We have it now, so therefore I'll defend 
it per KP's suggestion[3] in v5.

By not responding to the email discussions, we can't accurately gauge 
what should or should not be in the descriptions. No one here 
necessarily disagrees with some of the points you made, and others have 
appropriately responded. As others have also wrote, you're not proposing 
alternatives. How do you expect us to work with that?

Please, let us know which bits and pieces ought to be included in the 
descriptions, and let us know what things we should call out caveats to 
that would satisfy your concerns.

Links:
1. 
https://lore.kernel.org/all/01368386-521f-230b-1d49-de19377c27d1@cloudflare.com/
2. 
https://lore.kernel.org/all/877d45kri4.fsf@email.froward.int.ebiederm.org/#t
3. 
https://lore.kernel.org/all/CACYkzJ4x90DamdN4dRCn1gZuAHLqJNy4MoP=qTX+44Bqx1uxSQ@mail.gmail.com/
4. 
https://lore.kernel.org/all/CAEiveUdPhEPAk7Y0ZXjPsD=Vb5hn453CHzS9aG-tkyRa8bf_eg@mail.gmail.com/#t

> 
> So it has my NACK.
> 
> Eric


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2022-08-03 15:20 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-21 17:28 [PATCH v3 0/4] Introduce security_create_user_ns() Frederick Lawler
2022-07-21 17:28 ` [PATCH v3 1/4] security, lsm: " Frederick Lawler
2022-07-22  8:21   ` Christian Brauner
2022-07-21 17:28 ` [PATCH v3 2/4] bpf-lsm: Make bpf_lsm_userns_create() sleepable Frederick Lawler
2022-07-22  8:18   ` Christian Brauner
2022-07-21 17:28 ` [PATCH v3 3/4] selftests/bpf: Add tests verifying bpf lsm userns_create hook Frederick Lawler
2022-07-22  6:07   ` Martin KaFai Lau
2022-07-22 13:41     ` Frederick Lawler
2022-07-22  8:15   ` Christian Brauner
2022-07-21 17:28 ` [PATCH v3 4/4] selinux: Implement " Frederick Lawler
2022-07-22  6:11 ` [PATCH v3 0/4] Introduce security_create_user_ns() Martin KaFai Lau
2022-07-22 12:20   ` Paul Moore
2022-08-01 13:13     ` Frederick Lawler
2022-08-01 15:19       ` Paul Moore
2022-08-02 21:24         ` KP Singh
2022-08-03  1:49           ` Paul Moore
2022-08-01 15:25       ` Casey Schaufler
2022-08-01 16:34         ` Paul Moore
2022-08-02 21:27           ` KP Singh
2022-08-02 21:33     ` Eric W. Biederman
2022-08-03 15:20       ` Frederick Lawler
2022-07-22 17:05 ` Eric W. Biederman
2022-07-25 22:53   ` Paul Moore
2022-07-26 12:02   ` Djalal Harouni
2022-07-26 14:41   ` Ignat Korchagin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).