linux-kernel.vger.kernel.org archive mirror
* [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns
@ 2021-11-30 16:06 Stefan Berger
  2021-11-30 16:06 ` [RFC 01/20] ima: Add IMA namespace support Stefan Berger
                   ` (19 more replies)
  0 siblings, 20 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

The goal of this series of patches is to start with the namespacing of
IMA and support auditing within an IMA namespace (IMA-ns) as the first
step.

In this series the IMA namespace piggybacks on the user namespace, so
an IMA namespace gets created whenever a user namespace is created. The
advantage of this is that the user namespace can provide the keys
infrastructure that IMA appraisal support will need later on.

We chose the goal of supporting auditing within an IMA namespace since it
requires the fewest changes to IMA. With this series applied, a user can
activate auditing within an IMA namespace by running the following lines.
They rely on a statically linked busybox being installed on the host for
execution within the minimal container environment:

mkdir -p rootfs/{bin,mnt,proc}
cp /sbin/busybox rootfs/bin
PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
  --root rootfs busybox sh -c \
 "busybox mount -t securityfs_ns /mnt /mnt; \
  busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
  busybox cat /mnt/ima/policy"

When following the audit log on the host, the last line, which cats the
IMA policy inside the namespace, will have been audited. Unfortunately the
audit record is not distinguishable from one stemming from actions on the
host. The hope here is that Richard Briggs's container ID support for
auditing will help resolve the problem.

The following lines added to a suitable IMA policy on the host would
cause the execution of the commands inside the container (by uid 1000)
to be measured and audited as well on the host, thus leading to two
auditing messages for the 'busybox cat' above and log entries in IMA's
system log.

echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
        "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
    > /sys/kernel/security/ima/policy

The goal of supporting measurement and auditing by the host of actions
occurring within IMA namespaces is that users, particularly root,
should not be able to evade the host's IMA policy just by spawning
new IMA namespaces, running programs there, and discarding the namespaces
again. This is achieved through 'hierarchical processing' of file
accesses: each access is evaluated against the policy of the namespace
where the action occurred and against the policies of all of its ancestor
namespaces, leading back to the root IMA namespace (init_ima_ns).
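
To illustrate the hierarchical processing described above, here is a
minimal user-space sketch in C. It is not code from this series:
struct demo_ima_ns, its parent pointer, and demo_process_exec() are
hypothetical names, and each namespace's policy is reduced to a single
flag. The only point is the walk from the namespace where the action
occurred back to the root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ima_namespace; only what the walk needs. */
struct demo_ima_ns {
	struct demo_ima_ns *parent;	/* NULL for the root (init_ima_ns) */
	bool audit_exec;		/* does this namespace's policy audit exec? */
	unsigned int audit_count;	/* audit records emitted for this namespace */
};

/*
 * Evaluate one exec event against the namespace where it happened and
 * then against every ancestor up to the root. Returns the number of
 * namespaces whose policy matched.
 */
static unsigned int demo_process_exec(struct demo_ima_ns *ns)
{
	unsigned int matched = 0;

	assert(ns != NULL);
	for (; ns; ns = ns->parent) {
		if (ns->audit_exec) {
			ns->audit_count++;
			matched++;
		}
	}
	return matched;
}
```

With an audit rule in both the container's namespace and the root
namespace, one exec in the container yields two matches, which mirrors
the two auditing messages mentioned for the 'busybox cat' example.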

The patch series adds support for virtualizing SecurityFS and introduces
a derivative of SecurityFS called 'securityfs_ns' whose goal is to
only display the data relevant to the IMA namespace rather than showing
files and directories of other security subsystems (TPM, evm, Tomoyo,
safesetid) that use the existing SecurityFS and that would be visible
if one were to bind-mount SecurityFS into a mount namespace.

Much of the code leading up to the virtualization of SecurityFS deals
with moving IMA's variables from various files into the IMA namespace
structure called 'ima_namespace'. To determine the current IMA namespace,
I took the approach of getting it (get_current_ns()) at the top level and
passing the pointer all the way down to those functions that now need
access to the ima_namespace to get to their variables. This comes in handy
later on, once hierarchical processing is implemented in this series: there
we walk the list of namespaces backwards and again need to pass the pointer
into functions.

A side effect of IMA being a child of the user namespace becomes apparent
when virtualizing SecurityFS: the filesystem code needs to take an
additional reference to the user namespace (for the keyed filesystem), and
that one additional reference prevents the user namespace from being
deleted. The work-around is to introduce an early teardown reference
counter and a related IMA function that gets invoked when the user
namespace reference counter has reached a certain value, '1'. Freeing the
filesystem then finally also leads to the deletion of the user namespace.
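
The early teardown idea can be sketched in plain C as follows. This is
a simplified illustration under stated assumptions, not the code from
the patches: struct demo_user_ns and both functions are hypothetical,
and the separate early teardown counter mentioned above is collapsed
into the single reference count to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a user namespace pinned by its filesystem. */
struct demo_user_ns {
	int refcount;		/* all references, including the filesystem's */
	bool fs_holds_ref;	/* the filesystem instance pins the namespace */
	bool freed;		/* set once the last reference is gone */
};

static void demo_put_user_ns(struct demo_user_ns *ns);

/* Early teardown: free the filesystem, which drops its pinned reference. */
static void demo_early_teardown(struct demo_user_ns *ns)
{
	ns->fs_holds_ref = false;
	demo_put_user_ns(ns);
}

static void demo_put_user_ns(struct demo_user_ns *ns)
{
	assert(ns->refcount > 0);
	ns->refcount--;
	if (ns->refcount == 0)
		ns->freed = true;
	else if (ns->refcount == 1 && ns->fs_holds_ref)
		/* Only the filesystem is left: trigger the early teardown. */
		demo_early_teardown(ns);
}
```

Without the check for the value 1, the count in this model would stay
at 1 forever because the filesystem never gives up its reference on its
own.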

This series also introduces CAP_INTEGRITY_ADMIN as a subset of
CAP_SYS_ADMIN's capabilities that allows access to the IMA policy via
reduced capabilities. Later on we would use this capability again to allow
users to set file extended attributes for IMA appraisal support.

The basis for this series of patches is Linux v5.15.
My tree with these patches is here:
https://github.com/stefanberger/linux-ima-namespaces/tree/v5.15%2Bimans.20211119.v4

Regards,
   Stefan


Denis Semakin (2):
  capabilities: Introduce CAP_INTEGRITY_ADMIN
  ima: Use integrity_admin_ns_capable() to check corresponding
    capability

Mehmet Kayaalp (2):
  ima: Define ns_status for storing namespaced iint data
  ima: Namespace audit status flags

Stefan Berger (16):
  ima: Add IMA namespace support
  ima: Move delayed work queue and variables into ima_namespace
  ima: Move IMA's keys queue related variables into ima_namespace
  ima: Move policy related variables into ima_namespace
  ima: Move ima_htable into ima_namespace
  ima: Move measurement list related variables into ima_namespace
  ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for
    now
  ima: Implement hierarchical processing of file accesses
  securityfs: Prefix global variables with securityfs_
  securityfs: Pass static variables as parameters from top level
    functions
  securityfs: Build securityfs_ns for namespacing support
  ima: Move some IMA policy and filesystem related variables into
    ima_namespace
  ima: Use ns_capable() for namespace policy access
  userns: Introduce a refcount variable for calling early teardown
    function
  ima/userns: Define early teardown function for IMA namespace
  ima: Setup securityfs_ns for IMA namespace

 include/linux/capability.h                   |   6 +
 include/linux/ima.h                          | 132 ++++++++-
 include/linux/security.h                     |  18 ++
 include/linux/user_namespace.h               |  21 +-
 include/uapi/linux/capability.h              |   7 +-
 include/uapi/linux/magic.h                   |   1 +
 init/Kconfig                                 |  12 +
 kernel/user.c                                |   9 +-
 kernel/user_namespace.c                      |  16 +
 security/inode.c                             | 290 ++++++++++++++++---
 security/integrity/ima/Makefile              |   4 +-
 security/integrity/ima/ima.h                 | 147 ++++++----
 security/integrity/ima/ima_api.c             |  33 ++-
 security/integrity/ima/ima_appraise.c        |  26 +-
 security/integrity/ima/ima_asymmetric_keys.c |   8 +-
 security/integrity/ima/ima_fs.c              | 236 +++++++++++++--
 security/integrity/ima/ima_init.c            |  14 +-
 security/integrity/ima/ima_init_ima_ns.c     |  77 +++++
 security/integrity/ima/ima_main.c            | 128 +++++---
 security/integrity/ima/ima_ns.c              | 119 ++++++++
 security/integrity/ima/ima_ns_status.c       | 132 +++++++++
 security/integrity/ima/ima_policy.c          | 142 +++++----
 security/integrity/ima/ima_queue.c           |  75 +++--
 security/integrity/ima/ima_queue_keys.c      |  73 ++---
 security/integrity/ima/ima_template.c        |   4 +-
 security/selinux/include/classmap.h          |   4 +-
 26 files changed, 1394 insertions(+), 340 deletions(-)
 create mode 100644 security/integrity/ima/ima_init_ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns_status.c

-- 
2.31.1



* [RFC 01/20] ima: Add IMA namespace support
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 02/20] ima: Define ns_status for storing namespaced iint data Stefan Berger
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger, James Bottomley

Implement an IMA namespace data structure that gets created alongside a
user namespace with CLONE_NEWUSER. This lays down the foundation for
namespacing the different aspects of IMA (e.g. IMA-audit, IMA-measurement,
IMA-appraisal).

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 include/linux/ima.h                      | 59 +++++++++++++++++
 include/linux/user_namespace.h           |  4 ++
 init/Kconfig                             |  9 +++
 kernel/user.c                            |  9 ++-
 kernel/user_namespace.c                  | 16 +++++
 security/integrity/ima/Makefile          |  3 +-
 security/integrity/ima/ima.h             |  4 ++
 security/integrity/ima/ima_init.c        |  4 ++
 security/integrity/ima/ima_init_ima_ns.c | 32 +++++++++
 security/integrity/ima/ima_ns.c          | 82 ++++++++++++++++++++++++
 10 files changed, 220 insertions(+), 2 deletions(-)
 create mode 100644 security/integrity/ima/ima_init_ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns.c

diff --git a/include/linux/ima.h b/include/linux/ima.h
index b6ab66a546ae..86d126b9ff2f 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -11,6 +11,7 @@
 #include <linux/fs.h>
 #include <linux/security.h>
 #include <linux/kexec.h>
+#include <linux/user_namespace.h>
 #include <crypto/hash_info.h>
 struct linux_binprm;
 
@@ -210,6 +211,64 @@ static inline int ima_inode_removexattr(struct dentry *dentry,
 }
 #endif /* CONFIG_IMA_APPRAISE */
 
+struct ima_namespace {
+	struct kref kref;
+	struct user_namespace *user_ns;
+};
+
+extern struct ima_namespace init_ima_ns;
+
+#ifdef CONFIG_IMA_NS
+
+void free_ima_ns(struct kref *kref);
+
+static inline struct ima_namespace *get_ima_ns(struct ima_namespace *ns)
+{
+	if (ns)
+		kref_get(&ns->kref);
+
+	return ns;
+}
+
+static inline void put_ima_ns(struct ima_namespace *ns)
+{
+	if (ns) {
+		pr_debug("DEREF   ima_ns: 0x%p  ctr: %d\n", ns, kref_read(&ns->kref));
+		kref_put(&ns->kref, free_ima_ns);
+	}
+}
+
+struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
+				  struct user_namespace *user_ns);
+
+static inline struct ima_namespace *get_current_ns(void)
+{
+	return current_user_ns()->ima_ns;
+}
+
+#else
+
+static inline struct ima_namespace *get_ima_ns(struct ima_namespace *ns)
+{
+	return ns;
+}
+
+static inline void put_ima_ns(struct ima_namespace *ns)
+{
+}
+
+static inline struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
+						struct user_namespace *user_ns)
+{
+	return old_ns;
+}
+
+static inline struct ima_namespace *get_current_ns(void)
+{
+	return &init_ima_ns;
+}
+#endif /* CONFIG_IMA_NS */
+
 #if defined(CONFIG_IMA_APPRAISE) && defined(CONFIG_INTEGRITY_TRUSTED_KEYRING)
 extern bool ima_appraise_signature(enum kernel_read_file_id func);
 #else
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 33a4240e6a6f..5249db04d62b 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -36,6 +36,7 @@ struct uid_gid_map { /* 64 bytes -- 1 cache line */
 #define USERNS_INIT_FLAGS USERNS_SETGROUPS_ALLOWED
 
 struct ucounts;
+struct ima_namespace;
 
 enum ucount_type {
 	UCOUNT_USER_NAMESPACES,
@@ -99,6 +100,9 @@ struct user_namespace {
 #endif
 	struct ucounts		*ucounts;
 	long ucount_max[UCOUNT_COUNTS];
+#ifdef CONFIG_IMA
+	struct ima_namespace	*ima_ns;
+#endif
 } __randomize_layout;
 
 struct ucounts {
diff --git a/init/Kconfig b/init/Kconfig
index 11f8a845f259..598baf451a54 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1242,6 +1242,15 @@ config NET_NS
 	  Allow user space to create what appear to be multiple instances
 	  of the network stack.
 
+config IMA_NS
+	bool "IMA namespace"
+	depends on IMA
+	default y
+	help
+	  Allow the creation of IMA namespaces for each user namespace.
+	  Namespaced IMA enables having IMA features work separately
+	  in each IMA namespace.
+
 endif # NAMESPACES
 
 config CHECKPOINT_RESTORE
diff --git a/kernel/user.c b/kernel/user.c
index e2cf8c22b539..b5dc803a033d 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -20,6 +20,10 @@
 #include <linux/user_namespace.h>
 #include <linux/proc_ns.h>
 
+#ifdef CONFIG_IMA
+extern struct ima_namespace init_ima_ns;
+#endif
+
 /*
  * userns count is 1 for root user, 1 for init_uts_ns,
  * and 1 for... ?
@@ -55,7 +59,7 @@ struct user_namespace init_user_ns = {
 			},
 		},
 	},
-	.ns.count = REFCOUNT_INIT(3),
+	.ns.count = REFCOUNT_INIT(4),
 	.owner = GLOBAL_ROOT_UID,
 	.group = GLOBAL_ROOT_GID,
 	.ns.inum = PROC_USER_INIT_INO,
@@ -67,6 +71,9 @@ struct user_namespace init_user_ns = {
 	.keyring_name_list = LIST_HEAD_INIT(init_user_ns.keyring_name_list),
 	.keyring_sem = __RWSEM_INITIALIZER(init_user_ns.keyring_sem),
 #endif
+#ifdef CONFIG_IMA
+	.ima_ns = &init_ima_ns,
+#endif
 };
 EXPORT_SYMBOL_GPL(init_user_ns);
 
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 6b2e3ca7ee99..c26885343b19 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -20,6 +20,7 @@
 #include <linux/fs_struct.h>
 #include <linux/bsearch.h>
 #include <linux/sort.h>
+#include <linux/ima.h>
 
 static struct kmem_cache *user_ns_cachep __read_mostly;
 static DEFINE_MUTEX(userns_state_mutex);
@@ -141,8 +142,20 @@ int create_user_ns(struct cred *new)
 	if (!setup_userns_sysctls(ns))
 		goto fail_keyring;
 
+#ifdef CONFIG_IMA
+	ns->ima_ns = copy_ima_ns(parent_ns->ima_ns, ns);
+	if (IS_ERR(ns->ima_ns)) {
+		ret = PTR_ERR(ns->ima_ns);
+		goto fail_userns_sysctls;
+	}
+#endif
+
 	set_cred_user_ns(new, ns);
 	return 0;
+#ifdef CONFIG_IMA
+fail_userns_sysctls:
+	retire_userns_sysctls(ns);
+#endif
 fail_keyring:
 #ifdef CONFIG_PERSISTENT_KEYRINGS
 	key_put(ns->persistent_keyring_register);
@@ -196,6 +209,9 @@ static void free_user_ns(struct work_struct *work)
 			kfree(ns->projid_map.forward);
 			kfree(ns->projid_map.reverse);
 		}
+#ifdef CONFIG_IMA
+		put_ima_ns(ns->ima_ns);
+#endif
 		retire_userns_sysctls(ns);
 		key_free_user_ns(ns);
 		ns_free_inum(&ns->ns);
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index 2499f2485c04..b86a35fbed60 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -7,13 +7,14 @@
 obj-$(CONFIG_IMA) += ima.o
 
 ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
-	 ima_policy.o ima_template.o ima_template_lib.o
+	 ima_policy.o ima_template.o ima_template_lib.o ima_init_ima_ns.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
 ima-$(CONFIG_IMA_APPRAISE_MODSIG) += ima_modsig.o
 ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
 ima-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
 ima-$(CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS) += ima_asymmetric_keys.o
 ima-$(CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS) += ima_queue_keys.o
+ima-$(CONFIG_IMA_NS) += ima_ns.o
 
 ifeq ($(CONFIG_EFI),y)
 ima-$(CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT) += ima_efi.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index be965a8715e4..2f8adf383054 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -418,6 +418,10 @@ static inline void ima_free_modsig(struct modsig *modsig)
 }
 #endif /* CONFIG_IMA_APPRAISE_MODSIG */
 
+int ima_ns_init(void);
+struct ima_namespace;
+int ima_init_namespace(struct ima_namespace *ns);
+
 /* LSM based policy rules require audit */
 #ifdef CONFIG_IMA_LSM_RULES
 
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index b26fa67476b4..f6ae4557a0da 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -120,6 +120,10 @@ int __init ima_init(void)
 {
 	int rc;
 
+	rc = ima_ns_init();
+	if (rc)
+		return rc;
+
 	ima_tpm_chip = tpm_default_chip();
 	if (!ima_tpm_chip)
 		pr_info("No TPM chip found, activating TPM-bypass!\n");
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
new file mode 100644
index 000000000000..12723d77fe17
--- /dev/null
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2016-2018 IBM Corporation
+ * Author:
+ *   Yuqiong Sun <suny@us.ibm.com>
+ *   Stefan Berger <stefanb@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, version 2 of the License.
+ */
+
+#include <linux/export.h>
+#include <linux/user_namespace.h>
+#include <linux/ima.h>
+#include <linux/proc_ns.h>
+
+int ima_init_namespace(struct ima_namespace *ns)
+{
+	return 0;
+}
+
+int __init ima_ns_init(void)
+{
+	return ima_init_namespace(&init_ima_ns);
+}
+
+struct ima_namespace init_ima_ns = {
+	.kref = KREF_INIT(1),
+	.user_ns = &init_user_ns,
+};
+EXPORT_SYMBOL(init_ima_ns);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
new file mode 100644
index 000000000000..fa8069acc217
--- /dev/null
+++ b/security/integrity/ima/ima_ns.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2016-2018 IBM Corporation
+ * Author:
+ *  Yuqiong Sun <suny@us.ibm.com>
+ *  Stefan Berger <stefanb@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, version 2 of the License.
+ */
+
+#include <linux/kref.h>
+#include <linux/slab.h>
+#include <linux/ima.h>
+#include <linux/mount.h>
+#include <linux/proc_ns.h>
+#include <linux/lsm_hooks.h>
+
+#include "ima.h"
+
+static struct kmem_cache *imans_cachep;
+
+static struct ima_namespace *create_ima_ns(struct user_namespace *user_ns)
+{
+	struct ima_namespace *ns;
+	int err;
+
+	ns = kmem_cache_zalloc(imans_cachep, GFP_KERNEL);
+	if (!ns)
+		return ERR_PTR(-ENOMEM);
+	pr_debug("NEW     ima_ns: 0x%p\n", ns);
+
+	kref_init(&ns->kref);
+	ns->user_ns = user_ns;
+
+	err = ima_init_namespace(ns);
+	if (err)
+		goto fail_free;
+
+	return ns;
+
+fail_free:
+	kmem_cache_free(imans_cachep, ns);
+
+	return ERR_PTR(err);
+}
+
+/**
+ * Copy an ima namespace - create a new one
+ *
+ * @old_ns: old ima namespace to clone
+ * @user_ns: User namespace
+ */
+struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
+				  struct user_namespace *user_ns)
+{
+	return create_ima_ns(user_ns);
+}
+
+static void destroy_ima_ns(struct ima_namespace *ns)
+{
+	pr_debug("DESTROY ima_ns: 0x%p\n", ns);
+	kmem_cache_free(imans_cachep, ns);
+}
+
+void free_ima_ns(struct kref *kref)
+{
+	struct ima_namespace *ns;
+
+	ns = container_of(kref, struct ima_namespace, kref);
+	BUG_ON(ns == &init_ima_ns);
+
+	destroy_ima_ns(ns);
+}
+
+int __init imans_cache_init(void)
+{
+	imans_cachep = KMEM_CACHE(ima_namespace, SLAB_PANIC);
+	return 0;
+}
+subsys_initcall(imans_cache_init)
-- 
2.31.1



* [RFC 02/20] ima: Define ns_status for storing namespaced iint data
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
  2021-11-30 16:06 ` [RFC 01/20] ima: Add IMA namespace support Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 03/20] ima: Namespace audit status flags Stefan Berger
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Mehmet Kayaalp, Stefan Berger

From: Mehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>

This patch adds an rbtree to the IMA namespace structure that stores a
namespaced version of iint->flags in the ns_status struct. Similar to the
integrity_iint_cache, both the iint and the ns_status are looked up using
the inode pointer value. The lookup, allocation, and insertion code is also
similar, except that ns_status is not freed when the inode is freed.
Instead, the lookup verifies that the i_ino and i_generation fields also
match.
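
The lookup rule above can be illustrated with a small user-space sketch.
It is hypothetical throughout: the demo_* types and demo_get_status()
are not part of the patch, and a fixed-size table scanned linearly
stands in for the rbtree. What it keeps is the key point, namely that an
entry found by inode pointer is trusted only if i_ino and i_generation
still match, and is reset otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_inode {
	unsigned long i_ino;
	uint32_t i_generation;
};

struct demo_status_entry {
	const struct demo_inode *inode;	/* lookup key: the pointer value */
	unsigned long i_ino;
	uint32_t i_generation;
	unsigned long flags;
};

#define DEMO_MAX_STATUS 8

struct demo_ns {
	struct demo_status_entry table[DEMO_MAX_STATUS];
	size_t used;
};

static struct demo_status_entry *demo_get_status(struct demo_ns *ns,
						 const struct demo_inode *inode)
{
	struct demo_status_entry *status = NULL;
	size_t i;

	assert(ns != NULL && inode != NULL);
	for (i = 0; i < ns->used; i++) {
		if (ns->table[i].inode == inode) {
			status = &ns->table[i];
			break;
		}
	}

	/* Pointer matched and the entry still describes the same file. */
	if (status && status->i_ino == inode->i_ino &&
	    status->i_generation == inode->i_generation)
		return status;

	if (!status) {
		if (ns->used == DEMO_MAX_STATUS)
			return NULL;	/* table full in this toy model */
		status = &ns->table[ns->used++];
	}

	/* New entry, or a stale one whose inode memory was reused: reset. */
	status->inode = inode;
	status->i_ino = inode->i_ino;
	status->i_generation = inode->i_generation;
	status->flags = 0;
	return status;
}
```

A second lookup of an unchanged inode returns the same entry with its
flags intact, while a changed generation at the same address reuses the
entry but clears the flags.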

Signed-off-by: Mehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>

Changelog:
v2:
 * fixed tree traversal in __ima_ns_status_find()
---
 include/linux/ima.h                      |   3 +
 security/integrity/ima/Makefile          |   1 +
 security/integrity/ima/ima.h             |  24 +++++
 security/integrity/ima/ima_init_ima_ns.c |   9 ++
 security/integrity/ima/ima_ns.c          |   1 +
 security/integrity/ima/ima_ns_status.c   | 132 +++++++++++++++++++++++
 6 files changed, 170 insertions(+)
 create mode 100644 security/integrity/ima/ima_ns_status.c

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 86d126b9ff2f..cc0e8c509fa2 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -214,6 +214,9 @@ static inline int ima_inode_removexattr(struct dentry *dentry,
 struct ima_namespace {
 	struct kref kref;
 	struct user_namespace *user_ns;
+	struct rb_root ns_status_tree;
+	rwlock_t ns_status_lock;
+	struct kmem_cache *ns_status_cache;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index b86a35fbed60..78c84214e109 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -10,6 +10,7 @@ ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
 	 ima_policy.o ima_template.o ima_template_lib.o ima_init_ima_ns.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
 ima-$(CONFIG_IMA_APPRAISE_MODSIG) += ima_modsig.o
+ima-$(CONFIG_IMA_NS) += ima_ns.o ima_ns_status.o
 ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
 ima-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
 ima-$(CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS) += ima_asymmetric_keys.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 2f8adf383054..28896d256e36 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -133,6 +133,14 @@ static inline void ima_load_kexec_buffer(void) {}
  */
 extern bool ima_canonical_fmt;
 
+struct ns_status {
+	struct rb_node rb_node;
+	struct inode *inode;
+	ino_t i_ino;
+	u32 i_generation;
+	unsigned long flags;
+};
+
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
@@ -422,6 +430,22 @@ int ima_ns_init(void);
 struct ima_namespace;
 int ima_init_namespace(struct ima_namespace *ns);
 
+#ifdef CONFIG_IMA_NS
+struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+				    struct inode *inode);
+
+void free_ns_status_cache(struct ima_namespace *ns);
+
+#else
+
+static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+						  struct inode *inode)
+{
+	return NULL;
+}
+
+#endif /* CONFIG_IMA_NS */
+
 /* LSM based policy rules require audit */
 #ifdef CONFIG_IMA_LSM_RULES
 
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 12723d77fe17..1a44963e8ba9 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -14,9 +14,18 @@
 #include <linux/user_namespace.h>
 #include <linux/ima.h>
 #include <linux/proc_ns.h>
+#include <linux/slab.h>
+
+#include "ima.h"
 
 int ima_init_namespace(struct ima_namespace *ns)
 {
+	ns->ns_status_tree = RB_ROOT;
+	rwlock_init(&ns->ns_status_lock);
+	ns->ns_status_cache = KMEM_CACHE(ns_status, SLAB_PANIC);
+	if (!ns->ns_status_cache)
+		return -ENOMEM;
+
 	return 0;
 }
 
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index fa8069acc217..b10d92959033 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -61,6 +61,7 @@ struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
 static void destroy_ima_ns(struct ima_namespace *ns)
 {
 	pr_debug("DESTROY ima_ns: 0x%p\n", ns);
+	free_ns_status_cache(ns);
 	kmem_cache_free(imans_cachep, ns);
 }
 
diff --git a/security/integrity/ima/ima_ns_status.c b/security/integrity/ima/ima_ns_status.c
new file mode 100644
index 000000000000..b143514c2cb1
--- /dev/null
+++ b/security/integrity/ima/ima_ns_status.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2016-2021 IBM Corporation
+ * Author:
+ *  Yuqiong Sun <suny@us.ibm.com>
+ *  Stefan Berger <stefanb@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, version 2 of the License.
+ */
+
+#include <linux/user_namespace.h>
+#include <linux/proc_ns.h>
+#include <linux/ima.h>
+
+#include "ima.h"
+
+void free_ns_status_cache(struct ima_namespace *ns)
+{
+	struct ns_status *status, *next;
+
+	write_lock(&ns->ns_status_lock);
+	rbtree_postorder_for_each_entry_safe(status, next,
+					     &ns->ns_status_tree, rb_node)
+		kmem_cache_free(ns->ns_status_cache, status);
+	ns->ns_status_tree = RB_ROOT;
+	write_unlock(&ns->ns_status_lock);
+	kmem_cache_destroy(ns->ns_status_cache);
+}
+
+/*
+ * __ima_ns_status_find - return the ns_status associated with an inode
+ */
+static struct ns_status *__ima_ns_status_find(struct ima_namespace *ns,
+					      struct inode *inode)
+{
+	struct ns_status *status;
+	struct rb_node *n = ns->ns_status_tree.rb_node;
+
+	while (n) {
+		status = rb_entry(n, struct ns_status, rb_node);
+
+		if (inode < status->inode)
+			n = n->rb_left;
+		else if (inode > status->inode)
+			n = n->rb_right;
+		else
+			break;
+	}
+	if (!n)
+		return NULL;
+
+	return status;
+}
+
+/*
+ * ima_ns_status_find - return the ns_status associated with an inode
+ */
+static struct ns_status *ima_ns_status_find(struct ima_namespace *ns,
+					    struct inode *inode)
+{
+	struct ns_status *status;
+
+	read_lock(&ns->ns_status_lock);
+	status = __ima_ns_status_find(ns, inode);
+	read_unlock(&ns->ns_status_lock);
+
+	return status;
+}
+
+void insert_ns_status(struct ima_namespace *ns, struct inode *inode,
+		      struct ns_status *status)
+{
+	struct rb_node **p;
+	struct rb_node *node, *parent = NULL;
+	struct ns_status *test_status;
+
+	p = &ns->ns_status_tree.rb_node;
+	while (*p) {
+		parent = *p;
+		test_status = rb_entry(parent, struct ns_status, rb_node);
+		if (inode < test_status->inode)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	node = &status->rb_node;
+	rb_link_node(node, parent, p);
+	rb_insert_color(node, &ns->ns_status_tree);
+}
+
+struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+				    struct inode *inode)
+{
+	struct ns_status *status;
+	int skip_insert = 0;
+
+	status = ima_ns_status_find(ns, inode);
+	if (status) {
+		/*
+		 * Unlike integrity_iint_cache we are not free'ing the
+		 * ns_status data when the inode is free'd. So, in addition to
+		 * checking the inode pointer, we need to make sure the
+		 * (i_generation, i_ino) pair matches as well.
+		 */
+		if (inode->i_ino == status->i_ino &&
+		    inode->i_generation == status->i_generation)
+			return status;
+
+		/* Same inode number is reused, overwrite the ns_status */
+		skip_insert = 1;
+	} else {
+		status = kmem_cache_alloc(ns->ns_status_cache, GFP_NOFS);
+		if (!status)
+			return ERR_PTR(-ENOMEM);
+	}
+
+	write_lock(&ns->ns_status_lock);
+
+	if (!skip_insert)
+		insert_ns_status(ns, inode, status);
+
+	status->inode = inode;
+	status->i_ino = inode->i_ino;
+	status->i_generation = inode->i_generation;
+	status->flags = 0UL;
+
+	write_unlock(&ns->ns_status_lock);
+
+	return status;
+}
-- 
2.31.1



* [RFC 03/20] ima: Namespace audit status flags
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
  2021-11-30 16:06 ` [RFC 01/20] ima: Add IMA namespace support Stefan Berger
  2021-11-30 16:06 ` [RFC 02/20] ima: Define ns_status for storing namespaced iint data Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 04/20] ima: Move delayed work queue and variables into ima_namespace Stefan Berger
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Mehmet Kayaalp

From: Mehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>

The iint cache stores whether the file is measured, appraised, audited
etc. This patch moves the IMA_AUDITED flag into the per-namespace
ns_status, enabling the IMA audit mechanism to audit the same file each
time it is accessed in a new namespace.

The ns_status is not looked up if CONFIG_IMA_NS is disabled or if
none of the IMA_NS_STATUS_ACTIONS (currently only IMA_AUDIT) is
enabled.

Read and write operations on the iint flags are replaced with function
calls. For reading, iint_flags() returns the bitwise AND of iint->flags
and ns_status->flags. The ns_status flags are masked with
IMA_NS_STATUS_FLAGS (currently only IMA_AUDITED). Similarly,
set_iint_flags() only writes the masked portion to the ns_status flags,
while the iint flags are set as before. The ns_status parameter added to
ima_audit_measurement() is used with the above functions to query and
set the ns_status flags.
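
The masking can be sketched in user-space C as below. The bodies of the
two helpers are not quoted in this part of the message, so this is only
one plausible reading and not necessarily the exact calculation in the
patch: it assumes the bits covered by the namespace mask are taken from
ns_status and all remaining bits from iint. All demo_* names and the
flag values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; the real ones are defined by the kernel. */
#define DEMO_MEASURED		0x02UL
#define DEMO_AUDITED		0x80UL
#define DEMO_NS_STATUS_FLAGS	DEMO_AUDITED

struct demo_iint {
	unsigned long flags;	/* global, per-inode flags */
};

struct demo_audit_status {
	unsigned long flags;	/* per-namespace flags, masked */
};

/* Read: namespaced bits from the status, everything else from iint. */
static unsigned long demo_iint_flags(const struct demo_iint *iint,
				     const struct demo_audit_status *status)
{
	assert(iint != NULL);
	if (!status)
		return iint->flags;

	return (iint->flags & ~DEMO_NS_STATUS_FLAGS) |
	       (status->flags & DEMO_NS_STATUS_FLAGS);
}

/* Write: iint is set as before, the status gets only the masked bits. */
static unsigned long demo_set_iint_flags(struct demo_iint *iint,
					 struct demo_audit_status *status,
					 unsigned long flags)
{
	assert(iint != NULL);
	iint->flags = flags;
	if (status)
		status->flags = flags & DEMO_NS_STATUS_FLAGS;

	return flags;
}
```

Under this reading, a file already audited in one namespace still
appears unaudited when queried through the status of another namespace,
which is what allows it to be audited again there.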

Signed-off-by: Mehmet Kayaalp <mkayaalp@linux.vnet.ibm.com>

Changelog:
v2:
 * fixed flag calculation in iint_flags()
---
 init/Kconfig                      |  3 +++
 security/integrity/ima/ima.h      | 23 ++++++++++++++++++++++-
 security/integrity/ima/ima_api.c  |  8 +++++---
 security/integrity/ima/ima_main.c | 24 +++++++++++++++++-------
 security/integrity/ima/ima_ns.c   | 20 ++++++++++++++++++++
 5 files changed, 67 insertions(+), 11 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 598baf451a54..d8edd1c03de8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1250,6 +1250,9 @@ config IMA_NS
 	  Allow the creation of IMA namespaces for each user namespace.
 	  Namespaced IMA enables having IMA features work separately
 	  in each IMA namespace.
+	  Currently, only the audit status flags are stored in the namespace,
+	  which allows the same file to be audited each time it is accessed
+	  in a new namespace.
 
 endif # NAMESPACES
 
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 28896d256e36..dd06e16c4e1c 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -282,7 +282,8 @@ int process_buffer_measurement(struct user_namespace *mnt_userns,
 			       int pcr, const char *func_data,
 			       bool buf_hash, u8 *digest, size_t digest_len);
 void ima_audit_measurement(struct integrity_iint_cache *iint,
-			   const unsigned char *filename);
+			   const unsigned char *filename,
+			   struct ns_status *status);
 int ima_alloc_init_template(struct ima_event_data *event_data,
 			    struct ima_template_entry **entry,
 			    struct ima_template_desc *template_desc);
@@ -426,6 +427,9 @@ static inline void ima_free_modsig(struct modsig *modsig)
 }
 #endif /* CONFIG_IMA_APPRAISE_MODSIG */
 
+#define IMA_NS_STATUS_ACTIONS   IMA_AUDIT
+#define IMA_NS_STATUS_FLAGS     IMA_AUDITED
+
 int ima_ns_init(void);
 struct ima_namespace;
 int ima_init_namespace(struct ima_namespace *ns);
@@ -436,6 +440,10 @@ struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
 
 void free_ns_status_cache(struct ima_namespace *ns);
 
+unsigned long iint_flags(struct integrity_iint_cache *iint,
+			 struct ns_status *status);
+unsigned long set_iint_flags(struct integrity_iint_cache *iint,
+			     struct ns_status *status, unsigned long flags);
 #else
 
 static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
@@ -444,6 +452,19 @@ static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
 	return NULL;
 }
 
+static inline unsigned long iint_flags(struct integrity_iint_cache *iint,
+				       struct ns_status *status)
+{
+	return iint->flags;
+}
+
+static inline unsigned long set_iint_flags(struct integrity_iint_cache *iint,
+					   struct ns_status *status,
+					   unsigned long flags)
+{
+	iint->flags = flags;
+	return flags;
+}
 #endif /* CONFIG_IMA_NS */
 
 /* LSM based policy rules require audit */
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index a64fb0130b01..8f7bab17b784 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -342,14 +342,16 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
 }
 
 void ima_audit_measurement(struct integrity_iint_cache *iint,
-			   const unsigned char *filename)
+			   const unsigned char *filename,
+			   struct ns_status *status)
 {
 	struct audit_buffer *ab;
 	char *hash;
 	const char *algo_name = hash_algo_name[iint->ima_hash->algo];
 	int i;
+	unsigned long flags = iint_flags(iint, status);
 
-	if (iint->flags & IMA_AUDITED)
+	if (flags & IMA_AUDITED)
 		return;
 
 	hash = kzalloc((iint->ima_hash->length * 2) + 1, GFP_KERNEL);
@@ -372,7 +374,7 @@ void ima_audit_measurement(struct integrity_iint_cache *iint,
 	audit_log_task_info(ab);
 	audit_log_end(ab);
 
-	iint->flags |= IMA_AUDITED;
+	set_iint_flags(iint, status, flags | IMA_AUDITED);
 out:
 	kfree(hash);
 	return;
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 465865412100..4df60dbb56f7 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -204,6 +204,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 {
 	struct inode *inode = file_inode(file);
 	struct integrity_iint_cache *iint = NULL;
+	struct ns_status *status = NULL;
 	struct ima_template_desc *template_desc = NULL;
 	char *pathbuf = NULL;
 	char filename[NAME_MAX];
@@ -216,6 +217,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	bool violation_check;
 	enum hash_algo hash_algo;
 	unsigned int allowed_algos = 0;
+	unsigned long flags;
 
 	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
 		return 0;
@@ -244,6 +246,12 @@ static int process_measurement(struct file *file, const struct cred *cred,
 		iint = integrity_inode_get(inode);
 		if (!iint)
 			rc = -ENOMEM;
+
+		if (!rc && (action & IMA_NS_STATUS_ACTIONS)) {
+			status = ima_get_ns_status(get_current_ns(), inode);
+			if (IS_ERR(status))
+				rc = PTR_ERR(status);
+		}
 	}
 
 	if (!rc && violation_check)
@@ -259,11 +267,13 @@ static int process_measurement(struct file *file, const struct cred *cred,
 
 	mutex_lock(&iint->mutex);
 
+	flags = iint_flags(iint, status);
+
 	if (test_and_clear_bit(IMA_CHANGE_ATTR, &iint->atomic_flags))
 		/* reset appraisal flags if ima_inode_post_setattr was called */
-		iint->flags &= ~(IMA_APPRAISE | IMA_APPRAISED |
-				 IMA_APPRAISE_SUBMASK | IMA_APPRAISED_SUBMASK |
-				 IMA_ACTION_FLAGS);
+		flags &= ~(IMA_APPRAISE | IMA_APPRAISED |
+			   IMA_APPRAISE_SUBMASK | IMA_APPRAISED_SUBMASK |
+			   IMA_ACTION_FLAGS);
 
 	/*
 	 * Re-evaulate the file if either the xattr has changed or the
@@ -274,7 +284,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	    ((inode->i_sb->s_iflags & SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
 	     !(inode->i_sb->s_iflags & SB_I_UNTRUSTED_MOUNTER) &&
 	     !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
-		iint->flags &= ~IMA_DONE_MASK;
+		flags &= ~IMA_DONE_MASK;
 		iint->measured_pcrs = 0;
 	}
 
@@ -282,9 +292,9 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	 * (IMA_MEASURE, IMA_MEASURED, IMA_XXXX_APPRAISE, IMA_XXXX_APPRAISED,
 	 *  IMA_AUDIT, IMA_AUDITED)
 	 */
-	iint->flags |= action;
+	flags = set_iint_flags(iint, status, flags | action);
 	action &= IMA_DO_MASK;
-	action &= ~((iint->flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
+	action &= ~((flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
 
 	/* If target pcr is already measured, unset IMA_MEASURE action */
 	if ((action & IMA_MEASURE) && (iint->measured_pcrs & (0x1 << pcr)))
@@ -359,7 +369,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 						  &pathname, filename);
 	}
 	if (action & IMA_AUDIT)
-		ima_audit_measurement(iint, pathname);
+		ima_audit_measurement(iint, pathname, status);
 
 	if ((file->f_flags & O_DIRECT) && (iint->flags & IMA_PERMIT_DIRECTIO))
 		rc = 0;
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index b10d92959033..1c23dc1b8100 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -75,6 +75,26 @@ void free_ima_ns(struct kref *kref)
 	destroy_ima_ns(ns);
 }
 
+unsigned long iint_flags(struct integrity_iint_cache *iint,
+			 struct ns_status *status)
+{
+	if (!status)
+		return iint->flags;
+
+	return (iint->flags & ~IMA_NS_STATUS_FLAGS) |
+	       (status->flags & IMA_NS_STATUS_FLAGS);
+}
+
+unsigned long set_iint_flags(struct integrity_iint_cache *iint,
+			     struct ns_status *status, unsigned long flags)
+{
+	iint->flags = flags;
+	if (status)
+		status->flags = flags & IMA_NS_STATUS_FLAGS;
+
+	return flags;
+}
+
 int __init imans_cache_init(void)
 {
 	imans_cachep = KMEM_CACHE(ima_namespace, SLAB_PANIC);
-- 
2.31.1



* [RFC 04/20] ima: Move delayed work queue and variables into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (2 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 03/20] ima: Namespace audit status flags Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 05/20] ima: Move IMA's keys queue related " Stefan Berger
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move the delayed work queue and associated variables to the
ima_namespace and initialize them.

Since keys queued up for measurement are currently relevant only in
init_ima_ns, call ima_init_key_queue() only when init_ima_ns is
initialized.
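
Because the work handler only receives the work item, it has to derive the
owning namespace from it, which this patch does with container_of(). A
minimal userspace sketch of that pattern, assuming simplified stand-in
types (all demo_* names are hypothetical and not kernel API):

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_work { int pending; };			/* stand-in for work_struct */
struct demo_delayed_work { struct demo_work work; };	/* stand-in for delayed_work */

struct demo_ima_ns {
	int id;
	struct demo_delayed_work keys_delayed_work;
	bool timer_expired;
};

/* The handler is handed the embedded work item and recovers its namespace. */
static void demo_keys_handler(struct demo_work *work)
{
	struct demo_ima_ns *ns;

	ns = demo_container_of(work, struct demo_ima_ns,
			       keys_delayed_work.work);
	ns->timer_expired = true;
}
```

Calling demo_keys_handler() with the work item embedded in one namespace
sets timer_expired only in that namespace, so each namespace tracks its own
timeout state.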

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                      | 11 ++++++++
 security/integrity/ima/ima.h             | 12 +++++----
 security/integrity/ima/ima_fs.c          |  3 ++-
 security/integrity/ima/ima_init.c        |  2 --
 security/integrity/ima/ima_init_ima_ns.c |  8 ++++++
 security/integrity/ima/ima_policy.c      |  4 +--
 security/integrity/ima/ima_queue_keys.c  | 34 ++++++++++--------------
 7 files changed, 44 insertions(+), 30 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index cc0e8c509fa2..4b5dada581e4 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -217,6 +217,17 @@ struct ima_namespace {
 	struct rb_root ns_status_tree;
 	rwlock_t ns_status_lock;
 	struct kmem_cache *ns_status_cache;
+
+#ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
+	/*
+	 * If custom IMA policy is not loaded then keys queued up
+	 * for measurement should be freed. This worker is used
+	 * for handling this scenario.
+	 */
+	struct delayed_work ima_keys_delayed_work;
+	long ima_key_queue_timeout;
+	bool timer_expired;
+#endif
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index dd06e16c4e1c..9edab9050dc7 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -77,6 +77,8 @@ struct ima_field_data {
 	u32 len;
 };
 
+struct ima_namespace;
+
 /* IMA template field definition */
 struct ima_template_field {
 	const char field_id[IMA_TEMPLATE_FIELD_ID_MAX_LEN];
@@ -247,18 +249,18 @@ struct ima_key_entry {
 	size_t payload_len;
 	char *keyring_name;
 };
-void ima_init_key_queue(void);
+void ima_init_key_queue(struct ima_namespace *ns);
 bool ima_should_queue_key(void);
 bool ima_queue_key(struct key *keyring, const void *payload,
 		   size_t payload_len);
-void ima_process_queued_keys(void);
+void ima_process_queued_keys(struct ima_namespace *ns);
+void ima_keys_handler(struct work_struct *work);
 #else
-static inline void ima_init_key_queue(void) {}
 static inline bool ima_should_queue_key(void) { return false; }
 static inline bool ima_queue_key(struct key *keyring,
 				 const void *payload,
 				 size_t payload_len) { return false; }
-static inline void ima_process_queued_keys(void) {}
+static inline void ima_process_queued_keys(struct ima_namespace *ns) {}
 #endif /* CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS */
 
 /* LIM API function definitions */
@@ -300,7 +302,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
 		     struct ima_template_desc **template_desc,
 		     const char *func_data, unsigned int *allowed_algos);
 void ima_init_policy(void);
-void ima_update_policy(void);
+void ima_update_policy(struct ima_namespace *ns);
 void ima_update_policy_flags(void);
 ssize_t ima_parse_add_rule(char *);
 void ima_delete_rules(void);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 3d8e9d5db5aa..b89cd69df0de 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -21,6 +21,7 @@
 #include <linux/rcupdate.h>
 #include <linux/parser.h>
 #include <linux/vmalloc.h>
+#include <linux/ima.h>
 
 #include "ima.h"
 
@@ -430,7 +431,7 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 		return 0;
 	}
 
-	ima_update_policy();
+	ima_update_policy(get_current_ns());
 #if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
 	securityfs_remove(ima_policy);
 	ima_policy = NULL;
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index f6ae4557a0da..24848373a061 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -155,8 +155,6 @@ int __init ima_init(void)
 	if (rc != 0)
 		return rc;
 
-	ima_init_key_queue();
-
 	ima_measure_critical_data("kernel_info", "kernel_version",
 				  UTS_RELEASE, strlen(UTS_RELEASE), false,
 				  NULL, 0);
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 1a44963e8ba9..3bc2d3611739 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -26,6 +26,14 @@ int ima_init_namespace(struct ima_namespace *ns)
 	if (!ns->ns_status_cache)
 		return -ENOMEM;
 
+#ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
+	INIT_DELAYED_WORK(&ns->ima_keys_delayed_work, ima_keys_handler);
+	ns->ima_key_queue_timeout = 300000;
+	ns->timer_expired = false;
+	if (ns == &init_ima_ns)
+		ima_init_key_queue(ns);
+#endif
+
 	return 0;
 }
 
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 320ca80aacab..e5aef287d14d 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -986,7 +986,7 @@ int ima_check_policy(void)
  * Policy rules are never deleted so ima_policy_flag gets zeroed only once when
  * we switch from the default policy to user defined.
  */
-void ima_update_policy(void)
+void ima_update_policy(struct ima_namespace *ns)
 {
 	struct list_head *policy = &ima_policy_rules;
 
@@ -1007,7 +1007,7 @@ void ima_update_policy(void)
 	ima_update_policy_flags();
 
 	/* Custom IMA policy has been loaded */
-	ima_process_queued_keys();
+	ima_process_queued_keys(ns);
 }
 
 /* Keep the enumeration in sync with the policy_tokens! */
diff --git a/security/integrity/ima/ima_queue_keys.c b/security/integrity/ima/ima_queue_keys.c
index 93056c03bf5a..fcaa1645dba3 100644
--- a/security/integrity/ima/ima_queue_keys.c
+++ b/security/integrity/ima/ima_queue_keys.c
@@ -10,6 +10,7 @@
 
 #include <linux/user_namespace.h>
 #include <linux/workqueue.h>
+#include <linux/ima.h>
 #include <keys/asymmetric-type.h>
 #include "ima.h"
 
@@ -25,34 +26,27 @@ static bool ima_process_keys;
 static DEFINE_MUTEX(ima_keys_lock);
 static LIST_HEAD(ima_keys);
 
-/*
- * If custom IMA policy is not loaded then keys queued up
- * for measurement should be freed. This worker is used
- * for handling this scenario.
- */
-static long ima_key_queue_timeout = 300000; /* 5 Minutes */
-static void ima_keys_handler(struct work_struct *work);
-static DECLARE_DELAYED_WORK(ima_keys_delayed_work, ima_keys_handler);
-static bool timer_expired;
-
 /*
  * This worker function frees keys that may still be
  * queued up in case custom IMA policy was not loaded.
  */
-static void ima_keys_handler(struct work_struct *work)
+void ima_keys_handler(struct work_struct *work)
 {
-	timer_expired = true;
-	ima_process_queued_keys();
+	struct ima_namespace *ns;
+
+	ns = container_of(work, struct ima_namespace, ima_keys_delayed_work.work);
+	ns->timer_expired = true;
+	ima_process_queued_keys(ns);
 }
 
 /*
  * This function sets up a worker to free queued keys in case
  * custom IMA policy was never loaded.
  */
-void ima_init_key_queue(void)
+void ima_init_key_queue(struct ima_namespace *ns)
 {
-	schedule_delayed_work(&ima_keys_delayed_work,
-			      msecs_to_jiffies(ima_key_queue_timeout));
+	schedule_delayed_work(&ns->ima_keys_delayed_work,
+			      msecs_to_jiffies(ns->ima_key_queue_timeout));
 }
 
 static void ima_free_key_entry(struct ima_key_entry *entry)
@@ -130,7 +124,7 @@ bool ima_queue_key(struct key *keyring, const void *payload,
  * This function sets ima_process_keys to true and processes queued keys.
  * From here on keys will be processed right away (not queued).
  */
-void ima_process_queued_keys(void)
+void ima_process_queued_keys(struct ima_namespace *ns)
 {
 	struct ima_key_entry *entry, *tmp;
 	bool process = false;
@@ -154,11 +148,11 @@ void ima_process_queued_keys(void)
 	if (!process)
 		return;
 
-	if (!timer_expired)
-		cancel_delayed_work_sync(&ima_keys_delayed_work);
+	if (!ns->timer_expired)
+		cancel_delayed_work_sync(&ns->ima_keys_delayed_work);
 
 	list_for_each_entry_safe(entry, tmp, &ima_keys, list) {
-		if (!timer_expired)
+		if (!ns->timer_expired)
 			process_buffer_measurement(&init_user_ns, NULL,
 						   entry->payload,
 						   entry->payload_len,
-- 
2.31.1



* [RFC 05/20] ima: Move IMA's keys queue related variables into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (3 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 04/20] ima: Move delayed work queue and variables into ima_namespace Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 06/20] ima: Move policy " Stefan Berger
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move the keys queue variables (ima_process_keys, ima_keys_lock and
ima_keys) into ima_namespace.

Some variables have to be initialized before ima_init() runs, so statically
initialize them for the init_ima_ns.

Since only init_ima_ns uses the queued keys there's no need to free the
list of queued keys when tearing down IMA namespaces.
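
The effect of keeping the queue state per namespace can be sketched in
userspace as below. This is a simplified model under stated assumptions: a
counter stands in for the ima_keys list, the ima_keys_lock mutex is left
out, and all demo_* names are hypothetical.

```c
#include <stdbool.h>

struct demo_keys_ns {
	bool process_keys;	/* false: queue keys; true: handle them right away */
	int nr_queued;		/* stand-in for the ima_keys list */
};

/* Queue a key only while this namespace is still deferring (locking omitted). */
static bool demo_queue_key(struct demo_keys_ns *ns)
{
	if (ns->process_keys)
		return false;

	ns->nr_queued++;
	return true;
}

/* The first call flips the flag and drains the queue; later calls do nothing. */
static int demo_process_queued_keys(struct demo_keys_ns *ns)
{
	int drained;

	if (ns->process_keys)
		return 0;

	ns->process_keys = true;
	drained = ns->nr_queued;
	ns->nr_queued = 0;

	return drained;
}
```

Draining one namespace's queue leaves the flag and the queued entries of
every other namespace untouched.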

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                          | 11 ++++++
 security/integrity/ima/ima.h                 |  9 ++---
 security/integrity/ima/ima_asymmetric_keys.c |  5 +--
 security/integrity/ima/ima_init_ima_ns.c     |  5 +++
 security/integrity/ima/ima_ns.c              |  6 ++++
 security/integrity/ima/ima_queue_keys.c      | 37 +++++++-------------
 6 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 4b5dada581e4..977df9155cde 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -219,6 +219,17 @@ struct ima_namespace {
 	struct kmem_cache *ns_status_cache;
 
 #ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
+	/*
+	 * Flag to indicate whether a key can be processed
+	 * right away or should be queued for processing later.
+	 */
+	bool ima_process_keys;
+
+	/*
+	 * To synchronize access to the list of keys that need to be measured
+	 */
+	struct mutex ima_keys_lock;
+	struct list_head ima_keys;
 	/*
 	 * If custom IMA policy is not loaded then keys queued up
 	 * for measurement should be freed. This worker is used
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 9edab9050dc7..97eb03376855 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -250,14 +250,15 @@ struct ima_key_entry {
 	char *keyring_name;
 };
 void ima_init_key_queue(struct ima_namespace *ns);
-bool ima_should_queue_key(void);
-bool ima_queue_key(struct key *keyring, const void *payload,
+bool ima_should_queue_key(struct ima_namespace *ns);
+bool ima_queue_key(struct ima_namespace *ns, struct key *keyring, const void *payload,
 		   size_t payload_len);
 void ima_process_queued_keys(struct ima_namespace *ns);
 void ima_keys_handler(struct work_struct *work);
 #else
-static inline bool ima_should_queue_key(void) { return false; }
-static inline bool ima_queue_key(struct key *keyring,
+static inline bool ima_should_queue_key(struct ima_namespace *ns) { return false; }
+static inline bool ima_queue_key(struct ima_namespace *ns,
+				 struct key *keyring,
 				 const void *payload,
 				 size_t payload_len) { return false; }
 static inline void ima_process_queued_keys(struct ima_namespace *ns) {}
diff --git a/security/integrity/ima/ima_asymmetric_keys.c b/security/integrity/ima/ima_asymmetric_keys.c
index f6aa0b47a772..b20e82eda8f4 100644
--- a/security/integrity/ima/ima_asymmetric_keys.c
+++ b/security/integrity/ima/ima_asymmetric_keys.c
@@ -30,6 +30,7 @@ void ima_post_key_create_or_update(struct key *keyring, struct key *key,
 				   const void *payload, size_t payload_len,
 				   unsigned long flags, bool create)
 {
+	struct ima_namespace *ns = get_current_ns();
 	bool queued = false;
 
 	/* Only asymmetric keys are handled by this hook. */
@@ -39,8 +40,8 @@ void ima_post_key_create_or_update(struct key *keyring, struct key *key,
 	if (!payload || (payload_len == 0))
 		return;
 
-	if (ima_should_queue_key())
-		queued = ima_queue_key(keyring, payload, payload_len);
+	if (ima_should_queue_key(ns))
+		queued = ima_queue_key(ns, keyring, payload, payload_len);
 
 	if (queued)
 		return;
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 3bc2d3611739..7b66fe598789 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -45,5 +45,10 @@ int __init ima_ns_init(void)
 struct ima_namespace init_ima_ns = {
 	.kref = KREF_INIT(1),
 	.user_ns = &init_user_ns,
+#ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
+	.ima_process_keys = false,
+	.ima_keys_lock = __MUTEX_INITIALIZER(init_ima_ns.ima_keys_lock),
+	.ima_keys = LIST_HEAD_INIT(init_ima_ns.ima_keys),
+#endif
 };
 EXPORT_SYMBOL(init_ima_ns);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index 1c23dc1b8100..4de4a93bc009 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -38,6 +38,12 @@ static struct ima_namespace *create_ima_ns(struct user_namespace *user_ns)
 	if (err)
 		goto fail_free;
 
+#ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
+	ns->ima_process_keys = false;
+	mutex_init(&ns->ima_keys_lock);
+	INIT_LIST_HEAD(&ns->ima_keys);
+#endif
+
 	return ns;
 
 fail_free:
diff --git a/security/integrity/ima/ima_queue_keys.c b/security/integrity/ima/ima_queue_keys.c
index fcaa1645dba3..9e5e9a178253 100644
--- a/security/integrity/ima/ima_queue_keys.c
+++ b/security/integrity/ima/ima_queue_keys.c
@@ -14,17 +14,6 @@
 #include <keys/asymmetric-type.h>
 #include "ima.h"
 
-/*
- * Flag to indicate whether a key can be processed
- * right away or should be queued for processing later.
- */
-static bool ima_process_keys;
-
-/*
- * To synchronize access to the list of keys that need to be measured
- */
-static DEFINE_MUTEX(ima_keys_lock);
-static LIST_HEAD(ima_keys);
 
 /*
  * This worker function frees keys that may still be
@@ -95,7 +84,7 @@ static struct ima_key_entry *ima_alloc_key_entry(struct key *keyring,
 	return entry;
 }
 
-bool ima_queue_key(struct key *keyring, const void *payload,
+bool ima_queue_key(struct ima_namespace *ns, struct key *keyring, const void *payload,
 		   size_t payload_len)
 {
 	bool queued = false;
@@ -105,12 +94,12 @@ bool ima_queue_key(struct key *keyring, const void *payload,
 	if (!entry)
 		return false;
 
-	mutex_lock(&ima_keys_lock);
-	if (!ima_process_keys) {
-		list_add_tail(&entry->list, &ima_keys);
+	mutex_lock(&ns->ima_keys_lock);
+	if (!ns->ima_process_keys) {
+		list_add_tail(&entry->list, &ns->ima_keys);
 		queued = true;
 	}
-	mutex_unlock(&ima_keys_lock);
+	mutex_unlock(&ns->ima_keys_lock);
 
 	if (!queued)
 		ima_free_key_entry(entry);
@@ -129,7 +118,7 @@ void ima_process_queued_keys(struct ima_namespace *ns)
 	struct ima_key_entry *entry, *tmp;
 	bool process = false;
 
-	if (ima_process_keys)
+	if (ns->ima_process_keys)
 		return;
 
 	/*
@@ -138,12 +127,12 @@ void ima_process_queued_keys(struct ima_namespace *ns)
 	 * First one setting the ima_process_keys flag to true will
 	 * process the queued keys.
 	 */
-	mutex_lock(&ima_keys_lock);
-	if (!ima_process_keys) {
-		ima_process_keys = true;
+	mutex_lock(&ns->ima_keys_lock);
+	if (!ns->ima_process_keys) {
+		ns->ima_process_keys = true;
 		process = true;
 	}
-	mutex_unlock(&ima_keys_lock);
+	mutex_unlock(&ns->ima_keys_lock);
 
 	if (!process)
 		return;
@@ -151,7 +140,7 @@ void ima_process_queued_keys(struct ima_namespace *ns)
 	if (!ns->timer_expired)
 		cancel_delayed_work_sync(&ns->ima_keys_delayed_work);
 
-	list_for_each_entry_safe(entry, tmp, &ima_keys, list) {
+	list_for_each_entry_safe(entry, tmp, &ns->ima_keys, list) {
 		if (!ns->timer_expired)
 			process_buffer_measurement(&init_user_ns, NULL,
 						   entry->payload,
@@ -165,7 +154,7 @@ void ima_process_queued_keys(struct ima_namespace *ns)
 	}
 }
 
-inline bool ima_should_queue_key(void)
+inline bool ima_should_queue_key(struct ima_namespace *ns)
 {
-	return !ima_process_keys;
+	return !ns->ima_process_keys;
 }
-- 
2.31.1



* [RFC 06/20] ima: Move policy related variables into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (4 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 05/20] ima: Move IMA's keys queue related " Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 07/20] ima: Move ima_htable " Stefan Berger
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move variables related to the IMA policy into the ima_namespace. This way
the IMA policy of an IMA namespace can be set and displayed using a
front-end like SecurityFS.

Implement ima_free_policy_rules() that frees the policy rules on
ima_namespace deletion.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                          |  13 +-
 security/integrity/ima/ima.h                 |  35 ++---
 security/integrity/ima/ima_api.c             |   8 +-
 security/integrity/ima/ima_appraise.c        |  26 ++--
 security/integrity/ima/ima_asymmetric_keys.c |   3 +-
 security/integrity/ima/ima_fs.c              |  11 +-
 security/integrity/ima/ima_init.c            |   2 +-
 security/integrity/ima/ima_init_ima_ns.c     |   6 +
 security/integrity/ima/ima_main.c            |  68 +++++-----
 security/integrity/ima/ima_ns.c              |   1 +
 security/integrity/ima/ima_policy.c          | 128 ++++++++++---------
 security/integrity/ima/ima_queue_keys.c      |   2 +-
 12 files changed, 178 insertions(+), 125 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 977df9155cde..e782b00710ad 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -82,7 +82,8 @@ static inline int ima_file_check(struct file *file, int mask)
 	return 0;
 }
 
-static inline void ima_post_create_tmpfile(struct user_namespace *mnt_userns,
+static inline void ima_post_create_tmpfile(struct ima_namespace *ns,
+					   struct user_namespace *mnt_userns,
 					   struct inode *inode)
 {
 }
@@ -239,9 +240,19 @@ struct ima_namespace {
 	long ima_key_queue_timeout;
 	bool timer_expired;
 #endif
+
+	struct list_head ima_default_rules;
+	/* ns's policy rules */
+	struct list_head ima_policy_rules;
+	struct list_head ima_temp_rules;
+	/* Pointer to ns's current policy */
+	struct list_head __rcu *ima_rules;
+	/* current content of the policy */
+	int ima_policy_flag;
 };
 
 extern struct ima_namespace init_ima_ns;
+extern struct list_head ima_default_rules;
 
 #ifdef CONFIG_IMA_NS
 
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 97eb03376855..e295141f2478 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -43,9 +43,6 @@ enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8, TPM_PCR10 = 10 };
 
 #define NR_BANKS(chip) ((chip != NULL) ? chip->nr_allocated_banks : 0)
 
-/* current content of the policy */
-extern int ima_policy_flag;
-
 /* bitset of digests algorithms allowed in the setxattr hook */
 extern atomic_t ima_setxattr_allowed_hash_algorithms;
 
@@ -265,7 +262,8 @@ static inline void ima_process_queued_keys(struct ima_namespace *ns) {}
 #endif /* CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS */
 
 /* LIM API function definitions */
-int ima_get_action(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_get_action(struct ima_namespace *ns,
+		   struct user_namespace *mnt_userns, struct inode *inode,
 		   const struct cred *cred, u32 secid, int mask,
 		   enum ima_hooks func, int *pcr,
 		   struct ima_template_desc **template_desc,
@@ -279,7 +277,8 @@ void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
 			   struct evm_ima_xattr_data *xattr_value,
 			   int xattr_len, const struct modsig *modsig, int pcr,
 			   struct ima_template_desc *template_desc);
-int process_buffer_measurement(struct user_namespace *mnt_userns,
+int process_buffer_measurement(struct ima_namespace *ns,
+			       struct user_namespace *mnt_userns,
 			       struct inode *inode, const void *buf, int size,
 			       const char *eventname, enum ima_hooks func,
 			       int pcr, const char *func_data,
@@ -297,17 +296,19 @@ void ima_free_template_entry(struct ima_template_entry *entry);
 const char *ima_d_path(const struct path *path, char **pathbuf, char *filename);
 
 /* IMA policy related functions */
-int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_match_policy(struct ima_namespace *ns,
+		     struct user_namespace *mnt_userns, struct inode *inode,
 		     const struct cred *cred, u32 secid, enum ima_hooks func,
 		     int mask, int flags, int *pcr,
 		     struct ima_template_desc **template_desc,
 		     const char *func_data, unsigned int *allowed_algos);
-void ima_init_policy(void);
+void ima_init_policy(struct ima_namespace *ns);
 void ima_update_policy(struct ima_namespace *ns);
-void ima_update_policy_flags(void);
-ssize_t ima_parse_add_rule(char *);
-void ima_delete_rules(void);
-int ima_check_policy(void);
+void ima_update_policy_flags(struct ima_namespace *ns);
+ssize_t ima_parse_add_rule(struct ima_namespace *ns, char *rule);
+void ima_delete_rules(struct ima_namespace *ns);
+int ima_check_policy(struct ima_namespace *ns);
+void ima_free_policy_rules(struct ima_namespace *ns);
 void *ima_policy_start(struct seq_file *m, loff_t *pos);
 void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos);
 void ima_policy_stop(struct seq_file *m, void *v);
@@ -323,14 +324,16 @@ int ima_policy_show(struct seq_file *m, void *v);
 #define IMA_APPRAISE_KEXEC	0x40
 
 #ifdef CONFIG_IMA_APPRAISE
-int ima_check_blacklist(struct integrity_iint_cache *iint,
+int ima_check_blacklist(struct ima_namespace *ns,
+			struct integrity_iint_cache *iint,
 			const struct modsig *modsig, int pcr);
 int ima_appraise_measurement(enum ima_hooks func,
 			     struct integrity_iint_cache *iint,
 			     struct file *file, const unsigned char *filename,
 			     struct evm_ima_xattr_data *xattr_value,
 			     int xattr_len, const struct modsig *modsig);
-int ima_must_appraise(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_must_appraise(struct ima_namespace *ns,
+		      struct user_namespace *mnt_userns, struct inode *inode,
 		      int mask, enum ima_hooks func);
 void ima_update_xattr(struct integrity_iint_cache *iint, struct file *file);
 enum integrity_status ima_get_cache_status(struct integrity_iint_cache *iint,
@@ -341,7 +344,8 @@ int ima_read_xattr(struct dentry *dentry,
 		   struct evm_ima_xattr_data **xattr_value);
 
 #else
-static inline int ima_check_blacklist(struct integrity_iint_cache *iint,
+static inline int ima_check_blacklist(struct ima_namespace *ns,
+				      struct integrity_iint_cache *iint,
 				      const struct modsig *modsig, int pcr)
 {
 	return 0;
@@ -358,7 +362,8 @@ static inline int ima_appraise_measurement(enum ima_hooks func,
 	return INTEGRITY_UNKNOWN;
 }
 
-static inline int ima_must_appraise(struct user_namespace *mnt_userns,
+static inline int ima_must_appraise(struct ima_namespace *ns,
+				    struct user_namespace *mnt_userns,
 				    struct inode *inode, int mask,
 				    enum ima_hooks func)
 {
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 8f7bab17b784..808aec56dbb6 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -14,6 +14,7 @@
 #include <linux/xattr.h>
 #include <linux/evm.h>
 #include <linux/iversion.h>
+#include <linux/ima.h>
 
 #include "ima.h"
 
@@ -185,7 +186,8 @@ void ima_add_violation(struct file *file, const unsigned char *filename,
  * Returns IMA_MEASURE, IMA_APPRAISE mask.
  *
  */
-int ima_get_action(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_get_action(struct ima_namespace *ns,
+		   struct user_namespace *mnt_userns, struct inode *inode,
 		   const struct cred *cred, u32 secid, int mask,
 		   enum ima_hooks func, int *pcr,
 		   struct ima_template_desc **template_desc,
@@ -193,9 +195,9 @@ int ima_get_action(struct user_namespace *mnt_userns, struct inode *inode,
 {
 	int flags = IMA_MEASURE | IMA_AUDIT | IMA_APPRAISE | IMA_HASH;
 
-	flags &= ima_policy_flag;
+	flags &= ns->ima_policy_flag;
 
-	return ima_match_policy(mnt_userns, inode, cred, secid, func, mask,
+	return ima_match_policy(ns, mnt_userns, inode, cred, secid, func, mask,
 				flags, pcr, template_desc, func_data,
 				allowed_algos);
 }
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index dbba51583e7c..b0c1992d8c4b 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -68,7 +68,8 @@ bool is_ima_appraise_enabled(void)
  *
  * Return 1 to appraise or hash
  */
-int ima_must_appraise(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_must_appraise(struct ima_namespace *ns,
+		      struct user_namespace *mnt_userns, struct inode *inode,
 		      int mask, enum ima_hooks func)
 {
 	u32 secid;
@@ -77,7 +78,7 @@ int ima_must_appraise(struct user_namespace *mnt_userns, struct inode *inode,
 		return 0;
 
 	security_task_getsecid_subj(current, &secid);
-	return ima_match_policy(mnt_userns, inode, current_cred(), secid,
+	return ima_match_policy(ns, mnt_userns, inode, current_cred(), secid,
 				func, mask, IMA_APPRAISE | IMA_HASH, NULL,
 				NULL, NULL, NULL);
 }
@@ -341,7 +342,8 @@ static int modsig_verify(enum ima_hooks func, const struct modsig *modsig,
  *
  * Returns -EPERM if the hash is blacklisted.
  */
-int ima_check_blacklist(struct integrity_iint_cache *iint,
+int ima_check_blacklist(struct ima_namespace *ns,
+			struct integrity_iint_cache *iint,
 			const struct modsig *modsig, int pcr)
 {
 	enum hash_algo hash_algo;
@@ -357,7 +359,7 @@ int ima_check_blacklist(struct integrity_iint_cache *iint,
 
 		rc = is_binary_blacklisted(digest, digestsize);
 		if ((rc == -EPERM) && (iint->flags & IMA_MEASURE))
-			process_buffer_measurement(&init_user_ns, NULL, digest, digestsize,
+			process_buffer_measurement(ns, &init_user_ns, NULL, digest, digestsize,
 						   "blacklisted-hash", NONE,
 						   pcr, NULL, false, NULL, 0);
 	}
@@ -527,14 +529,15 @@ void ima_inode_post_setattr(struct user_namespace *mnt_userns,
 			    struct dentry *dentry)
 {
 	struct inode *inode = d_backing_inode(dentry);
+	struct ima_namespace *ns = get_current_ns();
 	struct integrity_iint_cache *iint;
 	int action;
 
-	if (!(ima_policy_flag & IMA_APPRAISE) || !S_ISREG(inode->i_mode)
+	if (!(ns->ima_policy_flag & IMA_APPRAISE) || !S_ISREG(inode->i_mode)
 	    || !(inode->i_opflags & IOP_XATTR))
 		return;
 
-	action = ima_must_appraise(mnt_userns, inode, MAY_ACCESS, POST_SETATTR);
+	action = ima_must_appraise(ns, mnt_userns, inode, MAY_ACCESS, POST_SETATTR);
 	iint = integrity_iint_find(inode);
 	if (iint) {
 		set_bit(IMA_CHANGE_ATTR, &iint->atomic_flags);
@@ -559,11 +562,12 @@ static int ima_protect_xattr(struct dentry *dentry, const char *xattr_name,
 	return 0;
 }
 
-static void ima_reset_appraise_flags(struct inode *inode, int digsig)
+static void ima_reset_appraise_flags(struct ima_namespace *ns,
+				     struct inode *inode, int digsig)
 {
 	struct integrity_iint_cache *iint;
 
-	if (!(ima_policy_flag & IMA_APPRAISE) || !S_ISREG(inode->i_mode))
+	if (!(ns->ima_policy_flag & IMA_APPRAISE) || !S_ISREG(inode->i_mode))
 		return;
 
 	iint = integrity_iint_find(inode);
@@ -641,6 +645,7 @@ int ima_inode_setxattr(struct dentry *dentry, const char *xattr_name,
 		       const void *xattr_value, size_t xattr_value_len)
 {
 	const struct evm_ima_xattr_data *xvalue = xattr_value;
+	struct ima_namespace *ns = get_current_ns();
 	int digsig = 0;
 	int result;
 
@@ -658,18 +663,19 @@ int ima_inode_setxattr(struct dentry *dentry, const char *xattr_name,
 		if (result)
 			return result;
 
-		ima_reset_appraise_flags(d_backing_inode(dentry), digsig);
+		ima_reset_appraise_flags(ns, d_backing_inode(dentry), digsig);
 	}
 	return result;
 }
 
 int ima_inode_removexattr(struct dentry *dentry, const char *xattr_name)
 {
+	struct ima_namespace *ns = get_current_ns();
 	int result;
 
 	result = ima_protect_xattr(dentry, xattr_name, NULL, 0);
 	if (result == 1 || evm_revalidate_status(xattr_name)) {
-		ima_reset_appraise_flags(d_backing_inode(dentry), 0);
+		ima_reset_appraise_flags(ns, d_backing_inode(dentry), 0);
 		if (result == 1)
 			result = 0;
 	}
diff --git a/security/integrity/ima/ima_asymmetric_keys.c b/security/integrity/ima/ima_asymmetric_keys.c
index b20e82eda8f4..b5fe4ed62fec 100644
--- a/security/integrity/ima/ima_asymmetric_keys.c
+++ b/security/integrity/ima/ima_asymmetric_keys.c
@@ -61,7 +61,8 @@ void ima_post_key_create_or_update(struct key *keyring, struct key *key,
 	 * if the IMA policy is configured to measure a key linked
 	 * to the given keyring.
 	 */
-	process_buffer_measurement(&init_user_ns, NULL, payload, payload_len,
+	process_buffer_measurement(get_current_ns(), &init_user_ns, NULL,
+				   payload, payload_len,
 				   keyring->description, KEY_CHECK, 0,
 				   keyring->description, false, NULL, 0);
 }
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index b89cd69df0de..fc0413c8c358 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -297,7 +297,7 @@ static ssize_t ima_read_policy(char *path)
 	datap = data;
 	while (size > 0 && (p = strsep(&datap, "\n"))) {
 		pr_debug("rule: %s\n", p);
-		rc = ima_parse_add_rule(p);
+		rc = ima_parse_add_rule(get_current_ns(), p);
 		if (rc < 0)
 			break;
 		size -= rc;
@@ -345,7 +345,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 				    1, 0);
 		result = -EACCES;
 	} else {
-		result = ima_parse_add_rule(data);
+		result = ima_parse_add_rule(get_current_ns(), data);
 	}
 	mutex_unlock(&ima_write_mutex);
 out_free:
@@ -411,11 +411,12 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 static int ima_release_policy(struct inode *inode, struct file *file)
 {
 	const char *cause = valid_policy ? "completed" : "failed";
+	struct ima_namespace *ns = get_current_ns();
 
 	if ((file->f_flags & O_ACCMODE) == O_RDONLY)
 		return seq_release(inode, file);
 
-	if (valid_policy && ima_check_policy() < 0) {
+	if (valid_policy && ima_check_policy(ns) < 0) {
 		cause = "failed";
 		valid_policy = 0;
 	}
@@ -425,13 +426,13 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 			    "policy_update", cause, !valid_policy, 0);
 
 	if (!valid_policy) {
-		ima_delete_rules();
+		ima_delete_rules(ns);
 		valid_policy = 1;
 		clear_bit(IMA_FS_BUSY, &ima_fs_flags);
 		return 0;
 	}
 
-	ima_update_policy(get_current_ns());
+	ima_update_policy(ns);
 #if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
 	securityfs_remove(ima_policy);
 	ima_policy = NULL;
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index 24848373a061..2ec9a22bbddf 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -149,7 +149,7 @@ int __init ima_init(void)
 	if (rc != 0)
 		return rc;
 
-	ima_init_policy();
+	ima_init_policy(&init_ima_ns);
 
 	rc = ima_fs_init();
 	if (rc != 0)
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 7b66fe598789..2d644791a795 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -34,6 +34,12 @@ int ima_init_namespace(struct ima_namespace *ns)
 		ima_init_key_queue(ns);
 #endif
 
+	INIT_LIST_HEAD(&ns->ima_default_rules);
+	INIT_LIST_HEAD(&ns->ima_policy_rules);
+	INIT_LIST_HEAD(&ns->ima_temp_rules);
+	ns->ima_rules = (struct list_head __rcu *)(&ns->ima_default_rules);
+	ns->ima_policy_flag = 0;
+
 	return 0;
 }
 
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 4df60dbb56f7..9cf1fd7c70bf 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -188,7 +188,7 @@ void ima_file_free(struct file *file)
 	struct inode *inode = file_inode(file);
 	struct integrity_iint_cache *iint;
 
-	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
+	if (!get_current_ns()->ima_policy_flag || !S_ISREG(inode->i_mode))
 		return;
 
 	iint = integrity_iint_find(inode);
@@ -198,7 +198,8 @@ void ima_file_free(struct file *file)
 	ima_check_last_writer(iint, inode, file);
 }
 
-static int process_measurement(struct file *file, const struct cred *cred,
+static int process_measurement(struct ima_namespace *ns,
+			       struct file *file, const struct cred *cred,
 			       u32 secid, char *buf, loff_t size, int mask,
 			       enum ima_hooks func)
 {
@@ -219,18 +220,18 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	unsigned int allowed_algos = 0;
 	unsigned long flags;
 
-	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
+	if (!ns->ima_policy_flag || !S_ISREG(inode->i_mode))
 		return 0;
 
 	/* Return an IMA_MEASURE, IMA_APPRAISE, IMA_AUDIT action
 	 * bitmask based on the appraise/audit/measurement policy.
 	 * Included is the appraise submask.
 	 */
-	action = ima_get_action(file_mnt_user_ns(file), inode, cred, secid,
+	action = ima_get_action(ns, file_mnt_user_ns(file), inode, cred, secid,
 				mask, func, &pcr, &template_desc, NULL,
 				&allowed_algos);
 	violation_check = ((func == FILE_CHECK || func == MMAP_CHECK) &&
-			   (ima_policy_flag & IMA_MEASURE));
+			   (ns->ima_policy_flag & IMA_MEASURE));
 	if (!action && !violation_check)
 		return 0;
 
@@ -248,7 +249,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 			rc = -ENOMEM;
 
 		if (!rc && (action & IMA_NS_STATUS_ACTIONS)) {
-			status = ima_get_ns_status(get_current_ns(), inode);
+			status = ima_get_ns_status(ns, inode);
 			if (IS_ERR(status))
 				rc = PTR_ERR(status);
 		}
@@ -356,7 +357,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 				      xattr_value, xattr_len, modsig, pcr,
 				      template_desc);
 	if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
-		rc = ima_check_blacklist(iint, modsig, pcr);
+		rc = ima_check_blacklist(ns, iint, modsig, pcr);
 		if (rc != -EPERM) {
 			inode_lock(inode);
 			rc = ima_appraise_measurement(func, iint, file,
@@ -419,7 +420,8 @@ int ima_file_mmap(struct file *file, unsigned long prot)
 
 	if (file && (prot & PROT_EXEC)) {
 		security_task_getsecid_subj(current, &secid);
-		return process_measurement(file, current_cred(), secid, NULL,
+		return process_measurement(get_current_ns(),
+					   file, current_cred(), secid, NULL,
 					   0, MAY_EXEC, MMAP_CHECK);
 	}
 
@@ -440,6 +442,7 @@ int ima_file_mmap(struct file *file, unsigned long prot)
  */
 int ima_file_mprotect(struct vm_area_struct *vma, unsigned long prot)
 {
+	struct ima_namespace *ns = get_current_ns();
 	struct ima_template_desc *template = NULL;
 	struct file *file = vma->vm_file;
 	char filename[NAME_MAX];
@@ -452,13 +455,13 @@ int ima_file_mprotect(struct vm_area_struct *vma, unsigned long prot)
 	int pcr;
 
 	/* Is mprotect making an mmap'ed file executable? */
-	if (!(ima_policy_flag & IMA_APPRAISE) || !vma->vm_file ||
+	if (!(ns->ima_policy_flag & IMA_APPRAISE) || !vma->vm_file ||
 	    !(prot & PROT_EXEC) || (vma->vm_flags & VM_EXEC))
 		return 0;
 
 	security_task_getsecid_subj(current, &secid);
 	inode = file_inode(vma->vm_file);
-	action = ima_get_action(file_mnt_user_ns(vma->vm_file), inode,
+	action = ima_get_action(ns, file_mnt_user_ns(vma->vm_file), inode,
 				current_cred(), secid, MAY_EXEC, MMAP_CHECK,
 				&pcr, &template, NULL, NULL);
 
@@ -498,13 +501,13 @@ int ima_bprm_check(struct linux_binprm *bprm)
 	u32 secid;
 
 	security_task_getsecid_subj(current, &secid);
-	ret = process_measurement(bprm->file, current_cred(), secid, NULL, 0,
+	ret = process_measurement(get_current_ns(), bprm->file, current_cred(), secid, NULL, 0,
 				  MAY_EXEC, BPRM_CHECK);
 	if (ret)
 		return ret;
 
 	security_cred_getsecid(bprm->cred, &secid);
-	return process_measurement(bprm->file, bprm->cred, secid, NULL, 0,
+	return process_measurement(get_current_ns(), bprm->file, bprm->cred, secid, NULL, 0,
 				   MAY_EXEC, CREDS_CHECK);
 }
 
@@ -523,18 +526,19 @@ int ima_file_check(struct file *file, int mask)
 	u32 secid;
 
 	security_task_getsecid_subj(current, &secid);
-	return process_measurement(file, current_cred(), secid, NULL, 0,
+	return process_measurement(get_current_ns(), file, current_cred(), secid, NULL, 0,
 				   mask & (MAY_READ | MAY_WRITE | MAY_EXEC |
 					   MAY_APPEND), FILE_CHECK);
 }
 EXPORT_SYMBOL_GPL(ima_file_check);
 
-static int __ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
+static int __ima_inode_hash(struct ima_namespace *ns,
+			    struct inode *inode, char *buf, size_t buf_size)
 {
 	struct integrity_iint_cache *iint;
 	int hash_algo;
 
-	if (!ima_policy_flag)
+	if (!ns->ima_policy_flag)
 		return -EOPNOTSUPP;
 
 	iint = integrity_iint_find(inode);
@@ -587,7 +591,7 @@ int ima_file_hash(struct file *file, char *buf, size_t buf_size)
 	if (!file)
 		return -EINVAL;
 
-	return __ima_inode_hash(file_inode(file), buf, buf_size);
+	return __ima_inode_hash(get_current_ns(), file_inode(file), buf, buf_size);
 }
 EXPORT_SYMBOL_GPL(ima_file_hash);
 
@@ -614,7 +618,7 @@ int ima_inode_hash(struct inode *inode, char *buf, size_t buf_size)
 	if (!inode)
 		return -EINVAL;
 
-	return __ima_inode_hash(inode, buf, buf_size);
+	return __ima_inode_hash(get_current_ns(), inode, buf, buf_size);
 }
 EXPORT_SYMBOL_GPL(ima_inode_hash);
 
@@ -630,13 +634,14 @@ EXPORT_SYMBOL_GPL(ima_inode_hash);
 void ima_post_create_tmpfile(struct user_namespace *mnt_userns,
 			     struct inode *inode)
 {
+	struct ima_namespace *ns = get_current_ns();
 	struct integrity_iint_cache *iint;
 	int must_appraise;
 
-	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
+	if (!ns->ima_policy_flag || !S_ISREG(inode->i_mode))
 		return;
 
-	must_appraise = ima_must_appraise(mnt_userns, inode, MAY_ACCESS,
+	must_appraise = ima_must_appraise(ns, mnt_userns, inode, MAY_ACCESS,
 					  FILE_CHECK);
 	if (!must_appraise)
 		return;
@@ -662,14 +667,15 @@ void ima_post_create_tmpfile(struct user_namespace *mnt_userns,
 void ima_post_path_mknod(struct user_namespace *mnt_userns,
 			 struct dentry *dentry)
 {
+	struct ima_namespace *ns = get_current_ns();
 	struct integrity_iint_cache *iint;
 	struct inode *inode = dentry->d_inode;
 	int must_appraise;
 
-	if (!ima_policy_flag || !S_ISREG(inode->i_mode))
+	if (!ns->ima_policy_flag || !S_ISREG(inode->i_mode))
 		return;
 
-	must_appraise = ima_must_appraise(mnt_userns, inode, MAY_ACCESS,
+	must_appraise = ima_must_appraise(ns, mnt_userns, inode, MAY_ACCESS,
 					  FILE_CHECK);
 	if (!must_appraise)
 		return;
@@ -720,7 +726,7 @@ int ima_read_file(struct file *file, enum kernel_read_file_id read_id,
 	/* Read entire file for all partial reads. */
 	func = read_idmap[read_id] ?: FILE_CHECK;
 	security_task_getsecid_subj(current, &secid);
-	return process_measurement(file, current_cred(), secid, NULL,
+	return process_measurement(get_current_ns(), file, current_cred(), secid, NULL,
 				   0, MAY_READ, func);
 }
 
@@ -763,7 +769,7 @@ int ima_post_read_file(struct file *file, void *buf, loff_t size,
 
 	func = read_idmap[read_id] ?: FILE_CHECK;
 	security_task_getsecid_subj(current, &secid);
-	return process_measurement(file, current_cred(), secid, buf, size,
+	return process_measurement(get_current_ns(), file, current_cred(), secid, buf, size,
 				   MAY_READ, func);
 }
 
@@ -869,7 +875,8 @@ int ima_post_load_data(char *buf, loff_t size,
  * has been written to the passed location but not added to a measurement entry,
  * a negative value otherwise.
  */
-int process_buffer_measurement(struct user_namespace *mnt_userns,
+int process_buffer_measurement(struct ima_namespace *ns,
+			       struct user_namespace *mnt_userns,
 			       struct inode *inode, const void *buf, int size,
 			       const char *eventname, enum ima_hooks func,
 			       int pcr, const char *func_data,
@@ -897,7 +904,7 @@ int process_buffer_measurement(struct user_namespace *mnt_userns,
 	if (digest && digest_len < digest_hash_len)
 		return -EINVAL;
 
-	if (!ima_policy_flag && !digest)
+	if (!ns->ima_policy_flag && !digest)
 		return -ENOENT;
 
 	template = ima_template_desc_buf();
@@ -916,7 +923,7 @@ int process_buffer_measurement(struct user_namespace *mnt_userns,
 	 */
 	if (func) {
 		security_task_getsecid_subj(current, &secid);
-		action = ima_get_action(mnt_userns, inode, current_cred(),
+		action = ima_get_action(ns, mnt_userns, inode, current_cred(),
 					secid, 0, func, &pcr, &template,
 					func_data, NULL);
 		if (!(action & IMA_MEASURE) && !digest)
@@ -953,7 +960,7 @@ int process_buffer_measurement(struct user_namespace *mnt_userns,
 	if (digest)
 		memcpy(digest, iint.ima_hash->digest, digest_hash_len);
 
-	if (!ima_policy_flag || (func && !(action & IMA_MEASURE)))
+	if (!ns->ima_policy_flag || (func && !(action & IMA_MEASURE)))
 		return 1;
 
 	ret = ima_alloc_init_template(&event_data, &entry, template);
@@ -996,7 +1003,8 @@ void ima_kexec_cmdline(int kernel_fd, const void *buf, int size)
 	if (!f.file)
 		return;
 
-	process_buffer_measurement(file_mnt_user_ns(f.file), file_inode(f.file),
+	process_buffer_measurement(get_current_ns(),
+				   file_mnt_user_ns(f.file), file_inode(f.file),
 				   buf, size, "kexec-cmdline", KEXEC_CMDLINE, 0,
 				   NULL, false, NULL, 0);
 	fdput(f);
@@ -1029,7 +1037,7 @@ int ima_measure_critical_data(const char *event_label,
 	if (!event_name || !event_label || !buf || !buf_len)
 		return -ENOPARAM;
 
-	return process_buffer_measurement(&init_user_ns, NULL, buf, buf_len,
+	return process_buffer_measurement(get_current_ns(), &init_user_ns, NULL, buf, buf_len,
 					  event_name, CRITICAL_DATA, 0,
 					  event_label, hash, digest,
 					  digest_len);
@@ -1062,7 +1070,7 @@ static int __init init_ima(void)
 		pr_warn("Couldn't register LSM notifier, error %d\n", error);
 
 	if (!error)
-		ima_update_policy_flags();
+		ima_update_policy_flags(&init_ima_ns);
 
 	return error;
 }
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index 4de4a93bc009..709db86f285f 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -67,6 +67,7 @@ struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
 static void destroy_ima_ns(struct ima_namespace *ns)
 {
 	pr_debug("DESTROY ima_ns: 0x%p\n", ns);
+	ima_free_policy_rules(ns);
 	free_ns_status_cache(ns);
 	kmem_cache_free(imans_cachep, ns);
 }
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index e5aef287d14d..96e7d63167e8 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -52,7 +52,6 @@
 #define INVALID_PCR(a) (((a) < 0) || \
 	(a) >= (sizeof_field(struct integrity_iint_cache, measured_pcrs) * 8))
 
-int ima_policy_flag;
 static int temp_ima_appraise;
 static int build_ima_appraise __ro_after_init;
 
@@ -233,11 +232,6 @@ static struct ima_rule_entry critical_data_rules[] __ro_after_init = {
 /* An array of architecture specific rules */
 static struct ima_rule_entry *arch_policy_entry __ro_after_init;
 
-static LIST_HEAD(ima_default_rules);
-static LIST_HEAD(ima_policy_rules);
-static LIST_HEAD(ima_temp_rules);
-static struct list_head __rcu *ima_rules = (struct list_head __rcu *)(&ima_default_rules);
-
 static int ima_policy __initdata;
 
 static int __init default_measure_policy_setup(char *str)
@@ -454,12 +448,12 @@ static bool ima_rule_contains_lsm_cond(struct ima_rule_entry *entry)
  * to the old, stale LSM policy.  Update the IMA LSM based rules to reflect
  * the reloaded LSM policy.
  */
-static void ima_lsm_update_rules(void)
+static void ima_lsm_update_rules(struct ima_namespace *ns)
 {
 	struct ima_rule_entry *entry, *e;
 	int result;
 
-	list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
+	list_for_each_entry_safe(entry, e, &ns->ima_policy_rules, list) {
 		if (!ima_rule_contains_lsm_cond(entry))
 			continue;
 
@@ -477,7 +471,7 @@ int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
 	if (event != LSM_POLICY_CHANGE)
 		return NOTIFY_DONE;
 
-	ima_lsm_update_rules();
+	ima_lsm_update_rules(get_current_ns());
 	return NOTIFY_OK;
 }
 
@@ -688,7 +682,8 @@ static int get_subaction(struct ima_rule_entry *rule, enum ima_hooks func)
  * list when walking it.  Reads are many orders of magnitude more numerous
  * than writes so ima_match_policy() is classical RCU candidate.
  */
-int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
+int ima_match_policy(struct ima_namespace *ns,
+		     struct user_namespace *mnt_userns, struct inode *inode,
 		     const struct cred *cred, u32 secid, enum ima_hooks func,
 		     int mask, int flags, int *pcr,
 		     struct ima_template_desc **template_desc,
@@ -702,7 +697,7 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
 		*template_desc = ima_template_desc_current();
 
 	rcu_read_lock();
-	ima_rules_tmp = rcu_dereference(ima_rules);
+	ima_rules_tmp = rcu_dereference(ns->ima_rules);
 	list_for_each_entry_rcu(entry, ima_rules_tmp, list) {
 
 		if (!(entry->action & actmask))
@@ -760,14 +755,14 @@ int ima_match_policy(struct user_namespace *mnt_userns, struct inode *inode,
  *
  * Context: called after a policy update and at system initialization.
  */
-void ima_update_policy_flags(void)
+void ima_update_policy_flags(struct ima_namespace *ns)
 {
 	struct ima_rule_entry *entry;
 	int new_policy_flag = 0;
 	struct list_head *ima_rules_tmp;
 
 	rcu_read_lock();
-	ima_rules_tmp = rcu_dereference(ima_rules);
+	ima_rules_tmp = rcu_dereference(ns->ima_rules);
 	list_for_each_entry_rcu(entry, ima_rules_tmp, list) {
 		/*
 		 * SETXATTR_CHECK rules do not implement a full policy check
@@ -797,7 +792,7 @@ void ima_update_policy_flags(void)
 	if (!ima_appraise)
 		new_policy_flag &= ~IMA_APPRAISE;
 
-	ima_policy_flag = new_policy_flag;
+	ns->ima_policy_flag = new_policy_flag;
 }
 
 static int ima_appraise_flag(enum ima_hooks func)
@@ -813,7 +808,8 @@ static int ima_appraise_flag(enum ima_hooks func)
 	return 0;
 }
 
-static void add_rules(struct ima_rule_entry *entries, int count,
+static void add_rules(struct ima_namespace *ns,
+		      struct ima_rule_entry *entries, int count,
 		      enum policy_rule_list policy_rule)
 {
 	int i = 0;
@@ -822,7 +818,7 @@ static void add_rules(struct ima_rule_entry *entries, int count,
 		struct ima_rule_entry *entry;
 
 		if (policy_rule & IMA_DEFAULT_POLICY)
-			list_add_tail(&entries[i].list, &ima_default_rules);
+			list_add_tail(&entries[i].list, &ns->ima_default_rules);
 
 		if (policy_rule & IMA_CUSTOM_POLICY) {
 			entry = kmemdup(&entries[i], sizeof(*entry),
@@ -830,7 +826,7 @@ static void add_rules(struct ima_rule_entry *entries, int count,
 			if (!entry)
 				continue;
 
-			list_add_tail(&entry->list, &ima_policy_rules);
+			list_add_tail(&entry->list, &ns->ima_policy_rules);
 		}
 		if (entries[i].action == APPRAISE) {
 			if (entries != build_appraise_rules)
@@ -843,9 +839,10 @@ static void add_rules(struct ima_rule_entry *entries, int count,
 	}
 }
 
-static int ima_parse_rule(char *rule, struct ima_rule_entry *entry);
+static int ima_parse_rule(struct ima_namespace *ns,
+			  char *rule, struct ima_rule_entry *entry);
 
-static int __init ima_init_arch_policy(void)
+static int __init ima_init_arch_policy(struct ima_namespace *ns)
 {
 	const char * const *arch_rules;
 	const char * const *rules;
@@ -873,7 +870,7 @@ static int __init ima_init_arch_policy(void)
 		result = strscpy(rule, *rules, sizeof(rule));
 
 		INIT_LIST_HEAD(&arch_policy_entry[i].list);
-		result = ima_parse_rule(rule, &arch_policy_entry[i]);
+		result = ima_parse_rule(ns, rule, &arch_policy_entry[i]);
 		if (result) {
 			pr_warn("Skipping unknown architecture policy rule: %s\n",
 				rule);
@@ -891,23 +888,23 @@ static int __init ima_init_arch_policy(void)
  *
  * ima_rules points to either the ima_default_rules or the new ima_policy_rules.
  */
-void __init ima_init_policy(void)
+void __init ima_init_policy(struct ima_namespace *ns)
 {
 	int build_appraise_entries, arch_entries;
 
 	/* if !ima_policy, we load NO default rules */
 	if (ima_policy)
-		add_rules(dont_measure_rules, ARRAY_SIZE(dont_measure_rules),
+		add_rules(ns, dont_measure_rules, ARRAY_SIZE(dont_measure_rules),
 			  IMA_DEFAULT_POLICY);
 
 	switch (ima_policy) {
 	case ORIGINAL_TCB:
-		add_rules(original_measurement_rules,
+		add_rules(ns, original_measurement_rules,
 			  ARRAY_SIZE(original_measurement_rules),
 			  IMA_DEFAULT_POLICY);
 		break;
 	case DEFAULT_TCB:
-		add_rules(default_measurement_rules,
+		add_rules(ns, default_measurement_rules,
 			  ARRAY_SIZE(default_measurement_rules),
 			  IMA_DEFAULT_POLICY);
 		break;
@@ -921,11 +918,11 @@ void __init ima_init_policy(void)
 	 * and custom policies, prior to other appraise rules.
 	 * (Highest priority)
 	 */
-	arch_entries = ima_init_arch_policy();
+	arch_entries = ima_init_arch_policy(ns);
 	if (!arch_entries)
 		pr_info("No architecture policies found\n");
 	else
-		add_rules(arch_policy_entry, arch_entries,
+		add_rules(ns, arch_policy_entry, arch_entries,
 			  IMA_DEFAULT_POLICY | IMA_CUSTOM_POLICY);
 
 	/*
@@ -933,7 +930,7 @@ void __init ima_init_policy(void)
 	 * signatures, prior to other appraise rules.
 	 */
 	if (ima_use_secure_boot)
-		add_rules(secure_boot_rules, ARRAY_SIZE(secure_boot_rules),
+		add_rules(ns, secure_boot_rules, ARRAY_SIZE(secure_boot_rules),
 			  IMA_DEFAULT_POLICY);
 
 	/*
@@ -945,32 +942,32 @@ void __init ima_init_policy(void)
 	build_appraise_entries = ARRAY_SIZE(build_appraise_rules);
 	if (build_appraise_entries) {
 		if (ima_use_secure_boot)
-			add_rules(build_appraise_rules, build_appraise_entries,
+			add_rules(ns, build_appraise_rules, build_appraise_entries,
 				  IMA_CUSTOM_POLICY);
 		else
-			add_rules(build_appraise_rules, build_appraise_entries,
+			add_rules(ns, build_appraise_rules, build_appraise_entries,
 				  IMA_DEFAULT_POLICY | IMA_CUSTOM_POLICY);
 	}
 
 	if (ima_use_appraise_tcb)
-		add_rules(default_appraise_rules,
+		add_rules(ns, default_appraise_rules,
 			  ARRAY_SIZE(default_appraise_rules),
 			  IMA_DEFAULT_POLICY);
 
 	if (ima_use_critical_data)
-		add_rules(critical_data_rules,
+		add_rules(ns, critical_data_rules,
 			  ARRAY_SIZE(critical_data_rules),
 			  IMA_DEFAULT_POLICY);
 
 	atomic_set(&ima_setxattr_allowed_hash_algorithms, 0);
 
-	ima_update_policy_flags();
+	ima_update_policy_flags(ns);
 }
 
 /* Make sure we have a valid policy, at least containing some rules. */
-int ima_check_policy(void)
+int ima_check_policy(struct ima_namespace *ns)
 {
-	if (list_empty(&ima_temp_rules))
+	if (list_empty(&ns->ima_temp_rules))
 		return -EINVAL;
 	return 0;
 }
@@ -988,14 +985,14 @@ int ima_check_policy(void)
  */
 void ima_update_policy(struct ima_namespace *ns)
 {
-	struct list_head *policy = &ima_policy_rules;
+	struct list_head *policy = &ns->ima_policy_rules;
 
-	list_splice_tail_init_rcu(&ima_temp_rules, policy, synchronize_rcu);
+	list_splice_tail_init_rcu(&ns->ima_temp_rules, policy, synchronize_rcu);
 
-	if (ima_rules != (struct list_head __rcu *)policy) {
-		ima_policy_flag = 0;
+	if (ns->ima_rules != (struct list_head __rcu *)policy) {
+		ns->ima_policy_flag = 0;
 
-		rcu_assign_pointer(ima_rules, policy);
+		rcu_assign_pointer(ns->ima_rules, policy);
 		/*
 		 * IMA architecture specific policy rules are specified
 		 * as strings and converted to an array of ima_entry_rules
@@ -1004,7 +1001,7 @@ void ima_update_policy(struct ima_namespace *ns)
 		 */
 		kfree(arch_policy_entry);
 	}
-	ima_update_policy_flags();
+	ima_update_policy_flags(ns);
 
 	/* Custom IMA policy has been loaded */
 	ima_process_queued_keys(ns);
@@ -1077,7 +1074,8 @@ static const match_table_t policy_tokens = {
 	{Opt_err, NULL}
 };
 
-static int ima_lsm_rule_init(struct ima_rule_entry *entry,
+static int ima_lsm_rule_init(struct ima_namespace *ns,
+			     struct ima_rule_entry *entry,
 			     substring_t *args, int lsm_rule, int audit_type)
 {
 	int result;
@@ -1097,7 +1095,7 @@ static int ima_lsm_rule_init(struct ima_rule_entry *entry,
 		pr_warn("rule for LSM \'%s\' is undefined\n",
 			entry->lsm[lsm_rule].args_p);
 
-		if (ima_rules == (struct list_head __rcu *)(&ima_default_rules)) {
+		if (ns->ima_rules == (struct list_head __rcu *)(&ns->ima_default_rules)) {
 			kfree(entry->lsm[lsm_rule].args_p);
 			entry->lsm[lsm_rule].args_p = NULL;
 			result = -EINVAL;
@@ -1324,7 +1322,8 @@ static unsigned int ima_parse_appraise_algos(char *arg)
 	return res;
 }
 
-static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
+static int ima_parse_rule(struct ima_namespace *ns,
+			  char *rule, struct ima_rule_entry *entry)
 {
 	struct audit_buffer *ab;
 	char *from;
@@ -1674,37 +1673,37 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
 			break;
 		case Opt_obj_user:
 			ima_log_string(ab, "obj_user", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_OBJ_USER,
 						   AUDIT_OBJ_USER);
 			break;
 		case Opt_obj_role:
 			ima_log_string(ab, "obj_role", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_OBJ_ROLE,
 						   AUDIT_OBJ_ROLE);
 			break;
 		case Opt_obj_type:
 			ima_log_string(ab, "obj_type", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_OBJ_TYPE,
 						   AUDIT_OBJ_TYPE);
 			break;
 		case Opt_subj_user:
 			ima_log_string(ab, "subj_user", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_SUBJ_USER,
 						   AUDIT_SUBJ_USER);
 			break;
 		case Opt_subj_role:
 			ima_log_string(ab, "subj_role", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_SUBJ_ROLE,
 						   AUDIT_SUBJ_ROLE);
 			break;
 		case Opt_subj_type:
 			ima_log_string(ab, "subj_type", args[0].from);
-			result = ima_lsm_rule_init(entry, args,
+			result = ima_lsm_rule_init(ns, entry, args,
 						   LSM_SUBJ_TYPE,
 						   AUDIT_SUBJ_TYPE);
 			break;
@@ -1810,7 +1809,7 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
  * Avoid locking by allowing just one writer at a time in ima_write_policy()
  * Returns the length of the rule parsed, an error code on failure
  */
-ssize_t ima_parse_add_rule(char *rule)
+ssize_t ima_parse_add_rule(struct ima_namespace *ns, char *rule)
 {
 	static const char op[] = "update_policy";
 	char *p;
@@ -1834,7 +1833,7 @@ ssize_t ima_parse_add_rule(char *rule)
 
 	INIT_LIST_HEAD(&entry->list);
 
-	result = ima_parse_rule(p, entry);
+	result = ima_parse_rule(ns, p, entry);
 	if (result) {
 		ima_free_rule(entry);
 		integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL,
@@ -1843,7 +1842,7 @@ ssize_t ima_parse_add_rule(char *rule)
 		return result;
 	}
 
-	list_add_tail(&entry->list, &ima_temp_rules);
+	list_add_tail(&entry->list, &ns->ima_temp_rules);
 
 	return len;
 }
@@ -1854,12 +1853,24 @@ ssize_t ima_parse_add_rule(char *rule)
  * different from the active one.  There is also only one user of
  * ima_delete_rules() at a time.
  */
-void ima_delete_rules(void)
+void ima_delete_rules(struct ima_namespace *ns)
 {
 	struct ima_rule_entry *entry, *tmp;
 
 	temp_ima_appraise = 0;
-	list_for_each_entry_safe(entry, tmp, &ima_temp_rules, list) {
+	list_for_each_entry_safe(entry, tmp, &ns->ima_temp_rules, list) {
+		list_del(&entry->list);
+		ima_free_rule(entry);
+	}
+}
+
+void ima_free_policy_rules(struct ima_namespace *ns)
+{
+	struct ima_rule_entry *entry, *tmp;
+
+	ima_delete_rules(ns);
+
+	list_for_each_entry_safe(entry, tmp, &ns->ima_policy_rules, list) {
 		list_del(&entry->list);
 		ima_free_rule(entry);
 	}
@@ -1890,7 +1901,7 @@ void *ima_policy_start(struct seq_file *m, loff_t *pos)
 	struct list_head *ima_rules_tmp;
 
 	rcu_read_lock();
-	ima_rules_tmp = rcu_dereference(ima_rules);
+	ima_rules_tmp = rcu_dereference(get_current_ns()->ima_rules);
 	list_for_each_entry_rcu(entry, ima_rules_tmp, list) {
 		if (!l--) {
 			rcu_read_unlock();
@@ -1904,14 +1915,15 @@ void *ima_policy_start(struct seq_file *m, loff_t *pos)
 void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos)
 {
 	struct ima_rule_entry *entry = v;
+	struct ima_namespace *ns = get_current_ns();
 
 	rcu_read_lock();
 	entry = list_entry_rcu(entry->list.next, struct ima_rule_entry, list);
 	rcu_read_unlock();
 	(*pos)++;
 
-	return (&entry->list == &ima_default_rules ||
-		&entry->list == &ima_policy_rules) ? NULL : entry;
+	return (&entry->list == &ns->ima_default_rules ||
+		&entry->list == &ns->ima_policy_rules) ? NULL : entry;
 }
 
 void ima_policy_stop(struct seq_file *m, void *v)
@@ -2177,7 +2189,7 @@ bool ima_appraise_signature(enum kernel_read_file_id id)
 	func = read_idmap[id] ?: FILE_CHECK;
 
 	rcu_read_lock();
-	ima_rules_tmp = rcu_dereference(ima_rules);
+	ima_rules_tmp = rcu_dereference(get_current_ns()->ima_rules);
 	list_for_each_entry_rcu(entry, ima_rules_tmp, list) {
 		if (entry->action != APPRAISE)
 			continue;
diff --git a/security/integrity/ima/ima_queue_keys.c b/security/integrity/ima/ima_queue_keys.c
index 9e5e9a178253..14f334272160 100644
--- a/security/integrity/ima/ima_queue_keys.c
+++ b/security/integrity/ima/ima_queue_keys.c
@@ -142,7 +142,7 @@ void ima_process_queued_keys(struct ima_namespace *ns)
 
 	list_for_each_entry_safe(entry, tmp, &ns->ima_keys, list) {
 		if (!ns->timer_expired)
-			process_buffer_measurement(&init_user_ns, NULL,
+			process_buffer_measurement(ns, &init_user_ns, NULL,
 						   entry->payload,
 						   entry->payload_len,
 						   entry->keyring_name,
-- 
2.31.1



* [RFC 07/20] ima: Move ima_htable into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (5 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 06/20] ima: Move policy " Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 08/20] ima: Move measurement list related variables " Stefan Berger
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move ima_htable into ima_namespace. This way a front-end like
SecurityFS can show the number of violations of an IMA namespace.

Move ima_hash_key() into ima_queue.c since it's only used there.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                      | 11 +++++++
 security/integrity/ima/ima.h             | 34 +++++++------------
 security/integrity/ima/ima_api.c         | 17 ++++++----
 security/integrity/ima/ima_fs.c          |  7 ++--
 security/integrity/ima/ima_init.c        |  6 ++--
 security/integrity/ima/ima_init_ima_ns.c |  4 +++
 security/integrity/ima/ima_main.c        | 13 ++++----
 security/integrity/ima/ima_queue.c       | 42 ++++++++++++++----------
 security/integrity/ima/ima_template.c    |  4 +--
 9 files changed, 78 insertions(+), 60 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index e782b00710ad..96254dfacfa0 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -212,6 +212,15 @@ static inline int ima_inode_removexattr(struct dentry *dentry,
 }
 #endif /* CONFIG_IMA_APPRAISE */
 
+#define IMA_HASH_BITS 10
+#define IMA_MEASURE_HTABLE_SIZE (1 << IMA_HASH_BITS)
+
+struct ima_h_table {
+	atomic_long_t len;	/* number of stored measurements in the list */
+	atomic_long_t violations;
+	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
+};
+
 struct ima_namespace {
 	struct kref kref;
 	struct user_namespace *user_ns;
@@ -249,6 +258,8 @@ struct ima_namespace {
 	struct list_head __rcu *ima_rules;
 	/* current content of the policy */
 	int ima_policy_flag;
+
+	struct ima_h_table ima_htable;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index e295141f2478..a7e6c8fb152a 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -32,9 +32,6 @@ enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8, TPM_PCR10 = 10 };
 #define IMA_DIGEST_SIZE		SHA1_DIGEST_SIZE
 #define IMA_EVENT_NAME_LEN_MAX	255
 
-#define IMA_HASH_BITS 10
-#define IMA_MEASURE_HTABLE_SIZE (1 << IMA_HASH_BITS)
-
 #define IMA_TEMPLATE_FIELD_ID_MAX_LEN	16
 #define IMA_TEMPLATE_NUM_FIELDS_MAX	15
 
@@ -143,7 +140,8 @@ struct ns_status {
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
-int ima_add_template_entry(struct ima_template_entry *entry, int violation,
+int ima_add_template_entry(struct ima_namespace *ns,
+			   struct ima_template_entry *entry, int violation,
 			   const char *op, struct inode *inode,
 			   const unsigned char *filename);
 int ima_calc_file_hash(struct file *file, struct ima_digest_data *hash);
@@ -152,7 +150,8 @@ int ima_calc_buffer_hash(const void *buf, loff_t len,
 int ima_calc_field_array_hash(struct ima_field_data *field_data,
 			      struct ima_template_entry *entry);
 int ima_calc_boot_aggregate(struct ima_digest_data *hash);
-void ima_add_violation(struct file *file, const unsigned char *filename,
+void ima_add_violation(struct ima_namespace *ns,
+		       struct file *file, const unsigned char *filename,
 		       struct integrity_iint_cache *iint,
 		       const char *op, const char *cause);
 int ima_init_crypto(void);
@@ -165,8 +164,10 @@ struct ima_template_desc *ima_template_desc_current(void);
 struct ima_template_desc *ima_template_desc_buf(void);
 struct ima_template_desc *lookup_template_desc(const char *name);
 bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
-int ima_restore_measurement_entry(struct ima_template_entry *entry);
-int ima_restore_measurement_list(loff_t bufsize, void *buf);
+int ima_restore_measurement_entry(struct ima_namespace *ns,
+				  struct ima_template_entry *entry);
+int ima_restore_measurement_list(struct ima_namespace *ns,
+				 loff_t bufsize, void *buf);
 int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
@@ -180,19 +181,6 @@ int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
  */
 extern spinlock_t ima_queue_lock;
 
-struct ima_h_table {
-	atomic_long_t len;	/* number of stored measurements in the list */
-	atomic_long_t violations;
-	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
-};
-extern struct ima_h_table ima_htable;
-
-static inline unsigned int ima_hash_key(u8 *digest)
-{
-	/* there is no point in taking a hash of part of a digest */
-	return (digest[0] | digest[1] << 8) % IMA_MEASURE_HTABLE_SIZE;
-}
-
 #define __ima_hooks(hook)				\
 	hook(NONE, none)				\
 	hook(FILE_CHECK, file)				\
@@ -272,7 +260,8 @@ int ima_must_measure(struct inode *inode, int mask, enum ima_hooks func);
 int ima_collect_measurement(struct integrity_iint_cache *iint,
 			    struct file *file, void *buf, loff_t size,
 			    enum hash_algo algo, struct modsig *modsig);
-void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
+void ima_store_measurement(struct ima_namespace *ns,
+			   struct integrity_iint_cache *iint, struct file *file,
 			   const unsigned char *filename,
 			   struct evm_ima_xattr_data *xattr_value,
 			   int xattr_len, const struct modsig *modsig, int pcr,
@@ -289,7 +278,8 @@ void ima_audit_measurement(struct integrity_iint_cache *iint,
 int ima_alloc_init_template(struct ima_event_data *event_data,
 			    struct ima_template_entry **entry,
 			    struct ima_template_desc *template_desc);
-int ima_store_template(struct ima_template_entry *entry, int violation,
+int ima_store_template(struct ima_namespace *ns,
+		       struct ima_template_entry *entry, int violation,
 		       struct inode *inode,
 		       const unsigned char *filename, int pcr);
 void ima_free_template_entry(struct ima_template_entry *entry);
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 808aec56dbb6..71c5517fe8bc 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -100,7 +100,8 @@ int ima_alloc_init_template(struct ima_event_data *event_data,
  *
  * Returns 0 on success, error code otherwise
  */
-int ima_store_template(struct ima_template_entry *entry,
+int ima_store_template(struct ima_namespace *ns,
+		       struct ima_template_entry *entry,
 		       int violation, struct inode *inode,
 		       const unsigned char *filename, int pcr)
 {
@@ -120,7 +121,7 @@ int ima_store_template(struct ima_template_entry *entry,
 		}
 	}
 	entry->pcr = pcr;
-	result = ima_add_template_entry(entry, violation, op, inode, filename);
+	result = ima_add_template_entry(ns, entry, violation, op, inode, filename);
 	return result;
 }
 
@@ -131,7 +132,8 @@ int ima_store_template(struct ima_template_entry *entry,
  * By extending the PCR with 0xFF's instead of with zeroes, the PCR
  * value is invalidated.
  */
-void ima_add_violation(struct file *file, const unsigned char *filename,
+void ima_add_violation(struct ima_namespace *ns,
+		       struct file *file, const unsigned char *filename,
 		       struct integrity_iint_cache *iint,
 		       const char *op, const char *cause)
 {
@@ -145,14 +147,14 @@ void ima_add_violation(struct file *file, const unsigned char *filename,
 	int result;
 
 	/* can overflow, only indicator */
-	atomic_long_inc(&ima_htable.violations);
+	atomic_long_inc(&ns->ima_htable.violations);
 
 	result = ima_alloc_init_template(&event_data, &entry, NULL);
 	if (result < 0) {
 		result = -ENOMEM;
 		goto err_out;
 	}
-	result = ima_store_template(entry, violation, inode,
+	result = ima_store_template(ns, entry, violation, inode,
 				    filename, CONFIG_IMA_MEASURE_PCR_IDX);
 	if (result < 0)
 		ima_free_template_entry(entry);
@@ -299,7 +301,8 @@ int ima_collect_measurement(struct integrity_iint_cache *iint,
  *
  * Must be called with iint->mutex held.
  */
-void ima_store_measurement(struct integrity_iint_cache *iint,
+void ima_store_measurement(struct ima_namespace *ns,
+			   struct integrity_iint_cache *iint,
 			   struct file *file, const unsigned char *filename,
 			   struct evm_ima_xattr_data *xattr_value,
 			   int xattr_len, const struct modsig *modsig, int pcr,
@@ -334,7 +337,7 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
 		return;
 	}
 
-	result = ima_store_template(entry, violation, inode, filename, pcr);
+	result = ima_store_template(ns, entry, violation, inode, filename, pcr);
 	if ((!result || result == -EEXIST) && !(file->f_flags & O_DIRECT)) {
 		iint->flags |= IMA_MEASURED;
 		iint->measured_pcrs |= (0x1 << pcr);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index fc0413c8c358..9df8648ad64d 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -53,7 +53,9 @@ static ssize_t ima_show_htable_violations(struct file *filp,
 					  char __user *buf,
 					  size_t count, loff_t *ppos)
 {
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.violations);
+	struct ima_namespace *ns = get_current_ns();
+
+	return ima_show_htable_value(buf, count, ppos, &ns->ima_htable.violations);
 }
 
 static const struct file_operations ima_htable_violations_ops = {
@@ -65,8 +67,9 @@ static ssize_t ima_show_measurements_count(struct file *filp,
 					   char __user *buf,
 					   size_t count, loff_t *ppos)
 {
-	return ima_show_htable_value(buf, count, ppos, &ima_htable.len);
+	struct ima_namespace *ns = get_current_ns();
 
+	return ima_show_htable_value(buf, count, ppos, &ns->ima_htable.len);
 }
 
 static const struct file_operations ima_measurements_count_ops = {
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index 2ec9a22bbddf..6104d5116a7f 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -39,7 +39,7 @@ struct tpm_chip *ima_tpm_chip;
  * a different value.) Violations add a zero entry to the measurement
  * list and extend the aggregate PCR value with ff...ff's.
  */
-static int __init ima_add_boot_aggregate(void)
+static int __init ima_add_boot_aggregate(struct ima_namespace *ns)
 {
 	static const char op[] = "add_boot_aggregate";
 	const char *audit_cause = "ENOMEM";
@@ -86,7 +86,7 @@ static int __init ima_add_boot_aggregate(void)
 		goto err_out;
 	}
 
-	result = ima_store_template(entry, violation, NULL,
+	result = ima_store_template(ns, entry, violation, NULL,
 				    boot_aggregate_name,
 				    CONFIG_IMA_MEASURE_PCR_IDX);
 	if (result < 0) {
@@ -145,7 +145,7 @@ int __init ima_init(void)
 	rc = ima_init_digests();
 	if (rc != 0)
 		return rc;
-	rc = ima_add_boot_aggregate();	/* boot aggregate must be first entry */
+	rc = ima_add_boot_aggregate(&init_ima_ns);	/* boot aggregate must be first entry */
 	if (rc != 0)
 		return rc;
 
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 2d644791a795..e13adc3287ed 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -40,6 +40,10 @@ int ima_init_namespace(struct ima_namespace *ns)
 	ns->ima_rules = (struct list_head __rcu *)(&ns->ima_default_rules);
 	ns->ima_policy_flag = 0;
 
+	atomic_long_set(&ns->ima_htable.len, 0);
+	atomic_long_set(&ns->ima_htable.violations, 0);
+	memset(&ns->ima_htable.queue, 0, sizeof(ns->ima_htable.queue));
+
 	return 0;
 }
 
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 9cf1fd7c70bf..d692c9d53a98 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -112,7 +112,8 @@ static int mmap_violation_check(enum ima_hooks func, struct file *file,
  *	  could result in a file measurement error.
  *
  */
-static void ima_rdwr_violation_check(struct file *file,
+static void ima_rdwr_violation_check(struct ima_namespace *ns,
+				     struct file *file,
 				     struct integrity_iint_cache *iint,
 				     int must_measure,
 				     char **pathbuf,
@@ -145,10 +146,10 @@ static void ima_rdwr_violation_check(struct file *file,
 	*pathname = ima_d_path(&file->f_path, pathbuf, filename);
 
 	if (send_tomtou)
-		ima_add_violation(file, *pathname, iint,
+		ima_add_violation(ns, file, *pathname, iint,
 				  "invalid_pcr", "ToMToU");
 	if (send_writers)
-		ima_add_violation(file, *pathname, iint,
+		ima_add_violation(ns, file, *pathname, iint,
 				  "invalid_pcr", "open_writers");
 }
 
@@ -256,7 +257,7 @@ static int process_measurement(struct ima_namespace *ns,
 	}
 
 	if (!rc && violation_check)
-		ima_rdwr_violation_check(file, iint, action & IMA_MEASURE,
+		ima_rdwr_violation_check(ns, file, iint, action & IMA_MEASURE,
 					 &pathbuf, &pathname, filename);
 
 	inode_unlock(inode);
@@ -353,7 +354,7 @@ static int process_measurement(struct ima_namespace *ns,
 		pathname = ima_d_path(&file->f_path, &pathbuf, filename);
 
 	if (action & IMA_MEASURE)
-		ima_store_measurement(iint, file, pathname,
+		ima_store_measurement(ns, iint, file, pathname,
 				      xattr_value, xattr_len, modsig, pcr,
 				      template_desc);
 	if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
@@ -969,7 +970,7 @@ int process_buffer_measurement(struct ima_namespace *ns,
 		goto out;
 	}
 
-	ret = ima_store_template(entry, violation, NULL, event_data.buf, pcr);
+	ret = ima_store_template(ns, entry, violation, NULL, event_data.buf, pcr);
 	if (ret < 0) {
 		audit_cause = "store_entry";
 		ima_free_template_entry(entry);
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 532da87ce519..373154039b91 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -17,6 +17,7 @@
 
 #include <linux/rculist.h>
 #include <linux/slab.h>
+#include <linux/ima.h>
 #include "ima.h"
 
 #define AUDIT_CAUSE_LEN_MAX 32
@@ -31,21 +32,22 @@ static unsigned long binary_runtime_size;
 static unsigned long binary_runtime_size = ULONG_MAX;
 #endif
 
-/* key: inode (before secure-hashing a file) */
-struct ima_h_table ima_htable = {
-	.len = ATOMIC_LONG_INIT(0),
-	.violations = ATOMIC_LONG_INIT(0),
-	.queue[0 ... IMA_MEASURE_HTABLE_SIZE - 1] = HLIST_HEAD_INIT
-};
-
 /* mutex protects atomicity of extending measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
  * long (and the tpm driver uses a mutex), we can't use the spinlock.
  */
 static DEFINE_MUTEX(ima_extend_list_mutex);
 
+
+static inline unsigned int ima_hash_key(u8 *digest)
+{
+	/* there is no point in taking a hash of part of a digest */
+	return (digest[0] | digest[1] << 8) % IMA_MEASURE_HTABLE_SIZE;
+}
+
 /* lookup up the digest value in the hash table, and return the entry */
-static struct ima_queue_entry *ima_lookup_digest_entry(u8 *digest_value,
+static struct ima_queue_entry *ima_lookup_digest_entry(struct ima_namespace *ns,
+						       u8 *digest_value,
 						       int pcr)
 {
 	struct ima_queue_entry *qe, *ret = NULL;
@@ -54,7 +56,7 @@ static struct ima_queue_entry *ima_lookup_digest_entry(u8 *digest_value,
 
 	key = ima_hash_key(digest_value);
 	rcu_read_lock();
-	hlist_for_each_entry_rcu(qe, &ima_htable.queue[key], hnext) {
+	hlist_for_each_entry_rcu(qe, &ns->ima_htable.queue[key], hnext) {
 		rc = memcmp(qe->entry->digests[ima_hash_algo_idx].digest,
 			    digest_value, hash_digest_size[ima_hash_algo]);
 		if ((rc == 0) && (qe->entry->pcr == pcr)) {
@@ -90,7 +92,8 @@ static int get_binary_runtime_size(struct ima_template_entry *entry)
  *
  * (Called with ima_extend_list_mutex held.)
  */
-static int ima_add_digest_entry(struct ima_template_entry *entry,
+static int ima_add_digest_entry(struct ima_namespace *ns,
+				struct ima_template_entry *entry,
 				bool update_htable)
 {
 	struct ima_queue_entry *qe;
@@ -106,11 +109,12 @@ static int ima_add_digest_entry(struct ima_template_entry *entry,
 	INIT_LIST_HEAD(&qe->later);
 	list_add_tail_rcu(&qe->later, &ima_measurements);
 
-	atomic_long_inc(&ima_htable.len);
+	atomic_long_inc(&ns->ima_htable.len);
 	if (update_htable) {
 		key = ima_hash_key(entry->digests[ima_hash_algo_idx].digest);
-		hlist_add_head_rcu(&qe->hnext, &ima_htable.queue[key]);
-	}
+		hlist_add_head_rcu(&qe->hnext, &ns->ima_htable.queue[key]);
+	} else
+		INIT_HLIST_NODE(&qe->hnext);
 
 	if (binary_runtime_size != ULONG_MAX) {
 		int size;
@@ -156,7 +160,8 @@ static int ima_pcr_extend(struct tpm_digest *digests_arg, int pcr)
  * kexec, maintain the total memory size required for serializing the
  * binary_runtime_measurements.
  */
-int ima_add_template_entry(struct ima_template_entry *entry, int violation,
+int ima_add_template_entry(struct ima_namespace *ns,
+			   struct ima_template_entry *entry, int violation,
 			   const char *op, struct inode *inode,
 			   const unsigned char *filename)
 {
@@ -169,14 +174,14 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
 
 	mutex_lock(&ima_extend_list_mutex);
 	if (!violation && !IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE)) {
-		if (ima_lookup_digest_entry(digest, entry->pcr)) {
+		if (ima_lookup_digest_entry(ns, digest, entry->pcr)) {
 			audit_cause = "hash_exists";
 			result = -EEXIST;
 			goto out;
 		}
 	}
 
-	result = ima_add_digest_entry(entry,
+	result = ima_add_digest_entry(ns, entry,
 				      !IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE));
 	if (result < 0) {
 		audit_cause = "ENOMEM";
@@ -201,12 +206,13 @@ int ima_add_template_entry(struct ima_template_entry *entry, int violation,
 	return result;
 }
 
-int ima_restore_measurement_entry(struct ima_template_entry *entry)
+int ima_restore_measurement_entry(struct ima_namespace *ns,
+				  struct ima_template_entry *entry)
 {
 	int result = 0;
 
 	mutex_lock(&ima_extend_list_mutex);
-	result = ima_add_digest_entry(entry, 0);
+	result = ima_add_digest_entry(ns, entry, 0);
 	mutex_unlock(&ima_extend_list_mutex);
 	return result;
 }
diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
index 694560396be0..2ae87eb23a59 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -400,7 +400,7 @@ static int ima_restore_template_data(struct ima_template_desc *template_desc,
 }
 
 /* Restore the serialized binary measurement list without extending PCRs. */
-int ima_restore_measurement_list(loff_t size, void *buf)
+int ima_restore_measurement_list(struct ima_namespace *ns, loff_t size, void *buf)
 {
 	char template_name[MAX_TEMPLATE_NAME_LEN];
 	unsigned char zero[TPM_DIGEST_SIZE] = { 0 };
@@ -516,7 +516,7 @@ int ima_restore_measurement_list(loff_t size, void *buf)
 
 		entry->pcr = !ima_canonical_fmt ? *(u32 *)(hdr[HDR_PCR].data) :
 			     le32_to_cpu(*(__le32 *)(hdr[HDR_PCR].data));
-		ret = ima_restore_measurement_entry(entry);
+		ret = ima_restore_measurement_entry(ns, entry);
 		if (ret < 0)
 			break;
 
-- 
2.31.1



* [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (6 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 07/20] ima: Move ima_htable " Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-12-02 12:46   ` James Bottomley
  2021-11-30 16:06 ` [RFC 09/20] ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for now Stefan Berger
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move measurement list related variables into the ima_namespace. This way a
front-end like SecurityFS can show the measurement list inside an IMA
namespace.

Implement ima_free_measurements() to free a list of measurements
and call it when an IMA namespace is deleted.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                      |  2 ++
 security/integrity/ima/ima.h             |  4 +--
 security/integrity/ima/ima_fs.c          |  6 +++--
 security/integrity/ima/ima_init_ima_ns.c |  5 ++++
 security/integrity/ima/ima_ns.c          |  1 +
 security/integrity/ima/ima_queue.c       | 33 ++++++++++++++----------
 6 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 96254dfacfa0..850a513834d2 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -260,6 +260,8 @@ struct ima_namespace {
 	int ima_policy_flag;
 
 	struct ima_h_table ima_htable;
+	struct list_head ima_measurements;
+	unsigned long binary_runtime_size;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index a7e6c8fb152a..bb9763cd5fb1 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -104,7 +104,6 @@ struct ima_queue_entry {
 	struct list_head later;		/* place in ima_measurements list */
 	struct ima_template_entry *entry;
 };
-extern struct list_head ima_measurements;	/* list of all measurements */
 
 /* Some details preceding the binary serialized measurement list */
 struct ima_kexec_hdr {
@@ -168,8 +167,9 @@ int ima_restore_measurement_entry(struct ima_namespace *ns,
 				  struct ima_template_entry *entry);
 int ima_restore_measurement_list(struct ima_namespace *ns,
 				 loff_t bufsize, void *buf);
+void ima_free_measurements(struct ima_namespace *ns);
 int ima_measurements_show(struct seq_file *m, void *v);
-unsigned long ima_get_binary_runtime_size(void);
+unsigned long ima_get_binary_runtime_size(struct ima_namespace *ns);
 int ima_init_template(void);
 void ima_init_template_list(void);
 int __init ima_init_digests(void);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 9df8648ad64d..c35e15fb313f 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -80,12 +80,13 @@ static const struct file_operations ima_measurements_count_ops = {
 /* returns pointer to hlist_node */
 static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
 {
+	struct ima_namespace *ns = get_current_ns();
 	loff_t l = *pos;
 	struct ima_queue_entry *qe;
 
 	/* we need a lock since pos could point beyond last element */
 	rcu_read_lock();
-	list_for_each_entry_rcu(qe, &ima_measurements, later) {
+	list_for_each_entry_rcu(qe, &ns->ima_measurements, later) {
 		if (!l--) {
 			rcu_read_unlock();
 			return qe;
@@ -97,6 +98,7 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
 
 static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
 {
+	struct ima_namespace *ns = get_current_ns();
 	struct ima_queue_entry *qe = v;
 
 	/* lock protects when reading beyond last element
@@ -107,7 +109,7 @@ static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
 	rcu_read_unlock();
 	(*pos)++;
 
-	return (&qe->later == &ima_measurements) ? NULL : qe;
+	return (&qe->later == &ns->ima_measurements) ? NULL : qe;
 }
 
 static void ima_measurements_stop(struct seq_file *m, void *v)
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index e13adc3287ed..57e46a10c001 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -43,6 +43,11 @@ int ima_init_namespace(struct ima_namespace *ns)
 	atomic_long_set(&ns->ima_htable.len, 0);
 	atomic_long_set(&ns->ima_htable.violations, 0);
 	memset(&ns->ima_htable.queue, 0, sizeof(ns->ima_htable.queue));
+	INIT_LIST_HEAD(&ns->ima_measurements);
+	if (IS_ENABLED(CONFIG_IMA_KEXEC) && ns == &init_ima_ns)
+		ns->binary_runtime_size = 0;
+	else
+		ns->binary_runtime_size = ULONG_MAX;
 
 	return 0;
 }
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index 709db86f285f..e4f4cf84a6b5 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -69,6 +69,7 @@ static void destroy_ima_ns(struct ima_namespace *ns)
 	pr_debug("DESTROY ima_ns: 0x%p\n", ns);
 	ima_free_policy_rules(ns);
 	free_ns_status_cache(ns);
+	ima_free_measurements(ns);
 	kmem_cache_free(imans_cachep, ns);
 }
 
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 373154039b91..f15f776918ec 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -25,13 +25,6 @@
 /* pre-allocated array of tpm_digest structures to extend a PCR */
 static struct tpm_digest *digests;
 
-LIST_HEAD(ima_measurements);	/* list of all measurements */
-#ifdef CONFIG_IMA_KEXEC
-static unsigned long binary_runtime_size;
-#else
-static unsigned long binary_runtime_size = ULONG_MAX;
-#endif
-
 /* mutex protects atomicity of extending measurement list
  * and extending the TPM PCR aggregate. Since tpm_extend can take
  * long (and the tpm driver uses a mutex), we can't use the spinlock.
@@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
 	qe->entry = entry;
 
 	INIT_LIST_HEAD(&qe->later);
-	list_add_tail_rcu(&qe->later, &ima_measurements);
+	list_add_tail_rcu(&qe->later, &ns->ima_measurements);
 
 	atomic_long_inc(&ns->ima_htable.len);
 	if (update_htable) {
@@ -116,12 +109,12 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
 	} else
 		INIT_HLIST_NODE(&qe->hnext);
 
-	if (binary_runtime_size != ULONG_MAX) {
+	if (ns->binary_runtime_size != ULONG_MAX) {
 		int size;
 
 		size = get_binary_runtime_size(entry);
-		binary_runtime_size = (binary_runtime_size < ULONG_MAX - size) ?
-		     binary_runtime_size + size : ULONG_MAX;
+		ns->binary_runtime_size = (ns->binary_runtime_size < ULONG_MAX - size) ?
+		     ns->binary_runtime_size + size : ULONG_MAX;
 	}
 	return 0;
 }
@@ -131,12 +124,12 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
  * entire binary_runtime_measurement list, including the ima_kexec_hdr
  * structure.
  */
-unsigned long ima_get_binary_runtime_size(void)
+unsigned long ima_get_binary_runtime_size(struct ima_namespace *ns)
 {
-	if (binary_runtime_size >= (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
+	if (ns->binary_runtime_size >= (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
 		return ULONG_MAX;
 	else
-		return binary_runtime_size + sizeof(struct ima_kexec_hdr);
+		return ns->binary_runtime_size + sizeof(struct ima_kexec_hdr);
 }
 
 static int ima_pcr_extend(struct tpm_digest *digests_arg, int pcr)
@@ -217,6 +210,18 @@ int ima_restore_measurement_entry(struct ima_namespace *ns,
 	return result;
 }
 
+void ima_free_measurements(struct ima_namespace *ns)
+{
+	struct ima_queue_entry *qe, *tmp;
+
+	list_for_each_entry_safe(qe, tmp, &ns->ima_measurements, later) {
+		list_del(&qe->later);
+		if (!hlist_unhashed(&qe->hnext))
+			hlist_del(&qe->hnext);
+		kfree(qe);
+	}
+}
+
 int __init ima_init_digests(void)
 {
 	u16 digest_size;
-- 
2.31.1



* [RFC 09/20] ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for now
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (7 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 08/20] ima: Move measurement list related variables " Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 10/20] ima: Implement hierarchical processing of file accesses Stefan Berger
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Only accept AUDIT rules for non-init_ima_ns namespaces, rejecting all rules
that require support for measurement, appraisal, and hashing.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/integrity/ima/ima_policy.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 96e7d63167e8..02e96da2faff 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -1785,6 +1785,16 @@ static int ima_parse_rule(struct ima_namespace *ns,
 			result = -EINVAL;
 			break;
 		}
+
+		/* IMA namespace only accepts AUDIT rules */
+		if (ns != &init_ima_ns) {
+			switch (entry->action) {
+			case MEASURE:
+			case APPRAISE:
+			case HASH:
+				result = -EINVAL;
+			}
+		}
 	}
 	if (!result && !ima_validate_rule(entry))
 		result = -EINVAL;
-- 
2.31.1



* [RFC 10/20] ima: Implement hierarchical processing of file accesses
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (8 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 09/20] ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for now Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 11/20] securityfs: Prefix global variables with securityfs_ Stefan Berger
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Implement hierarchical processing of file accesses in IMA namespaces by
walking the list of IMA namespaces towards the init_ima_ns. This way
file accesses can be audited in an IMA namespace and also be evaluated
against the IMA policies of parent IMA namespaces.
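
The walk towards the initial namespace can be sketched in userspace as
follows. The toy_* names, the 'deny' flag, and the 'visited' counter are
assumptions made only for this illustration. The sketch links namespaces
directly through a parent pointer, whereas the patch follows
ns->user_ns->parent:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an IMA namespace and its place in the tree. */
struct toy_ns {
	struct toy_ns *parent;	/* NULL for the initial namespace */
	int deny;		/* non-zero: this level rejects the access */
	int visited;		/* how often this level was consulted */
};

/* Evaluate the access against a single namespace's policy. */
static int toy_process_one(struct toy_ns *ns)
{
	ns->visited++;
	return ns->deny ? -1 : 0;
}

/*
 * Evaluate the access in 'ns' and then in each ancestor, stopping at the
 * first error or once the initial namespace has been processed.
 */
static int toy_process(struct toy_ns *ns)
{
	int ret = 0;

	assert(ns != NULL);

	for (; ns != NULL; ns = ns->parent) {
		ret = toy_process_one(ns);
		if (ret)
			break;
	}
	return ret;
}
```

An error at any level ends the walk, so ancestors above the failing
namespace are not consulted for that access.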

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/integrity/ima/ima_main.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index d692c9d53a98..42cbcaf2dc1e 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -199,10 +199,10 @@ void ima_file_free(struct file *file)
 	ima_check_last_writer(iint, inode, file);
 }
 
-static int process_measurement(struct ima_namespace *ns,
-			       struct file *file, const struct cred *cred,
-			       u32 secid, char *buf, loff_t size, int mask,
-			       enum ima_hooks func)
+static int _process_measurement(struct ima_namespace *ns,
+				struct file *file, const struct cred *cred,
+				u32 secid, char *buf, loff_t size, int mask,
+				enum ima_hooks func)
 {
 	struct inode *inode = file_inode(file);
 	struct integrity_iint_cache *iint = NULL;
@@ -404,6 +404,27 @@ static int process_measurement(struct ima_namespace *ns,
 	return 0;
 }
 
+static int process_measurement(struct ima_namespace *ns,
+			       struct file *file, const struct cred *cred,
+			       u32 secid, char *buf, loff_t size, int mask,
+			       enum ima_hooks func)
+{
+	int ret = 0;
+	struct user_namespace *user_ns;
+
+	do {
+		ret = _process_measurement(ns, file, cred, secid, buf, size, mask, func);
+		if (ret)
+			break;
+		user_ns = ns->user_ns->parent;
+		if (!user_ns)
+			break;
+		ns = user_ns->ima_ns;
+	} while (1);
+
+	return ret;
+}
+
 /**
  * ima_file_mmap - based on policy, collect/store measurement.
  * @file: pointer to the file to be measured (May be NULL)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC 11/20] securityfs: Prefix global variables with securityfs_
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (9 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 10/20] ima: Implement hierarchical processing of file accesses Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 12/20] securityfs: Pass static variables as parameters from top level functions Stefan Berger
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Rename the global variables 'mount' and 'mount_count' to carry a
securityfs_ prefix, and 'fs_type' to 'securityfs_type', so they are easier
to distinguish as variables belonging to securityfs rather than variables
being passed in through new APIs we will introduce.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/inode.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/security/inode.c b/security/inode.c
index 6c326939750d..e523829c22cb 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -22,8 +22,8 @@
 #include <linux/lsm_hooks.h>
 #include <linux/magic.h>
 
-static struct vfsmount *mount;
-static int mount_count;
+static struct vfsmount *securityfs_mount;
+static int securityfs_mount_count;
 
 static void securityfs_free_inode(struct inode *inode)
 {
@@ -66,7 +66,7 @@ static int securityfs_init_fs_context(struct fs_context *fc)
 	return 0;
 }
 
-static struct file_system_type fs_type = {
+static struct file_system_type securityfs_type = {
 	.owner =	THIS_MODULE,
 	.name =		"securityfs",
 	.init_fs_context = securityfs_init_fs_context,
@@ -118,12 +118,12 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 
 	pr_debug("securityfs: creating file '%s'\n",name);
 
-	error = simple_pin_fs(&fs_type, &mount, &mount_count);
+	error = simple_pin_fs(&securityfs_type, &securityfs_mount, &securityfs_mount_count);
 	if (error)
 		return ERR_PTR(error);
 
 	if (!parent)
-		parent = mount->mnt_root;
+		parent = securityfs_mount->mnt_root;
 
 	dir = d_inode(parent);
 
@@ -168,7 +168,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	dentry = ERR_PTR(error);
 out:
 	inode_unlock(dir);
-	simple_release_fs(&mount, &mount_count);
+	simple_release_fs(&securityfs_mount, &securityfs_mount_count);
 	return dentry;
 }
 
@@ -309,7 +309,7 @@ void securityfs_remove(struct dentry *dentry)
 		dput(dentry);
 	}
 	inode_unlock(dir);
-	simple_release_fs(&mount, &mount_count);
+	simple_release_fs(&securityfs_mount, &securityfs_mount_count);
 }
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
@@ -336,7 +336,7 @@ static int __init securityfs_init(void)
 	if (retval)
 		return retval;
 
-	retval = register_filesystem(&fs_type);
+	retval = register_filesystem(&securityfs_type);
 	if (retval) {
 		sysfs_remove_mount_point(kernel_kobj, "security");
 		return retval;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC 12/20] securityfs: Pass static variables as parameters from top level functions
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (10 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 11/20] securityfs: Prefix global variables with securityfs_ Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 13/20] securityfs: Build securityfs_ns for namespacing support Stefan Berger
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Pass the securityfs_-prefixed static variables as parameters from the
current top-level functions down to the internal helpers, so that new APIs
can let callers pass in their own equivalents and thus share most of the
existing functions.
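
The refactoring pattern can be sketched in userspace like this. All toy_*
names are hypothetical. The 'backing' argument loosely plays the role of
the fs_type parameter, in that it decides what gets set up on first use:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vfsmount. */
struct toy_mount { int id; };

/* Module-wide state, comparable to securityfs_mount and its count. */
static struct toy_mount toy_global_backing = { 1 };
static struct toy_mount *toy_global_mount;
static int toy_global_count;

/*
 * Shared helper: touches only the state handed in by the caller, so the
 * same code serves the global instance and any private instance.
 * Returns the new pin count.
 */
static int toy_pin(struct toy_mount **mount, int *count,
		   struct toy_mount *backing)
{
	assert(mount != NULL && count != NULL && backing != NULL);

	if (*mount == NULL)
		*mount = backing;	/* first user sets up the mount */
	return ++(*count);
}

/* Existing entry point: keeps its signature, forwards the globals. */
static int toy_pin_global(void)
{
	return toy_pin(&toy_global_mount, &toy_global_count,
		       &toy_global_backing);
}
```

Existing callers keep using toy_pin_global() unchanged, while a new caller
can hand its own mount pointer and counter to toy_pin().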

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/inode.c | 95 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 34 deletions(-)

diff --git a/security/inode.c b/security/inode.c
index e523829c22cb..429744ff4ab3 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -88,6 +88,11 @@ static struct file_system_type securityfs_type = {
  *        this file.
  * @iops: a point to a struct of inode_operations that should be used for
  *        this file/dir
+ * @mount: a pointer to a pointer for existing vfsmount to use or for
+ *         one to create
+ * @mount_count: pointer to integer for mount_count that goes along with
+ *               @mount
+ *
  *
  * This is the basic "create a file/dir/symlink" function for
  * securityfs.  It allows for a wide range of flexibility in creating
@@ -107,7 +112,9 @@ static struct file_system_type securityfs_type = {
 static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 					struct dentry *parent, void *data,
 					const struct file_operations *fops,
-					const struct inode_operations *iops)
+					const struct inode_operations *iops,
+					struct file_system_type *fs_type,
+					struct vfsmount **mount, int *mount_count)
 {
 	struct dentry *dentry;
 	struct inode *dir, *inode;
@@ -118,12 +125,12 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 
 	pr_debug("securityfs: creating file '%s'\n",name);
 
-	error = simple_pin_fs(&securityfs_type, &securityfs_mount, &securityfs_mount_count);
+	error = simple_pin_fs(fs_type, mount, mount_count);
 	if (error)
 		return ERR_PTR(error);
 
 	if (!parent)
-		parent = securityfs_mount->mnt_root;
+		parent = (*mount)->mnt_root;
 
 	dir = d_inode(parent);
 
@@ -168,7 +175,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	dentry = ERR_PTR(error);
 out:
 	inode_unlock(dir);
-	simple_release_fs(&securityfs_mount, &securityfs_mount_count);
+	simple_release_fs(mount, mount_count);
 	return dentry;
 }
 
@@ -201,7 +208,9 @@ struct dentry *securityfs_create_file(const char *name, umode_t mode,
 				      struct dentry *parent, void *data,
 				      const struct file_operations *fops)
 {
-	return securityfs_create_dentry(name, mode, parent, data, fops, NULL);
+	return securityfs_create_dentry(name, mode, parent, data, fops, NULL,
+					&securityfs_type, &securityfs_mount,
+					&securityfs_mount_count);
 }
 EXPORT_SYMBOL_GPL(securityfs_create_file);
 
@@ -231,6 +240,29 @@ struct dentry *securityfs_create_dir(const char *name, struct dentry *parent)
 }
 EXPORT_SYMBOL_GPL(securityfs_create_dir);
 
+struct dentry *_securityfs_create_symlink(const char *name,
+					  struct dentry *parent,
+					  const char *target,
+					  const struct inode_operations *iops,
+					  struct file_system_type *fs_type,
+					  struct vfsmount **mount, int *mount_count)
+{
+	struct dentry *dent;
+	char *link = NULL;
+
+	if (target) {
+		link = kstrdup(target, GFP_KERNEL);
+		if (!link)
+			return ERR_PTR(-ENOMEM);
+	}
+	dent = securityfs_create_dentry(name, S_IFLNK | 0444, parent,
+					link, NULL, iops, fs_type,
+					mount, mount_count);
+	if (IS_ERR(dent))
+		kfree(link);
+
+	return dent;
+}
 /**
  * securityfs_create_symlink - create a symlink in the securityfs filesystem
  *
@@ -262,37 +294,13 @@ struct dentry *securityfs_create_symlink(const char *name,
 					 const char *target,
 					 const struct inode_operations *iops)
 {
-	struct dentry *dent;
-	char *link = NULL;
-
-	if (target) {
-		link = kstrdup(target, GFP_KERNEL);
-		if (!link)
-			return ERR_PTR(-ENOMEM);
-	}
-	dent = securityfs_create_dentry(name, S_IFLNK | 0444, parent,
-					link, NULL, iops);
-	if (IS_ERR(dent))
-		kfree(link);
-
-	return dent;
+	return _securityfs_create_symlink(name, parent, target, iops,
+					  &securityfs_type, &securityfs_mount,
+					  &securityfs_mount_count);
 }
 EXPORT_SYMBOL_GPL(securityfs_create_symlink);
 
-/**
- * securityfs_remove - removes a file or directory from the securityfs filesystem
- *
- * @dentry: a pointer to a the dentry of the file or directory to be removed.
- *
- * This function removes a file or directory in securityfs that was previously
- * created with a call to another securityfs function (like
- * securityfs_create_file() or variants thereof.)
- *
- * This function is required to be called in order for the file to be
- * removed. No automatic cleanup of files will happen when a module is
- * removed; you are responsible here.
- */
-void securityfs_remove(struct dentry *dentry)
+void _securityfs_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
 {
 	struct inode *dir;
 
@@ -309,8 +317,27 @@ void securityfs_remove(struct dentry *dentry)
 		dput(dentry);
 	}
 	inode_unlock(dir);
-	simple_release_fs(&securityfs_mount, &securityfs_mount_count);
+	simple_release_fs(mount, mount_count);
+}
+
+/**
+ * securityfs_remove - removes a file or directory from the securityfs filesystem
+ *
+ * @dentry: a pointer to the dentry of the file or directory to be removed.
+ *
+ * This function removes a file or directory in securityfs that was previously
+ * created with a call to another securityfs function (like
+ * securityfs_create_file() or variants thereof.)
+ *
+ * This function is required to be called in order for the file to be
+ * removed. No automatic cleanup of files will happen when a module is
+ * removed; you are responsible here.
+ */
+void securityfs_remove(struct dentry *dentry)
+{
+	_securityfs_remove(dentry, &securityfs_mount, &securityfs_mount_count);
 }
+
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
 #ifdef CONFIG_SECURITY
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC 13/20] securityfs: Build securityfs_ns for namespacing support
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (11 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 12/20] securityfs: Pass static variables as parameters from top level functions Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-12-02 13:35   ` Christian Brauner
  2021-11-30 16:06 ` [RFC 14/20] ima: Move some IMA policy and filesystem related variables into ima_namespace Stefan Berger
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Implement 'securityfs_ns' for support of IMA namespacing so that each
IMA (user) namespace can have its own front-end for showing the currently
active policy, the measurement list, number of violations and so on. This
filesystem shares much of the existing code of SecurityFS but requires a
new API call securityfs_ns_create_mount() for creating a new instance.

The API calls of securityfs_ns have the prefix securityfs_ns_ and take the
additional parameters 'struct vfsmount **mount' and 'int *mount_count',
which allow for multiple instances of this filesystem to exist.

The filesystem can be mounted to the usual securityfs mount point like
this:

mount -t securityfs_ns /sys/kernel/security /sys/kernel/security

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/security.h   |  18 ++++
 include/uapi/linux/magic.h |   1 +
 security/inode.c           | 197 +++++++++++++++++++++++++++++++++++--
 3 files changed, 210 insertions(+), 6 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index 7e0ba63b5dde..8e479266f544 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1929,6 +1929,24 @@ struct dentry *securityfs_create_symlink(const char *name,
 					 const struct inode_operations *iops);
 extern void securityfs_remove(struct dentry *dentry);
 
+extern struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
+						struct dentry *parent, void *data,
+						const struct file_operations *fops,
+						const struct inode_operations *iops,
+						struct vfsmount **mount, int *mount_count);
+extern struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
+					       const struct inode_operations *iops,
+					       struct vfsmount **mount, int *mount_count);
+struct dentry *securityfs_ns_create_symlink(const char *name,
+					    struct dentry *parent,
+					    const char *target,
+					    const struct inode_operations *iops,
+					    struct vfsmount **mount, int *mount_count);
+extern void securityfs_ns_remove(struct dentry *dentry,
+				 struct vfsmount **mount, int *mount_count);
+struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns);
+extern struct vfsmount *securityfs_ns_mount;
+
 #else /* CONFIG_SECURITYFS */
 
 static inline struct dentry *securityfs_create_dir(const char *name,
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 35687dcb1a42..5c1cc6088dd2 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -11,6 +11,7 @@
 #define CRAMFS_MAGIC_WEND	0x453dcd28	/* magic number with the wrong endianess */
 #define DEBUGFS_MAGIC          0x64626720
 #define SECURITYFS_MAGIC	0x73636673
+#define SECURITYFS_NS_MAGIC	0x73334473
 #define SELINUX_MAGIC		0xf97cff8c
 #define SMACK_MAGIC		0x43415d53	/* "SMAC" */
 #define RAMFS_MAGIC		0x858458f6	/* some random number */
diff --git a/security/inode.c b/security/inode.c
index 429744ff4ab3..8077d1f31489 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -21,6 +21,7 @@
 #include <linux/security.h>
 #include <linux/lsm_hooks.h>
 #include <linux/magic.h>
+#include <linux/user_namespace.h>
 
 static struct vfsmount *securityfs_mount;
 static int securityfs_mount_count;
@@ -73,6 +74,61 @@ static struct file_system_type securityfs_type = {
 	.kill_sb =	kill_litter_super,
 };
 
+static int securityfs_ns_fill_super(struct super_block *sb, struct fs_context *fc)
+{
+	static const struct tree_descr files[] = {{""}};
+	int error;
+
+	error = simple_fill_super(sb, SECURITYFS_NS_MAGIC, files);
+	if (error)
+		return error;
+
+	sb->s_op = &securityfs_super_operations;
+
+	return 0;
+}
+
+static int securityfs_ns_get_tree(struct fs_context *fc)
+{
+	return get_tree_keyed(fc, securityfs_ns_fill_super, fc->user_ns);
+}
+
+static const struct fs_context_operations securityfs_ns_context_ops = {
+	.get_tree	= securityfs_ns_get_tree,
+};
+
+static int securityfs_ns_init_fs_context(struct fs_context *fc)
+{
+	fc->ops = &securityfs_ns_context_ops;
+	return 0;
+}
+
+static struct file_system_type securityfs_ns_type = {
+	.owner			= THIS_MODULE,
+	.name			= "securityfs_ns",
+	.init_fs_context	= securityfs_ns_init_fs_context,
+	.kill_sb		= kill_litter_super,
+	.fs_flags		= FS_USERNS_MOUNT,
+};
+
+struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns)
+{
+	struct fs_context *fc;
+	struct vfsmount *mnt;
+
+	fc = fs_context_for_mount(&securityfs_ns_type, SB_KERNMOUNT);
+	if (IS_ERR(fc))
+		return ERR_CAST(fc);
+
+	put_user_ns(fc->user_ns);
+	fc->user_ns = get_user_ns(user_ns);
+
+	mnt = fc_mount(fc);
+	put_fs_context(fc);
+	return mnt;
+}
+
+
 /**
  * securityfs_create_dentry - create a dentry in the securityfs filesystem
  *
@@ -155,8 +211,8 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
 	inode->i_private = data;
 	if (S_ISDIR(mode)) {
-		inode->i_op = &simple_dir_inode_operations;
-		inode->i_fop = &simple_dir_operations;
+		inode->i_op = iops ? iops : &simple_dir_inode_operations;
+		inode->i_fop = fops ? fops : &simple_dir_operations;
 		inc_nlink(inode);
 		inc_nlink(dir);
 	} else if (S_ISLNK(mode)) {
@@ -214,6 +270,41 @@ struct dentry *securityfs_create_file(const char *name, umode_t mode,
 }
 EXPORT_SYMBOL_GPL(securityfs_create_file);
 
+/**
+ * securityfs_ns_create_file - create a file in the securityfs_ns filesystem
+ *
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is %NULL, then the
+ *          file will be created in the root of the securityfs_ns filesystem.
+ * @data: a pointer to something that the caller will want to get to later
+ *        on.  The inode.i_private pointer will point to this value on
+ *        the open() call.
+ * @fops: a pointer to a struct file_operations that should be used for
+ *        this file.
+ * @mount: Pointer to a pointer to an existing vfsmount
+ * @mount_count: The mount_count that goes along with the @mount
+ *
+ * This function creates a file in securityfs_ns with the given @name.
+ *
+ * This function returns a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the securityfs_ns_remove() function when the file
+ * is to be removed (no automatic cleanup happens if your module is unloaded,
+ * you are responsible here).  If an error occurs, the function will return
+ * the error value (via ERR_PTR).
+ */
+struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
+					 struct dentry *parent, void *data,
+					 const struct file_operations *fops,
+					 const struct inode_operations *iops,
+					 struct vfsmount **mount, int *mount_count)
+{
+	return securityfs_create_dentry(name, mode, parent, data, fops, iops,
+					&securityfs_ns_type, mount, mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_ns_create_file);
+
 /**
  * securityfs_create_dir - create a directory in the securityfs filesystem
  *
@@ -240,6 +331,34 @@ struct dentry *securityfs_create_dir(const char *name, struct dentry *parent)
 }
 EXPORT_SYMBOL_GPL(securityfs_create_dir);
 
+/**
+ * securityfs_ns_create_dir - create a directory in the securityfs_ns filesystem
+ *
+ * @name: a pointer to a string containing the name of the directory to
+ *        create.
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is %NULL, then the
+ *          directory will be created in the root of the securityfs_ns filesystem.
+ * @mount: Pointer to a pointer to an existing vfsmount
+ * @mount_count: The mount_count that goes along with the @mount
+ *
+ * This function creates a directory in securityfs_ns with the given @name.
+ *
+ * This function returns a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the securityfs_ns_remove() function when the file
+ * is to be removed (no automatic cleanup happens if your module is unloaded,
+ * you are responsible here).  If an error occurs, the function will return
+ * the error value (via ERR_PTR).
+ */
+struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
+					const struct inode_operations *iops,
+					struct vfsmount **mount, int *mount_count)
+{
+	return securityfs_ns_create_file(name, S_IFDIR | 0755, parent, NULL, NULL,
+					 iops, mount, mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_ns_create_dir);
+
 struct dentry *_securityfs_create_symlink(const char *name,
 					  struct dentry *parent,
 					  const char *target,
@@ -263,6 +382,7 @@ struct dentry *_securityfs_create_symlink(const char *name,
 
 	return dent;
 }
+
 /**
  * securityfs_create_symlink - create a symlink in the securityfs filesystem
  *
@@ -300,6 +420,42 @@ struct dentry *securityfs_create_symlink(const char *name,
 }
 EXPORT_SYMBOL_GPL(securityfs_create_symlink);
 
+/**
+ * securityfs_ns_create_symlink - create a symlink in the securityfs_ns filesystem
+ *
+ * @name: a pointer to a string containing the name of the symlink to
+ *        create.
+ * @parent: a pointer to the parent dentry for the symlink.  This should be a
+ *          directory dentry if set.  If this parameter is %NULL, then the
+ *          symlink will be created in the root of the securityfs_ns filesystem.
+ * @target: a pointer to a string containing the name of the symlink's target.
+ *          If this parameter is %NULL, then the @iops parameter needs to be
+ *          setup to handle .readlink and .get_link inode_operations.
+ * @iops: a pointer to the struct inode_operations to use for the symlink. If
+ *        this parameter is %NULL, then the default simple_symlink_inode
+ *        operations will be used.
+ * @mount: Pointer to a pointer to an existing vfsmount
+ * @mount_count: The mount_count that goes along with the @mount
+ *
+ * This function creates a symlink in securityfs_ns with the given @name.
+ *
+ * This function returns a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the securityfs_ns_remove() function when the file
+ * is to be removed (no automatic cleanup happens if your module is unloaded,
+ * you are responsible here).  If an error occurs, the function will return
+ * the error value (via ERR_PTR).
+ */
+struct dentry *securityfs_ns_create_symlink(const char *name,
+					    struct dentry *parent,
+					    const char *target,
+					    const struct inode_operations *iops,
+					    struct vfsmount **mount, int *mount_count)
+{
+	return _securityfs_create_symlink(name, parent, target, iops,
+					  &securityfs_ns_type, mount, mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_ns_create_symlink);
+
 void _securityfs_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
 {
 	struct inode *dir;
@@ -340,6 +496,27 @@ void securityfs_remove(struct dentry *dentry)
 
 EXPORT_SYMBOL_GPL(securityfs_remove);
 
+/**
+ * securityfs_ns_remove - removes a file or directory from the securityfs_ns filesystem
+ *
+ * @dentry: a pointer to the dentry of the file or directory to be removed.
+ * @mount: Pointer to a pointer to an existing vfsmount
+ * @mount_count: The mount_count that goes along with the @mount
+ *
+ * This function removes a file or directory in securityfs_ns that was previously
+ * created with a call to another securityfs_ns function (like
+ * securityfs_ns_create_file() or variants thereof.)
+ *
+ * This function is required to be called in order for the file to be
+ * removed. No automatic cleanup of files will happen when a module is
+ * removed; you are responsible here.
+ */
+void securityfs_ns_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
+{
+	_securityfs_remove(dentry, mount, mount_count);
+}
+EXPORT_SYMBOL_GPL(securityfs_ns_remove);
+
 #ifdef CONFIG_SECURITY
 static struct dentry *lsm_dentry;
 static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
@@ -364,14 +541,22 @@ static int __init securityfs_init(void)
 		return retval;
 
 	retval = register_filesystem(&securityfs_type);
-	if (retval) {
-		sysfs_remove_mount_point(kernel_kobj, "security");
-		return retval;
-	}
+	if (retval)
+		goto remove_mount;
+	retval = register_filesystem(&securityfs_ns_type);
+	if (retval)
+		goto unregister_filesystem;
 #ifdef CONFIG_SECURITY
 	lsm_dentry = securityfs_create_file("lsm", 0444, NULL, NULL,
 						&lsm_ops);
 #endif
 	return 0;
+
+unregister_filesystem:
+	unregister_filesystem(&securityfs_type);
+remove_mount:
+	sysfs_remove_mount_point(kernel_kobj, "security");
+
+	return retval;
 }
 core_initcall(securityfs_init);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC 14/20] ima: Move some IMA policy and filesystem related variables into ima_namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (12 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 13/20] securityfs: Build securityfs_ns for namespacing support Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN Stefan Berger
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Move the ima_write_mutex, ima_fs_flags, and valid_policy variables into
ima_namespace. This way each IMA namespace can set those variables
independently.
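
To illustrate why per-namespace state matters, here is a userspace sketch
with hypothetical toy_* names. A plain bool stands in for the IMA_FS_BUSY
bit, so unlike the kernel's test_and_set_bit() this sketch is not atomic:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields this patch moves into the ns. */
struct toy_ima_ns {
	bool busy;		/* stands in for the IMA_FS_BUSY bit */
	int valid_policy;	/* 1 until a policy write fails */
};

/* Mirrors the defaults set up in ima_init_namespace(). */
static void toy_ns_init(struct toy_ima_ns *ns)
{
	assert(ns != NULL);
	ns->busy = false;
	ns->valid_policy = 1;
}

/* Only one writer per namespace may hold the policy file open. */
static int toy_open_policy(struct toy_ima_ns *ns)
{
	assert(ns != NULL);
	if (ns->busy)
		return -EBUSY;
	ns->busy = true;
	return 0;
}

/* Releasing the policy file clears the busy state of that namespace. */
static void toy_release_policy(struct toy_ima_ns *ns)
{
	assert(ns != NULL);
	ns->busy = false;
}
```

With a single global flag, a writer in one namespace would make every
other namespace see -EBUSY. With per-namespace state, each namespace
tracks its own writer.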

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                      |  5 ++++
 security/integrity/ima/ima_fs.c          | 35 +++++++++++-------------
 security/integrity/ima/ima_init_ima_ns.c |  4 +++
 3 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 850a513834d2..fe08919df326 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -262,6 +262,11 @@ struct ima_namespace {
 	struct ima_h_table ima_htable;
 	struct list_head ima_measurements;
 	unsigned long binary_runtime_size;
+
+	/* IMA's filesystem */
+	struct mutex ima_write_mutex;
+	unsigned long ima_fs_flags;
+	int valid_policy;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index c35e15fb313f..6c86f81c9998 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -25,8 +25,6 @@
 
 #include "ima.h"
 
-static DEFINE_MUTEX(ima_write_mutex);
-
 bool ima_canonical_fmt;
 static int __init default_canonical_fmt_setup(char *str)
 {
@@ -37,8 +35,6 @@ static int __init default_canonical_fmt_setup(char *str)
 }
 __setup("ima_canonical_fmt", default_canonical_fmt_setup);
 
-static int valid_policy = 1;
-
 static ssize_t ima_show_htable_value(char __user *buf, size_t count,
 				     loff_t *ppos, atomic_long_t *val)
 {
@@ -320,6 +316,7 @@ static ssize_t ima_read_policy(char *path)
 static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 				size_t datalen, loff_t *ppos)
 {
+	struct ima_namespace *ns = get_current_ns();
 	char *data;
 	ssize_t result;
 
@@ -337,7 +334,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 		goto out;
 	}
 
-	result = mutex_lock_interruptible(&ima_write_mutex);
+	result = mutex_lock_interruptible(&ns->ima_write_mutex);
 	if (result < 0)
 		goto out_free;
 
@@ -350,14 +347,14 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
 				    1, 0);
 		result = -EACCES;
 	} else {
-		result = ima_parse_add_rule(get_current_ns(), data);
+		result = ima_parse_add_rule(ns, data);
 	}
-	mutex_unlock(&ima_write_mutex);
+	mutex_unlock(&ns->ima_write_mutex);
 out_free:
 	kfree(data);
 out:
 	if (result < 0)
-		valid_policy = 0;
+		ns->valid_policy = 0;
 
 	return result;
 }
@@ -374,8 +371,6 @@ enum ima_fs_flags {
 	IMA_FS_BUSY,
 };
 
-static unsigned long ima_fs_flags;
-
 #ifdef	CONFIG_IMA_READ_POLICY
 static const struct seq_operations ima_policy_seqops = {
 		.start = ima_policy_start,
@@ -390,6 +385,8 @@ static const struct seq_operations ima_policy_seqops = {
  */
 static int ima_open_policy(struct inode *inode, struct file *filp)
 {
+	struct ima_namespace *ns = get_current_ns();
+
 	if (!(filp->f_flags & O_WRONLY)) {
 #ifndef	CONFIG_IMA_READ_POLICY
 		return -EACCES;
@@ -401,7 +398,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 		return seq_open(filp, &ima_policy_seqops);
 #endif
 	}
-	if (test_and_set_bit(IMA_FS_BUSY, &ima_fs_flags))
+	if (test_and_set_bit(IMA_FS_BUSY, &ns->ima_fs_flags))
 		return -EBUSY;
 	return 0;
 }
@@ -415,25 +412,25 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
  */
 static int ima_release_policy(struct inode *inode, struct file *file)
 {
-	const char *cause = valid_policy ? "completed" : "failed";
 	struct ima_namespace *ns = get_current_ns();
+	const char *cause = ns->valid_policy ? "completed" : "failed";
 
 	if ((file->f_flags & O_ACCMODE) == O_RDONLY)
 		return seq_release(inode, file);
 
-	if (valid_policy && ima_check_policy(ns) < 0) {
+	if (ns->valid_policy && ima_check_policy(ns) < 0) {
 		cause = "failed";
-		valid_policy = 0;
+		ns->valid_policy = 0;
 	}
 
 	pr_info("policy update %s\n", cause);
 	integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
-			    "policy_update", cause, !valid_policy, 0);
+			    "policy_update", cause, !ns->valid_policy, 0);
 
-	if (!valid_policy) {
+	if (!ns->valid_policy) {
 		ima_delete_rules(ns);
-		valid_policy = 1;
-		clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+		ns->valid_policy = 1;
+		clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
 		return 0;
 	}
 
@@ -442,7 +439,7 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 	securityfs_remove(ima_policy);
 	ima_policy = NULL;
 #elif defined(CONFIG_IMA_WRITE_POLICY)
-	clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+	clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
 #elif defined(CONFIG_IMA_READ_POLICY)
 	inode->i_mode &= ~S_IWUSR;
 #endif
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 57e46a10c001..22ff74e85a5f 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -49,6 +49,10 @@ int ima_init_namespace(struct ima_namespace *ns)
 	else
 		ns->binary_runtime_size = ULONG_MAX;
 
+	mutex_init(&ns->ima_write_mutex);
+	ns->valid_policy = 1;
+	ns->ima_fs_flags = 0;
+
 	return 0;
 }
 
-- 
2.31.1



* [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (13 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 14/20] ima: Move some IMA policy and filesystem related variables into ima_namespace Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 17:27   ` Casey Schaufler
  2021-11-30 16:06 ` [RFC 16/20] ima: Use ns_capable() for namespace policy access Stefan Berger
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin, Stefan Berger

From: Denis Semakin <denis.semakin@huawei.com>

This patch introduces CAP_INTEGRITY_ADMIN, a new capability that allows
non-root users to set up IMA (Integrity Measurement Architecture) policies
per container.

The main purpose of this new capability is described in this document:
https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
It states: "setting the policy should be possibly without the powerful
CAP_SYS_ADMIN and there should be the opportunity to gate this with a new
capability CAP_INTEGRITY_ADMIN that allows a user to set the IMA policy
during container runtime.."

In other words, it should be possible to set up IMA policies without
giving the user too many privileges, which is why CAP_INTEGRITY_ADMIN is
split off from CAP_SYS_ADMIN.
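
The helper this patch adds, integrity_admin_ns_capable(), accepts either
capability. As a rough userspace illustration of that "either one is
sufficient" rule (not kernel code): struct model_caps and the model_*
functions are hypothetical stand-ins for the kernel's credential handling
and ns_capable(), and 41 is only the value proposed by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CAP_SYS_ADMIN is 21 in the UAPI header; 41 is the value this patch
 * proposes for CAP_INTEGRITY_ADMIN. */
#define CAP_SYS_ADMIN        21
#define CAP_INTEGRITY_ADMIN  41

/* Hypothetical stand-in for a task's effective capability set within one
 * user namespace. The kernel uses kernel_cap_t and ns_capable() instead. */
struct model_caps {
	uint64_t effective;
};

static bool model_has_cap(const struct model_caps *c, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (c->effective >> cap) & 1;
}

/* Mirrors the logic of integrity_admin_ns_capable(): holding either
 * capability is sufficient. */
static bool model_integrity_admin_capable(const struct model_caps *c)
{
	return model_has_cap(c, CAP_INTEGRITY_ADMIN) ||
	       model_has_cap(c, CAP_SYS_ADMIN);
}
```

A task holding only CAP_INTEGRITY_ADMIN passes this check, which is the
point of the split: it does not need the much broader CAP_SYS_ADMIN.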

Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/capability.h          | 6 ++++++
 include/uapi/linux/capability.h     | 7 ++++++-
 security/selinux/include/classmap.h | 4 ++--
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 65efb74c3585..ea6d58acb95e 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -278,4 +278,10 @@ int get_vfs_caps_from_disk(struct user_namespace *mnt_userns,
 int cap_convert_nscap(struct user_namespace *mnt_userns, struct dentry *dentry,
 		      const void **ivalue, size_t size);
 
+static inline bool integrity_admin_ns_capable(struct user_namespace *ns)
+{
+	return ns_capable(ns, CAP_INTEGRITY_ADMIN) ||
+		ns_capable(ns, CAP_SYS_ADMIN);
+}
+
 #endif /* !_LINUX_CAPABILITY_H */
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 463d1ba2232a..48b08e4b3895 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -417,7 +417,12 @@ struct vfs_ns_cap_data {
 
 #define CAP_CHECKPOINT_RESTORE	40
 
-#define CAP_LAST_CAP         CAP_CHECKPOINT_RESTORE
+/* Allow setting up the IMA policy per container independently. */
+/* It is not necessary to be superuser. */
+
+#define CAP_INTEGRITY_ADMIN	41
+
+#define CAP_LAST_CAP		CAP_INTEGRITY_ADMIN
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 35aac62a662e..7ff532b90f09 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -28,9 +28,9 @@
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
 		"wake_alarm", "block_suspend", "audit_read", "perfmon", "bpf", \
-		"checkpoint_restore"
+		"checkpoint_restore", "integrity_admin"
 
-#if CAP_LAST_CAP > CAP_CHECKPOINT_RESTORE
+#if CAP_LAST_CAP > CAP_INTEGRITY_ADMIN
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
 
-- 
2.31.1



* [RFC 16/20] ima: Use ns_capable() for namespace policy access
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (14 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability Stefan Berger
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Replace capable() with ns_capable() when checking CAP_SYS_ADMIN for access
to the IMA policy, so that the check is made against the user namespace
that the IMA namespace belongs to.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 security/integrity/ima/ima_fs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 6c86f81c9998..fd2798f2d224 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 #else
 		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
 			return -EACCES;
-		if (!capable(CAP_SYS_ADMIN))
+		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
 			return -EPERM;
 		return seq_open(filp, &ima_policy_seqops);
 #endif
-- 
2.31.1



* [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (15 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 16/20] ima: Use ns_capable() for namespace policy access Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-12-01 16:58   ` James Bottomley
  2021-11-30 16:06 ` [RFC 18/20] userns: Introduce a refcount variable for calling early teardown function Stefan Berger
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin

From: Denis Semakin <denis.semakin@huawei.com>

Use integrity_admin_ns_capable() for the capability check to allow
reading/writing the IMA policy with CAP_INTEGRITY_ADMIN instead of
requiring CAP_SYS_ADMIN.

Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
---
 security/integrity/ima/ima_fs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index fd2798f2d224..6766bb8262f2 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 #else
 		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
 			return -EACCES;
-		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
+		if (!integrity_admin_ns_capable(ns->user_ns))
 			return -EPERM;
 		return seq_open(filp, &ima_policy_seqops);
 #endif
-- 
2.31.1



* [RFC 18/20] userns: Introduce a refcount variable for calling early teardown function
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (16 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 19/20] ima/userns: Define early teardown function for IMA namespace Stefan Berger
  2021-11-30 16:06 ` [RFC 20/20] ima: Setup securityfs_ns " Stefan Berger
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Extend the user_namespace structure with a refcount_teardown variable that
triggers the invocation of an early teardown function. This allows the IMA
namespace to initialize a filesystem that holds one additional reference
to the user namespace it 'belongs' to. The refcount_teardown variable
will be incremented by '1' once that additional reference has been
created. Once the user namespace's reference counter is decremented to
'1', the early teardown function is invoked and releases the additional
user namespace reference, so that the actual deletion of the user
namespace can then proceed as usual.
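
The intended lifecycle can be sketched with a small userspace model. This
is only an illustration under assumptions, not the kernel implementation:
struct model_userns and the model_* functions are hypothetical, a plain
counter stands in for refcount_t, and resetting refcount_teardown inside
the teardown is a simplification of this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; the kernel uses refcount_t and
 * __put_user_ns() instead of these fields. */
struct model_userns {
	unsigned int count;             /* models ns->ns.count */
	unsigned int refcount_teardown; /* 0 means no early teardown */
	bool fs_holds_ref;              /* extra reference held by the filesystem */
	bool freed;                     /* set once the last reference is dropped */
};

static void model_put(struct model_userns *ns);

/* Models the early teardown: release the reference the filesystem holds. */
static void model_early_teardown(struct model_userns *ns)
{
	if (!ns->fs_holds_ref)
		return;
	ns->fs_holds_ref = false;
	ns->refcount_teardown = 0; /* model-only: avoid triggering again */
	model_put(ns);
}

/* Models put_user_ns(): free on zero, otherwise check the trigger value. */
static void model_put(struct model_userns *ns)
{
	assert(ns->count > 0);
	if (--ns->count == 0)
		ns->freed = true;
	else if (ns->count == ns->refcount_teardown)
		model_early_teardown(ns);
}

/* Models filesystem setup taking one more reference to the namespace. */
static void model_fs_setup(struct model_userns *ns)
{
	ns->count++;
	ns->fs_holds_ref = true;
	ns->refcount_teardown += 1;
}
```

In this model, once all regular users have dropped their references and
only the filesystem's reference remains, the counter equals
refcount_teardown, the extra reference is released, and the namespace is
freed. With refcount_teardown left at 0 the trigger never fires.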

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/user_namespace.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 5249db04d62b..505e3b3748b6 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -103,6 +103,11 @@ struct user_namespace {
 #ifdef CONFIG_IMA
 	struct ima_namespace	*ima_ns;
 #endif
+	/* The refcount at which to start tearing down dependent namespaces
+	 * (currently only IMA) that may hold additional references to the
+	 * user namespace.
+	 */
+	unsigned int            refcount_teardown;
 } __randomize_layout;
 
 struct ucounts {
@@ -156,8 +161,12 @@ extern void __put_user_ns(struct user_namespace *ns);
 
 static inline void put_user_ns(struct user_namespace *ns)
 {
-	if (ns && refcount_dec_and_test(&ns->ns.count))
-		__put_user_ns(ns);
+	if (ns) {
+		if (refcount_dec_and_test(&ns->ns.count))
+			__put_user_ns(ns);
+		else if (refcount_read(&ns->ns.count) == ns->refcount_teardown)
+			;
+	}
 }
 
 struct seq_operations;
-- 
2.31.1



* [RFC 19/20] ima/userns: Define early teardown function for IMA namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (17 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 18/20] userns: Introduce a refcount variable for calling early teardown function Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-11-30 16:06 ` [RFC 20/20] ima: Setup securityfs_ns " Stefan Berger
  19 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Define an early teardown function ima_ns_userns_early_teardown() that
will be needed for early teardown of the securityfs_ns of an IMA
namespace since this holds one additional reference to the user namespace.

This function is not called yet since the refcount_teardown variable at
this point is always 0.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/user_namespace.h  | 8 ++++++--
 security/integrity/ima/ima_ns.c | 6 ++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 505e3b3748b6..6bc178d4c6e2 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -158,14 +158,18 @@ static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 extern int create_user_ns(struct cred *new);
 extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
 extern void __put_user_ns(struct user_namespace *ns);
+extern void ima_ns_userns_early_teardown(struct ima_namespace *ima_ns);
 
 static inline void put_user_ns(struct user_namespace *ns)
 {
 	if (ns) {
 		if (refcount_dec_and_test(&ns->ns.count))
 			__put_user_ns(ns);
-		else if (refcount_read(&ns->ns.count) == ns->refcount_teardown)
-			;
+		else if (refcount_read(&ns->ns.count) == ns->refcount_teardown) {
+#ifdef CONFIG_IMA_NS
+			ima_ns_userns_early_teardown(ns->ima_ns);
+#endif
+		}
 	}
 }
 
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index e4f4cf84a6b5..e7ad52b79f99 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -16,6 +16,7 @@
 #include <linux/mount.h>
 #include <linux/proc_ns.h>
 #include <linux/lsm_hooks.h>
+#include <linux/user_namespace.h>
 
 #include "ima.h"
 
@@ -64,6 +65,11 @@ struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
 	return create_ima_ns(user_ns);
 }
 
+void ima_ns_userns_early_teardown(struct ima_namespace *ns)
+{
+}
+EXPORT_SYMBOL(ima_ns_userns_early_teardown);
+
 static void destroy_ima_ns(struct ima_namespace *ns)
 {
 	pr_debug("DESTROY ima_ns: 0x%p\n", ns);
-- 
2.31.1



* [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
                   ` (18 preceding siblings ...)
  2021-11-30 16:06 ` [RFC 19/20] ima/userns: Define early teardown function for IMA namespace Stefan Berger
@ 2021-11-30 16:06 ` Stefan Berger
  2021-12-01 17:56   ` James Bottomley
  2021-12-02 13:18   ` Christian Brauner
  19 siblings, 2 replies; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 16:06 UTC (permalink / raw)
  To: linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Stefan Berger

Set up securityfs_ns with symlinks, directories, and files for IMA
namespacing support. The same directory structure that IMA uses on the
host is also created for the namespacing case.

Increment the user namespace's refcount_teardown value by '1' once
securityfs_ns has been successfully set up since the initialization of the
filesystem causes an additional reference to the user namespace to be
taken. The early teardown function will delete the filesystem and release
the additional reference.

The securityfs_ns file and directory ownerships cannot be set when the
filesystem is set up since at this point the user namespace has not been
configured by the user yet and therefore the ownership mappings are not
available. Therefore, adjust the file and directory ownerships when an
inode's function for determining the permissions of a file or directory
is called.
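
The deferred ownership fixup can be pictured with the following userspace
sketch. It is a simplified model under assumptions rather than the kernel
code: struct model_ns, the model_* functions, and the "caller owns the
file" permission rule are hypothetical, and 65534 is used as the overflow
("nobody") id that files show before a mapping exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NFILES      3
#define MODEL_OVERFLOW_ID 65534u /* "nobody": owner shown without a mapping */

/* Hypothetical model of an IMA namespace and its securityfs_ns files. */
struct model_ns {
	unsigned int uid_map_extents;      /* 0 until the uid map is written */
	unsigned int mapped_root;          /* host id that namespace id 0 maps to */
	bool fixes_done;                   /* ownership is fixed up only once */
	unsigned int owner[MODEL_NFILES];  /* current owner of each file */
};

/* Models ima_fs_ns_fixup_uid_gid(): do nothing until a mapping exists,
 * then assign all files to the namespace's root exactly once. */
static void model_fixup(struct model_ns *ns)
{
	size_t i;

	if (ns->fixes_done || ns->uid_map_extents == 0)
		return;

	ns->fixes_done = true;
	for (i = 0; i < MODEL_NFILES; i++)
		ns->owner[i] = ns->mapped_root;
}

/* Models the permission hook: fix up ownership first, then run a
 * (heavily simplified) owner-only permission check. */
static bool model_permission(struct model_ns *ns, size_t file,
			     unsigned int caller)
{
	assert(file < MODEL_NFILES);
	model_fixup(ns);
	return ns->owner[file] == caller;
}
```

Hooking the fixup into the permission check means no separate
notification is needed when the uid map is written: the first access
after the mapping exists corrects the ownership.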

This filesystem can now be mounted as follows:

mount -t securityfs_ns /sys/kernel/security/ /sys/kernel/security/

The following directories, symlinks, and files are then available.

$ ls -l sys/kernel/security/
total 0
lr--r--r--. 1 nobody nobody 0 Nov 27 06:44 ima -> integrity/ima
drwxr-xr-x. 3 nobody nobody 0 Nov 27 06:44 integrity

$ ls -l sys/kernel/security/ima/
total 0
-r--r-----. 1 root root 0 Nov 27 06:44 ascii_runtime_measurements
-r--r-----. 1 root root 0 Nov 27 06:44 binary_runtime_measurements
-rw-------. 1 root root 0 Nov 27 06:44 policy
-r--r-----. 1 root root 0 Nov 27 06:44 runtime_measurements_count
-r--r-----. 1 root root 0 Nov 27 06:44 violations

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 include/linux/ima.h                      |  17 +++
 security/integrity/ima/ima.h             |   2 +
 security/integrity/ima/ima_fs.c          | 178 ++++++++++++++++++++++-
 security/integrity/ima/ima_init_ima_ns.c |   6 +-
 security/integrity/ima/ima_ns.c          |   4 +-
 5 files changed, 203 insertions(+), 4 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index fe08919df326..a2c5e516f706 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -221,6 +221,18 @@ struct ima_h_table {
 	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
 };
 
+enum {
+	IMAFS_DENTRY_INTEGRITY_DIR = 0,
+	IMAFS_DENTRY_DIR,
+	IMAFS_DENTRY_SYMLINK,
+	IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS,
+	IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS,
+	IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT,
+	IMAFS_DENTRY_VIOLATIONS,
+	IMAFS_DENTRY_IMA_POLICY,
+	IMAFS_DENTRY_LAST
+};
+
 struct ima_namespace {
 	struct kref kref;
 	struct user_namespace *user_ns;
@@ -267,6 +279,11 @@ struct ima_namespace {
 	struct mutex ima_write_mutex;
 	unsigned long ima_fs_flags;
 	int valid_policy;
+
+	struct dentry *dentry[IMAFS_DENTRY_LAST];
+	struct vfsmount *mount;
+	int mount_count;
+	bool file_ownership_fixes_done;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index bb9763cd5fb1..9bcd71bb716c 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -139,6 +139,8 @@ struct ns_status {
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
+int ima_fs_ns_init(struct ima_namespace *ns);
+void ima_fs_ns_free(struct ima_namespace *ns);
 int ima_add_template_entry(struct ima_namespace *ns,
 			   struct ima_template_entry *entry, int violation,
 			   const char *op, struct inode *inode,
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 6766bb8262f2..9a14be520268 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -22,6 +22,7 @@
 #include <linux/parser.h>
 #include <linux/vmalloc.h>
 #include <linux/ima.h>
+#include <linux/namei.h>
 
 #include "ima.h"
 
@@ -436,8 +437,13 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 
 	ima_update_policy(ns);
 #if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
-	securityfs_remove(ima_policy);
-	ima_policy = NULL;
+	if (ns == &init_ima_ns) {
+		securityfs_remove(ima_policy);
+		ima_policy = NULL;
+	} else {
+		securityfs_ns_remove(ns->dentry[IMAFS_DENTRY_IMA_POLICY], &ns->mount, &ns->mount_count);
+		ns->dentry[IMAFS_DENTRY_IMA_POLICY] = NULL;
+	}
 #elif defined(CONFIG_IMA_WRITE_POLICY)
 	clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
 #elif defined(CONFIG_IMA_READ_POLICY)
@@ -509,3 +515,171 @@ int __init ima_fs_init(void)
 	securityfs_remove(ima_policy);
 	return -1;
 }
+
+/*
+ * Fix the ownership (uid/gid) of the dentries that couldn't be set at the
+ * time of their creation because the user namespace wasn't configured yet.
+ */
+static void ima_fs_ns_fixup_uid_gid(struct ima_namespace *ns)
+{
+	struct inode *inode;
+	size_t i;
+
+	if (ns->file_ownership_fixes_done ||
+	    ns->user_ns->uid_map.nr_extents == 0)
+		return;
+
+	ns->file_ownership_fixes_done = true;
+	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
+		if (!ns->dentry[i])
+			continue;
+		inode = ns->dentry[i]->d_inode;
+		inode->i_uid = make_kuid(ns->user_ns, 0);
+		inode->i_gid = make_kgid(ns->user_ns, 0);
+	}
+}
+
+/* Fix up the ownership when the permissions of an inode are checked */
+int ima_fs_ns_permission(struct user_namespace *mnt_userns, struct inode *inode,
+			 int mask)
+{
+	ima_fs_ns_fixup_uid_gid(get_current_ns());
+	return generic_permission(mnt_userns, inode, mask);
+}
+
+const struct inode_operations ima_fs_ns_inode_operations = {
+	.lookup		= simple_lookup,
+	.permission	= ima_fs_ns_permission,
+};
+
+int ima_fs_ns_init(struct ima_namespace *ns)
+{
+	struct dentry *parent;
+
+	ns->mount = securityfs_ns_create_mount(ns->user_ns);
+	if (IS_ERR(ns->mount)) {
+		ns->mount = NULL;
+		return -1;
+	}
+	ns->mount_count += 1;
+
+	ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR] =
+	    securityfs_ns_create_dir("integrity", NULL,
+				     &ima_fs_ns_inode_operations,
+				     &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR])) {
+		ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_DIR] =
+	    securityfs_ns_create_dir("ima", ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR],
+				     &ima_fs_ns_inode_operations,
+				     &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_DIR])) {
+		ns->dentry[IMAFS_DENTRY_DIR] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_SYMLINK] =
+	    securityfs_ns_create_symlink("ima", NULL, "integrity/ima", NULL,
+				     &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_SYMLINK])) {
+		ns->dentry[IMAFS_DENTRY_SYMLINK] = NULL;
+		goto out;
+	}
+
+	parent = ns->dentry[IMAFS_DENTRY_DIR];
+	ns->dentry[IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS] =
+	    securityfs_ns_create_file("binary_runtime_measurements",
+				   S_IRUSR | S_IRGRP, parent, NULL,
+				   &ima_measurements_ops,
+				   &ima_fs_ns_inode_operations,
+				   &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS])) {
+		ns->dentry[IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS] =
+	    securityfs_ns_create_file("ascii_runtime_measurements",
+				   S_IRUSR | S_IRGRP, parent, NULL,
+				   &ima_ascii_measurements_ops,
+				   &ima_fs_ns_inode_operations,
+				   &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS])) {
+		ns->dentry[IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT] =
+	    securityfs_ns_create_file("runtime_measurements_count",
+				   S_IRUSR | S_IRGRP, parent, NULL,
+				   &ima_measurements_count_ops,
+				   &ima_fs_ns_inode_operations,
+				   &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT])) {
+		ns->dentry[IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_VIOLATIONS] =
+	    securityfs_ns_create_file("violations", S_IRUSR | S_IRGRP,
+				   parent, NULL, &ima_htable_violations_ops,
+				   &ima_fs_ns_inode_operations,
+				   &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_VIOLATIONS])) {
+		ns->dentry[IMAFS_DENTRY_VIOLATIONS] = NULL;
+		goto out;
+	}
+
+	ns->dentry[IMAFS_DENTRY_IMA_POLICY] =
+	    securityfs_ns_create_file("policy", POLICY_FILE_FLAGS,
+				   parent, NULL,
+				   &ima_measure_policy_ops,
+				   &ima_fs_ns_inode_operations,
+				   &ns->mount, &ns->mount_count);
+	if (IS_ERR(ns->dentry[IMAFS_DENTRY_IMA_POLICY])) {
+		ns->dentry[IMAFS_DENTRY_IMA_POLICY] = NULL;
+		goto out;
+	}
+
+	/* Adjust the trigger for user namespace's early teardown of dependent
+	 * namespaces. Due to the filesystem there's an additional reference
+	 * to the user namespace.
+	 */
+	ns->user_ns->refcount_teardown += 1;
+
+	return 0;
+
+out:
+	ima_fs_ns_free(ns);
+
+	return -1;
+}
+
+void ima_fs_ns_free(struct ima_namespace *ns)
+{
+	size_t i;
+
+	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
+		switch (i) {
+		case IMAFS_DENTRY_DIR:
+		case IMAFS_DENTRY_INTEGRITY_DIR:
+			/* files first */
+			continue;
+		}
+		securityfs_ns_remove(ns->dentry[i], &ns->mount, &ns->mount_count);
+		ns->dentry[i] = NULL;
+	}
+	securityfs_ns_remove(ns->dentry[IMAFS_DENTRY_DIR], &ns->mount, &ns->mount_count);
+	ns->dentry[IMAFS_DENTRY_DIR] = NULL;
+	securityfs_ns_remove(ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR], &ns->mount, &ns->mount_count);
+	ns->dentry[IMAFS_DENTRY_INTEGRITY_DIR] = NULL;
+
+	if (ns->mount) {
+		mntput(ns->mount);
+		ns->mount_count -= 1;
+	}
+	ns->mount = NULL;
+}
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 22ff74e85a5f..86a89502c0c5 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -20,6 +20,8 @@
 
 int ima_init_namespace(struct ima_namespace *ns)
 {
+	int rc = 0;
+
 	ns->ns_status_tree = RB_ROOT;
 	rwlock_init(&ns->ns_status_lock);
 	ns->ns_status_cache = KMEM_CACHE(ns_status, SLAB_PANIC);
@@ -52,8 +54,10 @@ int ima_init_namespace(struct ima_namespace *ns)
 	mutex_init(&ns->ima_write_mutex);
 	ns->valid_policy = 1;
 	ns->ima_fs_flags = 0;
+	if (ns != &init_ima_ns)
+		rc = ima_fs_ns_init(ns);
 
-	return 0;
+	return rc;
 }
 
 int __init ima_ns_init(void)
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index e7ad52b79f99..c687e840441a 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -67,7 +67,9 @@ struct ima_namespace *copy_ima_ns(struct ima_namespace *old_ns,
 
 void ima_ns_userns_early_teardown(struct ima_namespace *ns)
 {
-}
+	pr_debug("%s: ns=0x%lx\n", __func__, (unsigned long)ns);
+	ima_fs_ns_free(ns);
+}
 EXPORT_SYMBOL(ima_ns_userns_early_teardown);
 
 static void destroy_ima_ns(struct ima_namespace *ns)
-- 
2.31.1



* Re: [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN
  2021-11-30 16:06 ` [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN Stefan Berger
@ 2021-11-30 17:27   ` Casey Schaufler
  2021-11-30 17:41     ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: Casey Schaufler @ 2021-11-30 17:27 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin, Casey Schaufler

On 11/30/2021 8:06 AM, Stefan Berger wrote:
> From: Denis Semakin <denis.semakin@huawei.com>
>
> This patch introduces CAP_INTEGRITY_ADMIN, a new capability that allows
> to setup IMA (Integrity Measurement Architecture) policies per container
> for non-root users.

Why not use CAP_MAC_ADMIN? IMA is a mandatory policy. The scope
is system security administration. It seems to fit your needs.
I introduced CAP_MAC_ADMIN for Smack, and believe that IMA using
it would be completely appropriate.

>
> The main purpose of this new capability is discribed in this document:
> https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
> It is said: "setting the policy should be possibly without the powerful
> CAP_SYS_ADMIN and there should be the opportunity to gate this with a new
> capability CAP_INTEGRITY_ADMIN that allows a user to set the IMA policy
> during container runtime.."
>
> In other words it should be possible to setup IMA policies while not
> giving too many privilges to the user, therefore splitting the
> CAP_INTEGRITY_ADMIN off from CAP_SYS_ADMIN.
>
> Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>   include/linux/capability.h          | 6 ++++++
>   include/uapi/linux/capability.h     | 7 ++++++-
>   security/selinux/include/classmap.h | 4 ++--
>   3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 65efb74c3585..ea6d58acb95e 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -278,4 +278,10 @@ int get_vfs_caps_from_disk(struct user_namespace *mnt_userns,
>   int cap_convert_nscap(struct user_namespace *mnt_userns, struct dentry *dentry,
>   		      const void **ivalue, size_t size);
>   
> +static inline bool integrity_admin_ns_capable(struct user_namespace *ns)
> +{
> +	return ns_capable(ns, CAP_INTEGRITY_ADMIN) ||
> +		ns_capable(ns, CAP_SYS_ADMIN);
> +}
> +
>   #endif /* !_LINUX_CAPABILITY_H */
> diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
> index 463d1ba2232a..48b08e4b3895 100644
> --- a/include/uapi/linux/capability.h
> +++ b/include/uapi/linux/capability.h
> @@ -417,7 +417,12 @@ struct vfs_ns_cap_data {
>   
>   #define CAP_CHECKPOINT_RESTORE	40
>   
> -#define CAP_LAST_CAP         CAP_CHECKPOINT_RESTORE
> +/* Allow setup IMA policy per container independently */
> +/* No necessary to be superuser */
> +
> +#define CAP_INTEGRITY_ADMIN	41
> +
> +#define CAP_LAST_CAP		CAP_INTEGRITY_ADMIN
>   
>   #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>   
> diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
> index 35aac62a662e..7ff532b90f09 100644
> --- a/security/selinux/include/classmap.h
> +++ b/security/selinux/include/classmap.h
> @@ -28,9 +28,9 @@
>   
>   #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
>   		"wake_alarm", "block_suspend", "audit_read", "perfmon", "bpf", \
> -		"checkpoint_restore"
> +		"checkpoint_restore", "integrity_admin"
>   
> -#if CAP_LAST_CAP > CAP_CHECKPOINT_RESTORE
> +#if CAP_LAST_CAP > CAP_INTEGRITY_ADMIN
>   #error New capability defined, please update COMMON_CAP2_PERMS.
>   #endif
>   


* Re: [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN
  2021-11-30 17:27   ` Casey Schaufler
@ 2021-11-30 17:41     ` Stefan Berger
  2021-11-30 17:50       ` Casey Schaufler
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-11-30 17:41 UTC (permalink / raw)
  To: Casey Schaufler, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin


On 11/30/21 12:27, Casey Schaufler wrote:
> On 11/30/2021 8:06 AM, Stefan Berger wrote:
>> From: Denis Semakin <denis.semakin@huawei.com>
>>
>> This patch introduces CAP_INTEGRITY_ADMIN, a new capability that allows
>> to setup IMA (Integrity Measurement Architecture) policies per container
>> for non-root users.
>
> Why not use CAP_MAC_ADMIN? IMA is a mandatory policy. The scope
> is system security administration. It seems to fit your needs.
> I introduced CAP_MAC_ADMIN for Smack, and believe that IMA using
> it would be completely appropriate.

Fine by me. I suppose we could be reusing it later on also for setting 
file extended attributes for IMA?

    Stefan




* Re: [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN
  2021-11-30 17:41     ` Stefan Berger
@ 2021-11-30 17:50       ` Casey Schaufler
  0 siblings, 0 replies; 54+ messages in thread
From: Casey Schaufler @ 2021-11-30 17:50 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin, Casey Schaufler

On 11/30/2021 9:41 AM, Stefan Berger wrote:
>
> On 11/30/21 12:27, Casey Schaufler wrote:
>> On 11/30/2021 8:06 AM, Stefan Berger wrote:
>>> From: Denis Semakin <denis.semakin@huawei.com>
>>>
>>> This patch introduces CAP_INTEGRITY_ADMIN, a new capability that allows
>>> to setup IMA (Integrity Measurement Architecture) policies per container
>>> for non-root users.
>>
>> Why not use CAP_MAC_ADMIN? IMA is a mandatory policy. The scope
>> is system security administration. It seems to fit your needs.
>> I introduced CAP_MAC_ADMIN for Smack, and believe that IMA using
>> it would be completely appropriate.
>
> Fine by me. I suppose we could be reusing it later on also for setting file extended attributes for IMA?

Yes. That would be completely consistent with the intention and the
Smack implementation.

>
>    Stefan
>
>


* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-11-30 16:06 ` [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability Stefan Berger
@ 2021-12-01 16:58   ` James Bottomley
  2021-12-01 17:35     ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 16:58 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin

On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> From: Denis Semakin <denis.semakin@huawei.com>
> 
> Use integrity_admin_ns_capable() to check corresponding capability to
> allow read/write IMA policy without CAP_SYS_ADMIN but with
> CAP_INTEGRITY_ADMIN.
> 
> Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> ---
>  security/integrity/ima/ima_fs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/security/integrity/ima/ima_fs.c
> b/security/integrity/ima/ima_fs.c
> index fd2798f2d224..6766bb8262f2 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode,
> struct file *filp)
>  #else
>  		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
>  			return -EACCES;
> -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> +		if (!integrity_admin_ns_capable(ns->user_ns))

So this one basically replaces what you did in RFC 16/20, which
seems a little redundant.

The question I'd like to ask is: is there still a reason for needing
CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty much tied
to requiring a user (and a mount, because of securityfs_ns) namespace,
there might not be a pressing need for an admin capability separated
from CAP_SYS_ADMIN because the owner of the user namespace passes the
ns_capable(..., CAP_SYS_ADMIN) check.  The rationale in 

https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations

is effectively "because CAP_SYS_ADMIN is too powerful" but that's no
longer true of the user namespace owner.  It only passes the
ns_capable() check, not the capable() one, so while it does get
CAP_SYS_ADMIN, it can only use it in a few situations, which already
represents quite a power reduction.
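
To make the capable()/ns_capable() distinction concrete, here is a
minimal user-space sketch, loosely modelled on the walk the kernel does
in cap_capable().  All model_* names are hypothetical and exist only
for this illustration; real credentials carry far more state, and LSMs
can add further restrictions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CAP_SYS_ADMIN 21	/* same number as CAP_SYS_ADMIN */

/* Hypothetical, heavily reduced stand-in for struct user_namespace. */
struct model_user_ns {
	const struct model_user_ns *parent;	/* NULL for the initial ns */
	int level;				/* nesting depth, 0 = initial */
	uint32_t owner;				/* global uid of the creator */
};

/* Hypothetical, heavily reduced stand-in for struct cred. */
struct model_cred {
	const struct model_user_ns *user_ns;	/* ns the task lives in */
	uint32_t euid;				/* global effective uid */
	uint64_t cap_effective;			/* bitmask of raised caps */
};

/* Does @cred hold @cap with respect to @target? */
bool model_ns_capable(const struct model_cred *cred,
		      const struct model_user_ns *target, int cap)
{
	const struct model_user_ns *ns;

	for (ns = target; ns; ns = ns->parent) {
		/* Same namespace: the effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1;
		/* Target is not below the task's namespace: no access. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The owner of a child namespace has all caps in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;
	}
	return false;
}

/* capable() is the same check, always against the initial namespace. */
bool model_capable(const struct model_cred *cred,
		   const struct model_user_ns *init_ns, int cap)
{
	return model_ns_capable(cred, init_ns, cap);
}
```

In this model a task that is root inside a child user namespace passes
model_ns_capable() for that namespace but fails model_capable(), which
is the power reduction described above.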

James




* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-01 16:58   ` James Bottomley
@ 2021-12-01 17:35     ` Stefan Berger
  2021-12-01 19:29       ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-01 17:35 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin


On 12/1/21 11:58, James Bottomley wrote:
> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>> From: Denis Semakin <denis.semakin@huawei.com>
>>
>> Use integrity_admin_ns_capable() to check corresponding capability to
>> allow read/write IMA policy without CAP_SYS_ADMIN but with
>> CAP_INTEGRITY_ADMIN.
>>
>> Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
>> ---
>>   security/integrity/ima/ima_fs.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/security/integrity/ima/ima_fs.c
>> b/security/integrity/ima/ima_fs.c
>> index fd2798f2d224..6766bb8262f2 100644
>> --- a/security/integrity/ima/ima_fs.c
>> +++ b/security/integrity/ima/ima_fs.c
>> @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode,
>> struct file *filp)
>>   #else
>>   		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
>>   			return -EACCES;
>> -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
>> +		if (!integrity_admin_ns_capable(ns->user_ns))
> so this one is basically replacing what you did in RFC 16/20, which
> seems a little redundant.
>
> The question I'd like to ask is: is there still a reason for needing
> CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty much tied
> to requiring a user (and a mount, because of securityfs_ns) namespace,
> there might not be a pressing need for an admin capability separated
> from CAP_SYS_ADMIN because the owner of the user namespace passes the
> ns_capable(..., CAP_SYS_ADMIN) check.  The rationale in

Casey suggested using CAP_MAC_ADMIN, which I think would also work.

     CAP_MAC_ADMIN (since Linux 2.6.25)
               Allow MAC configuration or state changes. Implemented for
               the Smack Linux Security Module (LSM).


Down the road I think we should cover setting file extended attributes
with the same capability as well, for when a user signs files or installs
packages with file signatures.  A container runtime could hold
CAP_SYS_ADMIN while setting up a container and mounting filesystems and
drop it for the first process started there. Since we are using the user
namespace to spawn an IMA namespace, we would then require
CAP_SYS_ADMIN to be left available so that the user can do IMA-related
things in the container (set or append to the policy, write file
signatures). I am not sure whether that should be the case, or whether
we should rather give the user something finer grained, such as
CAP_MAC_ADMIN. So, it's about granularity...


>
> https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
>
> Is effectively "because CAP_SYS_ADMIN is too powerful" but that's no
> longer true of the user namespace owner.  It only passes the ns_capable
> () check not the capable() one, so while it does get CAP_SYS_ADMIN, it
> can only use it in a few situations which represent quite a power
> reduction already.

At least Docker containers drop CAP_SYS_ADMIN. I am not sure what the
decision was based on, but they probably don't want to give the user
anything that is not absolutely necessary. Using user namespaces (with
IMA namespaces) would more or less force it to be available for doing
IMA-related stuff ...

Following this man page here 
https://man7.org/linux/man-pages/man7/user_namespaces.7.html

CAP_SYS_ADMIN in a user namespace is about

- bind-mounting filesystems

- mounting /proc filesystems

- creating nested user namespaces

- configuring UTS namespace

- configuring whether setgroups() can be used

- usage of setns()


Do we want to add '- only way of *setting up* IMA related stuff' to this 
list?

   Stefan


>
> James
>
>


* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-11-30 16:06 ` [RFC 20/20] ima: Setup securityfs_ns " Stefan Berger
@ 2021-12-01 17:56   ` James Bottomley
  2021-12-01 18:11     ` Stefan Berger
  2021-12-02 13:18   ` Christian Brauner
  1 sibling, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 17:56 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
[...]
> +
> +/*
> + * Fix the ownership (uid/gid) of the dentry's that couldn't be set
> at the
> + * time of their creation because the user namespace wasn't
> configured, yet.
> + */
> +static void ima_fs_ns_fixup_uid_gid(struct ima_namespace *ns)
> +{
> +	struct inode *inode;
> +	size_t i;
> +
> +	if (ns->file_ownership_fixes_done ||
> +	    ns->user_ns->uid_map.nr_extents == 0)
> +		return;
> +
> +	ns->file_ownership_fixes_done = true;
> +	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
> +		if (!ns->dentry[i])
> +			continue;
> +		inode = ns->dentry[i]->d_inode;
> +		inode->i_uid = make_kuid(ns->user_ns, 0);
> +		inode->i_gid = make_kgid(ns->user_ns, 0);
> +	}
> +}
> +
> +/* Fix the permissions when a file is opened */
> +int ima_fs_ns_permission(struct user_namespace *mnt_userns, struct
> inode *inode,
> +			 int mask)
> +{
> +	ima_fs_ns_fixup_uid_gid(get_current_ns());
> +	return generic_permission(mnt_userns, inode, mask);
> +}
> +
> +const struct inode_operations ima_fs_ns_inode_operations = {
> +	.lookup		= simple_lookup,
> +	.permission	= ima_fs_ns_permission,
> +};
> +

In theory this uid/gid shifting should have already been done for you
and all of the above code should be unnecessary.  What is supposed to
happen is that the mount of securityfs_ns in the new user namespace
should pick up a superblock s_user_ns for that new user namespace.  Now
inode_alloc() uses i_uid_write(inode, 0) which maps back through the
s_user_ns to obtain the owner of the user namespace.

What can happen is that if you do the inode allocation before (or even
without) writing to the uid_map file, it maps back through an empty map
and ends up with -1 for i_uid ... is this what you're seeing?
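
As a rough illustration of that mapping step, the sketch below models
how an id inside a user namespace is translated through the uid_map
extents, and what is reported back when the stored id has no mapping.
The model_* names are hypothetical; the real code lives in
make_kuid()/from_kuid_munged(), and 65534 is assumed here as the
default overflow uid that shows up as 'nobody'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID  ((uint32_t)-1)	/* what "-1 for i_uid" means */
#define MODEL_OVERFLOW_ID 65534u		/* assumed default overflow uid */

/* One "first lower_first count" line of /proc/<pid>/uid_map. */
struct model_extent {
	uint32_t first;		/* first id inside the namespace */
	uint32_t lower_first;	/* first id in the parent (global) range */
	uint32_t count;		/* length of the range */
};

struct model_id_map {
	size_t nr_extents;
	struct model_extent extent[5];
};

/* Namespace id -> global id; MODEL_INVALID_ID if nothing maps it. */
uint32_t model_map_id_down(const struct model_id_map *map, uint32_t id)
{
	for (size_t i = 0; i < map->nr_extents; i++) {
		const struct model_extent *e = &map->extent[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return MODEL_INVALID_ID;
}

/* Global id -> namespace id; the overflow id if nothing maps it. */
uint32_t model_map_id_up_munged(const struct model_id_map *map, uint32_t kid)
{
	for (size_t i = 0; i < map->nr_extents; i++) {
		const struct model_extent *e = &map->extent[i];

		if (kid >= e->lower_first && kid - e->lower_first < e->count)
			return e->first + (kid - e->lower_first);
	}
	return MODEL_OVERFLOW_ID;
}
```

With the map "0 1000 1", id 0 translates to 1000 and reads back as 0
(root).  With an empty map it translates to the invalid id, and an
inode that stored this value keeps reading back as the overflow id
even after the map has been written.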

James




* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 17:56   ` James Bottomley
@ 2021-12-01 18:11     ` Stefan Berger
  2021-12-01 19:21       ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-01 18:11 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/1/21 12:56, James Bottomley wrote:
> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> [...]
>> +
>> +/*
>> + * Fix the ownership (uid/gid) of the dentry's that couldn't be set
>> at the
>> + * time of their creation because the user namespace wasn't
>> configured, yet.
>> + */
>> +static void ima_fs_ns_fixup_uid_gid(struct ima_namespace *ns)
>> +{
>> +	struct inode *inode;
>> +	size_t i;
>> +
>> +	if (ns->file_ownership_fixes_done ||
>> +	    ns->user_ns->uid_map.nr_extents == 0)
>> +		return;
>> +
>> +	ns->file_ownership_fixes_done = true;
>> +	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
>> +		if (!ns->dentry[i])
>> +			continue;
>> +		inode = ns->dentry[i]->d_inode;
>> +		inode->i_uid = make_kuid(ns->user_ns, 0);
>> +		inode->i_gid = make_kgid(ns->user_ns, 0);
>> +	}
>> +}
>> +
>> +/* Fix the permissions when a file is opened */
>> +int ima_fs_ns_permission(struct user_namespace *mnt_userns, struct
>> inode *inode,
>> +			 int mask)
>> +{
>> +	ima_fs_ns_fixup_uid_gid(get_current_ns());
>> +	return generic_permission(mnt_userns, inode, mask);
>> +}
>> +
>> +const struct inode_operations ima_fs_ns_inode_operations = {
>> +	.lookup		= simple_lookup,
>> +	.permission	= ima_fs_ns_permission,
>> +};
>> +
> In theory this uid/gid shifting should have already been done for you
> and all of the above code should be unnecessary.  What is supposed to
> happen is that the mount of securityfs_ns in the new user namespace
> should pick up a superblock s_user_ns for that new user namespace.  Now
> inode_alloc() uses i_uid_write(inode, 0) which maps back through the
> s_user_ns to obtain the owner of the user namespace.
>
> What can happen is that if you do the inode allocation before (or even
> without) writing to the uid_map file, it maps back through an empty map
> and ends up with -1 for i_uid ... is this what you're seeing?

I tried this with runc and an active user namespace mapping uid 1000 on
the host to uid 0 in the container. There I ran into the problem that,
without the above work-around, all of the files and directories are
mapped to 'nobody', just as all the files in sysfs are in this case.
This code resolved the issue.


sh-5.1# ls -l /sys/
total 0
drwxr-xr-x.   2 nobody nobody  0 Dec  1 18:06 block
drwxr-xr-x.  28 nobody nobody  0 Dec  1 18:06 bus
drwxr-xr-x.  54 nobody nobody  0 Dec  1 18:06 class
drwxr-xr-x.   4 nobody nobody  0 Dec  1 18:06 dev
drwxr-xr-x.  15 nobody nobody  0 Dec  1 18:06 devices
drwxrwxrwt.   2 root   root   40 Dec  1 18:06 firmware
drwxr-xr-x.   9 nobody nobody  0 Dec  1 18:06 fs
drwxr-xr-x.  16 nobody nobody  0 Dec  1 18:06 kernel
drwxr-xr-x. 161 nobody nobody  0 Dec  1 18:06 module
drwxr-xr-x.   3 nobody nobody  0 Dec  1 18:06 power

sh-5.1# ls -l /sys/kernel/security/
total 0
lr--r--r--. 1 nobody nobody 0 Dec  1 18:06 ima -> integrity/ima
drwxr-xr-x. 3 nobody nobody 0 Dec  1 18:06 integrity

sh-5.1# ls -l /sys/kernel/security/ima/
total 0
-r--r-----. 1 root root 0 Dec  1 18:06 ascii_runtime_measurements
-r--r-----. 1 root root 0 Dec  1 18:06 binary_runtime_measurements
-rw-------. 1 root root 0 Dec  1 18:06 policy
-r--r-----. 1 root root 0 Dec  1 18:06 runtime_measurements_count
-r--r-----. 1 root root 0 Dec  1 18:06 violations

The 'nobody' owners are obviously sufficient to cd into the directories,
but for file accesses I wanted to see root and no changes to permissions.

     Stefan

>
> James
>
>


* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 18:11     ` Stefan Berger
@ 2021-12-01 19:21       ` James Bottomley
  2021-12-01 20:25         ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 19:21 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
> On 12/1/21 12:56, James Bottomley wrote:
[...]
> I tried this with runc and a user namespace active mapping uid 1000
> on the host to uid 0 in the container. There I run into the problem
> that  all of the files and directories without the above work-around
> are mapped to 'nobody', just like all the files in sysfs in this case
> are also mapped to nobody. This code resolved the issue.

So I applied your patches with the permission shift commented out and
instrumented inode_alloc() to see where it might be failing and I
actually find it all works as expected for me:

ejb@testdeb:~> unshare -r --user --mount --ima
root@testdeb:~# mount -t securityfs_ns none /sys/kernel/security
root@testdeb:~# ls -l /sys/kernel/security/ima/
total 0
-r--r----- 1 root root 0 Dec  1 19:11 ascii_runtime_measurements
-r--r----- 1 root root 0 Dec  1 19:11 binary_runtime_measurements
-rw------- 1 root root 0 Dec  1 19:11 policy
-r--r----- 1 root root 0 Dec  1 19:11 runtime_measurements_count
-r--r----- 1 root root 0 Dec  1 19:11 violations

I think your problem is something to do with how runc is installing the
uid/gid mappings.  If it's installing them after the securityfs_ns inodes
are created then they get the -1 value (because no mappings exist in
s_user_ns).  I can even demonstrate this by forcing unshare to enter
the IMA namespace before writing the mapping values and I'll see
"nobody nogroup" above like you do.

I also see the instrumentation telling me that i_uid_write() is mapping
back to 1000 in the former case and -1 in the latter.

James






* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-01 17:35     ` Stefan Berger
@ 2021-12-01 19:29       ` James Bottomley
  2021-12-02  7:16         ` Denis Semakin
  2021-12-02 12:59         ` Christian Brauner
  0 siblings, 2 replies; 54+ messages in thread
From: James Bottomley @ 2021-12-01 19:29 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris, Denis Semakin

On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
> On 12/1/21 11:58, James Bottomley wrote:
> > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > From: Denis Semakin <denis.semakin@huawei.com>
> > > 
> > > Use integrity_admin_ns_capable() to check corresponding
> > > capability to allow read/write IMA policy without CAP_SYS_ADMIN
> > > but with CAP_INTEGRITY_ADMIN.
> > > 
> > > Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> > > ---
> > >   security/integrity/ima/ima_fs.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/security/integrity/ima/ima_fs.c
> > > b/security/integrity/ima/ima_fs.c
> > > index fd2798f2d224..6766bb8262f2 100644
> > > --- a/security/integrity/ima/ima_fs.c
> > > +++ b/security/integrity/ima/ima_fs.c
> > > @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode
> > > *inode,
> > > struct file *filp)
> > >   #else
> > >   		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
> > >   			return -EACCES;
> > > -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> > > +		if (!integrity_admin_ns_capable(ns->user_ns))
> > so this one is basically replacing what you did in RFC 16/20, which
> > seems a little redundant.
> > 
> > The question I'd like to ask is: is there still a reason for
> > needing CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty
> > much tied to requiring a user (and a mount, because of
> > securityfs_ns) namespace, there might not be a pressing need for an
> > admin capability separated from CAP_SYS_ADMIN because the owner of
> > the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
> > check.  The rationale in
> 
> Casey suggested using CAP_MAC_ADMIN, which I think would also work.
> 
>      CAP_MAC_ADMIN (since Linux 2.6.25)
>                Allow MAC configuration or state changes. Implemented
> for
>                the Smack Linux Security Module (LSM).
> 
> 
> Down the road I think we should cover setting file extended
> attributes with the same capability as well for when a user signs
> files or installs packages with file signatures.  A container runtime
> could hold CAP_SYS_ADMIN while setting up a container and mounting
> filesystems and drop it for the first process started there. Since we
> are using the user namespace to spawn an IMA namespace, we would then
> require CAP_SYS_ADMIN to be left available so that the user can do
> IMA related stuff in the container (set or append to the policy,
> write file signatures). I am not sure whether that should be the case
> or rather give the user something finer grained, such as
> CAP_MAC_ADMIN. So, it's about granularity...

It's possible ... any orchestration system that doesn't enter a user
namespace has to strictly regulate capabilities.   I'm probably biased
because I always use a user_ns so I never really had to mess with
capabilities.

> > https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
> > 
> > Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
> > no longer true of the user namespace owner.  It only passes the
> > ns_capable() check not the capable() one, so while it does get
> > CAP_SYS_ADMIN, it can only use it in a few situations which
> > represent quite a power reduction already.
> 
> At least docker containers drop CAP_SYS_ADMIN.

Well docker doesn't use the user_ns.  But even given that,
CAP_SYS_ADMIN is always dropped for most container systems.  What
happens when you enter a user namespace is the ns_capable( ...,
CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
in the same way it would for root.  So effectively entering a user
namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
responds only in the places in the kernel that have a ns_capable()
check instead of a capable() one (most of the places you list below). 
This is the principle of how unprivileged containers actually work ...
and the source of some of our security problems if you get back an
ability to do something you shouldn't be allowed to do as an
unprivileged user.

>  I am not sure what the decision was based on but probably they don't
> want to give the user what is not absolutely necessary, but usage of
> user namespaces (with IMA namespaces) would kind of force it to be
> available then to do IMA-related stuff ...
> 
> Following this man page here 
> https://man7.org/linux/man-pages/man7/user_namespaces.7.html
> 
> CAP_SYS_ADMIN in a user namespace is about
> 
> - bind-mounting filesystems
> 
> - mounting /proc filesystems
> 
> - creating nested user namespaces
> 
> - configuring UTS namespace
> 
> - configuring whether setgroups() can be used
> 
> - usage of setns()
> 
> 
> Do we want to add '- only way of *setting up* IMA related stuff' to
> this list?

I don't see why not, but other container people should weigh in
because, as I said, I mostly use the user namespace and unprivileged
containers and don't bother with capabilities.

James




* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 19:21       ` James Bottomley
@ 2021-12-01 20:25         ` Stefan Berger
  2021-12-01 21:11           ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-01 20:25 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/1/21 14:21, James Bottomley wrote:
> On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
>> On 12/1/21 12:56, James Bottomley wrote:
> [...]
>> I tried this with runc and a user namespace active mapping uid 1000
>> on the host to uid 0 in the container. There I run into the problem
>> that  all of the files and directories without the above work-around
>> are mapped to 'nobody', just like all the files in sysfs in this case
>> are also mapped to nobody. This code resolved the issue.
> So I applied your patches with the permission shift commented out and
> instrumented inode_alloc() to see where it might be failing and I
> actually find it all works as expected for me:
>
> ejb@testdeb:~> unshare -r --user --mount --ima
> root@testdeb:~# mount -t securityfs_ns none /sys/kernel/security
> root@testdeb:~# ls -l /sys/kernel/security/ima/
> total 0
> -r--r----- 1 root root 0 Dec  1 19:11 ascii_runtime_measurements
> -r--r----- 1 root root 0 Dec  1 19:11 binary_runtime_measurements
> -rw------- 1 root root 0 Dec  1 19:11 policy
> -r--r----- 1 root root 0 Dec  1 19:11 runtime_measurements_count
> -r--r----- 1 root root 0 Dec  1 19:11 violations
>
> I think your problem is something to do with how runc is installing the
> uid/gid mappings.  If it's installing them after the security_ns inodes
> are created then they get the -1 value (because no mappings exist in
> s_user_ns).  I can even demonstrate this by forcing unshare to enter
> the IMA namespace before writing the mapping values and I'll see
> "nobody nogroup" above like you do.

I am surprised you get this mapping even after commenting out the
permission adjustments... it doesn't work for me when I comment them out:

[stefanb@ima-ns-dev rootfs]$ unshare -r --user --mount
[root@ima-ns-dev rootfs]# mount -t securityfs_ns none /sys/kernel/security/
[root@ima-ns-dev rootfs]# cd /sys/kernel/security/ima/
[root@ima-ns-dev ima]# ls -l
total 0
-r--r-----. 1 nobody nobody 0 Dec  1 15:20 ascii_runtime_measurements
-r--r-----. 1 nobody nobody 0 Dec  1 15:20 binary_runtime_measurements
-rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
-r--r-----. 1 nobody nobody 0 Dec  1 15:20 runtime_measurements_count
-r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
[root@ima-ns-dev ima]# cat /proc/self/uid_map
          0       1000          1
[root@ima-ns-dev ima]# cat /proc/self/gid_map
          0       1000          1

The initialization of securityfs and setup of files and directories 
happens at the same time as the IMA namespace is created. At this time 
there are no user mappings available, so that's why I need to make the 
adjustments 'late'.
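
The 'late' adjustment can be pictured with the following user-space
sketch, which mirrors the control flow of ima_fs_ns_fixup_uid_gid()
from the patch.  The struct and its fields are hypothetical stand-ins:
nr_extents models ns->user_ns->uid_map.nr_extents and root_kuid models
the result of make_kuid(ns->user_ns, 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DENTRY_LAST 3
#define MODEL_INVALID_ID  ((uint32_t)-1)

/* Hypothetical reduction of struct ima_namespace for illustration. */
struct model_ima_ns {
	bool file_ownership_fixes_done;
	size_t nr_extents;	/* 0 until uid_map has been written */
	uint32_t root_kuid;	/* global uid that ns-local uid 0 maps to */
	bool present[MODEL_DENTRY_LAST];	/* models ns->dentry[i] != NULL */
	uint32_t inode_uid[MODEL_DENTRY_LAST];
};

/*
 * Returns true if ownership was fixed by this call.  The fix runs at
 * most once, and only after the user namespace has a mapping.
 */
bool model_fixup_uid(struct model_ima_ns *ns)
{
	if (ns->file_ownership_fixes_done || ns->nr_extents == 0)
		return false;

	ns->file_ownership_fixes_done = true;
	for (size_t i = 0; i < MODEL_DENTRY_LAST; i++) {
		if (!ns->present[i])
			continue;
		ns->inode_uid[i] = ns->root_kuid;
	}
	return true;
}
```

Calling this from the permission hook means the first access after the
mapping has been written repairs the ownership; accesses before that
leave the invalid ids in place.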

    Stefan




* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 20:25         ` Stefan Berger
@ 2021-12-01 21:11           ` James Bottomley
  2021-12-01 21:34             ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 21:11 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Wed, 2021-12-01 at 15:25 -0500, Stefan Berger wrote:
> On 12/1/21 14:21, James Bottomley wrote:
> > On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
> > > On 12/1/21 12:56, James Bottomley wrote:
> > [...]
> > > I tried this with runc and a user namespace active mapping uid
> > > 1000 on the host to uid 0 in the container. There I run into the
> > > problem that  all of the files and directories without the above
> > > work-around are mapped to 'nobody', just like all the files in
> > > sysfs in this case are also mapped to nobody. This code resolved
> > > the issue.
> > So I applied your patches with the permission shift commented out
> > and instrumented inode_alloc() to see where it might be failing and
> > I actually find it all works as expected for me:
> > 
> > ejb@testdeb:~> unshare -r --user --mount --ima
> > root@testdeb:~# mount -t securityfs_ns none /sys/kernel/security
> > root@testdeb:~# ls -l /sys/kernel/security/ima/
> > total 0
> > -r--r----- 1 root root 0 Dec  1 19:11 ascii_runtime_measurements
> > -r--r----- 1 root root 0 Dec  1 19:11 binary_runtime_measurements
> > -rw------- 1 root root 0 Dec  1 19:11 policy
> > -r--r----- 1 root root 0 Dec  1 19:11 runtime_measurements_count
> > -r--r----- 1 root root 0 Dec  1 19:11 violations
> > 
> > I think your problem is something to do with how runc is installing
> > the uid/gid mappings.  If it's installing them after the
> > security_ns inodes are created then they get the -1 value (because
> > no mappings exist in s_user_ns).  I can even demonstrate this by
> > forcing unshare to enter the IMA namespace before writing the
> > mapping values and I'll see "nobody nogroup" above like you do.
> 
> I am surprised you get this mapping even after commenting the
> permission adjustments... it doesn't work for me when I comment them
> out:
> 
> [stefanb@ima-ns-dev rootfs]$ unshare -r --user --mount
> [root@ima-ns-dev rootfs]# mount -t securityfs_ns none
> /sys/kernel/security/
> [root@ima-ns-dev rootfs]# cd /sys/kernel/security/ima/
> [root@ima-ns-dev ima]# ls -l
> total 0
> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 ascii_runtime_measurements
> -r--r-----. 1 nobody nobody 0 Dec  1 15:20
> binary_runtime_measurements
> -rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 runtime_measurements_count
> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
> [root@ima-ns-dev ima]# cat /proc/self/uid_map
>           0       1000          1
> [root@ima-ns-dev ima]# cat /proc/self/gid_map
>           0       1000          1
> 
> The initialization of securityfs and setup of files and directories 
> happens at the same time as the IMA namespace is created. At this
> time there are no user mappings available, so that's why I need to
> make the adjustments 'late'.

There is one other possible difference:  To get the correct s_user_ns
on the securityfs_ns mount, the mount namespace itself has to be owned
by the user namespace ... is runc doing that correctly?  I always
forget this detail because unshare does it correctly automatically but
it means you must unshare the user namespace first and then unshare the
mount namespace (or do it in the same sys call because the kernel will
get the correct order).
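
For reference, the two orderings mentioned here look roughly like this
from user space.  This is only a sketch: the function names are made up
for the example, and it does not write uid_map/gid_map, which
'unshare -r' does as a separate step.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Unshare the user namespace first, then the mount namespace, so that
 * the new mount namespace is owned by the new user namespace.
 * Returns 0 on success or a negative errno value.
 */
int unshare_user_then_mount(void)
{
	if (unshare(CLONE_NEWUSER) != 0)
		return -errno;
	if (unshare(CLONE_NEWNS) != 0)
		return -errno;
	return 0;
}

/*
 * Same result in a single system call; the kernel takes care of the
 * ordering between the two namespaces.
 * Returns 0 on success or a negative errno value.
 */
int unshare_user_and_mount_together(void)
{
	if (unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0)
		return -errno;
	return 0;
}
```

Either call can fail, for example with EPERM where unprivileged user
namespaces are disabled, so callers should check the return value.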

James




* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 21:11           ` James Bottomley
@ 2021-12-01 21:34             ` Stefan Berger
  2021-12-01 22:01               ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-01 21:34 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/1/21 16:11, James Bottomley wrote:
> On Wed, 2021-12-01 at 15:25 -0500, Stefan Berger wrote:
>> On 12/1/21 14:21, James Bottomley wrote:
>>> On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
>>>> On 12/1/21 12:56, James Bottomley wrote:
>>> [...]
>>>> I tried this with runc and a user namespace active mapping uid
>>>> 1000 on the host to uid 0 in the container. There I run into the
>>>> problem that  all of the files and directories without the above
>>>> work-around are mapped to 'nobody', just like all the files in
>>>> sysfs in this case are also mapped to nobody. This code resolved
>>>> the issue.
>>> So I applied your patches with the permission shift commented out
>>> and instrumented inode_alloc() to see where it might be failing and
>>> I actually find it all works as expected for me:
>>>
>>> ejb@testdeb:~> unshare -r --user --mount --ima
>>> root@testdeb:~# mount -t securityfs_ns none /sys/kernel/security
>>> root@testdeb:~# ls -l /sys/kernel/security/ima/
>>> total 0
>>> -r--r----- 1 root root 0 Dec  1 19:11 ascii_runtime_measurements
>>> -r--r----- 1 root root 0 Dec  1 19:11 binary_runtime_measurements
>>> -rw------- 1 root root 0 Dec  1 19:11 policy
>>> -r--r----- 1 root root 0 Dec  1 19:11 runtime_measurements_count
>>> -r--r----- 1 root root 0 Dec  1 19:11 violations
>>>
>>> I think your problem is something to do with how runc is installing
>>> the uid/gid mappings.  If it's installing them after the
>>> security_ns inodes are created then they get the -1 value (because
>>> no mappings exist in s_user_ns).  I can even demonstrate this by
>>> forcing unshare to enter the IMA namespace before writing the
>>> mapping values and I'll see "nobody nogroup" above like you do.
>> I am surprised you get this mapping even after commenting the
>> permission adjustments... it doesn't work for me when I comment them
>> out:
>>
>> [stefanb@ima-ns-dev rootfs]$ unshare -r --user --mount
>> [root@ima-ns-dev rootfs]# mount -t securityfs_ns none
>> /sys/kernel/security/
>> [root@ima-ns-dev rootfs]# cd /sys/kernel/security/ima/
>> [root@ima-ns-dev ima]# ls -l
>> total 0
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 ascii_runtime_measurements
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20
>> binary_runtime_measurements
>> -rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 runtime_measurements_count
>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
>> [root@ima-ns-dev ima]# cat /proc/self/uid_map
>>            0       1000          1
>> [root@ima-ns-dev ima]# cat /proc/self/gid_map
>>            0       1000          1
>>
>> The initialization of securityfs and setup of files and directories
>> happens at the same time as the IMA namespace is created. At this
>> time there are no user mappings available, so that's why I need to
>> make the adjustments 'late'.
> There is one other possible difference:  To get the correct s_user_ns

I am currently wondering why I cannot re-create your setup while 
disabling the remapping...




> on the securityfs_ns mount, the mount namespace itself has to be owned
> by the user namespace ... is runc doing that correctly?  I always

Following an strace of 'runc create' I see an unshare(CLONE_NEWUSER) by 
a process before it does an 
unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET), 
so this seems to be doing it in the order you suggest.

Also, runc seems to have its own set of struggles. I am not sure we 
would be able to ask them to accommodate us to do it 'correctly' - it 
doesn't sound so 'easy' for them either to get everything under the hood:

https://github.com/opencontainers/runc/blob/master/libcontainer/nsenter/nsexec.c#L919

      * In order for this unsharing code to be more extensible we need to split
      * up unshare(CLONE_NEWUSER) and clone() in various ways. The ideal case
      * would be if we did clone(CLONE_NEWUSER) and the other namespaces
      * separately, but because of SELinux issues we cannot really do that. But

[...]

      * However, if we unshare(2) the user namespace *before* we clone(2), then
      * all hell breaks loose.

sounds like fun

So I am not quite sure whether I am working around an issue in runc. To
find out, I would first like to re-create your successful setup and see
what's different.

    Stefan


> forget this detail because unshare does it correctly automatically but
> it means you must unshare the user namespace first and then unshare the
> mount namespace (or do it in the same sys call because the kernel will
> get the correct order).
>
> James
>
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 21:34             ` Stefan Berger
@ 2021-12-01 22:01               ` James Bottomley
  2021-12-01 22:09                 ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 22:01 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Wed, 2021-12-01 at 16:34 -0500, Stefan Berger wrote:
> On 12/1/21 16:11, James Bottomley wrote:
> > On Wed, 2021-12-01 at 15:25 -0500, Stefan Berger wrote:
> > > On 12/1/21 14:21, James Bottomley wrote:
> > > > On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
> > > > > On 12/1/21 12:56, James Bottomley wrote:
> > > > [...]
> > > > > I tried this with runc and a user namespace active mapping
> > > > > uid
> > > > > 1000 on the host to uid 0 in the container. There I run into
> > > > > the
> > > > > problem that  all of the files and directories without the
> > > > > above
> > > > > work-around are mapped to 'nobody', just like all the files
> > > > > in
> > > > > sysfs in this case are also mapped to nobody. This code
> > > > > resolved
> > > > > the issue.
> > > > So I applied your patches with the permission shift commented
> > > > out
> > > > and instrumented inode_alloc() to see where it might be failing
> > > > and
> > > > I actually find it all works as expected for me:
> > > > 
> > > > ejb@testdeb:~> unshare -r --user --mount --ima
> > > > root@testdeb:~# mount -t securityfs_ns none
> > > > /sys/kernel/security
> > > > root@testdeb:~# ls -l /sys/kernel/security/ima/
> > > > total 0
> > > > -r--r----- 1 root root 0 Dec  1 19:11
> > > > ascii_runtime_measurements
> > > > -r--r----- 1 root root 0 Dec  1 19:11
> > > > binary_runtime_measurements
> > > > -rw------- 1 root root 0 Dec  1 19:11 policy
> > > > -r--r----- 1 root root 0 Dec  1 19:11
> > > > runtime_measurements_count
> > > > -r--r----- 1 root root 0 Dec  1 19:11 violations
> > > > 
> > > > I think your problem is something to do with how runc is
> > > > installing
> > > > the uid/gid mappings.  If it's installing them after the
> > > > security_ns inodes are created then they get the -1 value
> > > > (because
> > > > no mappings exist in s_user_ns).  I can even demonstrate this
> > > > by
> > > > forcing unshare to enter the IMA namespace before writing the
> > > > mapping values and I'll see "nobody nogroup" above like you do.
> > > I am surprised you get this mapping even after commenting the
> > > permission adjustments... it doesn't work for me when I comment
> > > them
> > > out:
> > > 
> > > [stefanb@ima-ns-dev rootfs]$ unshare -r --user --mount
> > > [root@ima-ns-dev rootfs]# mount -t securityfs_ns none
> > > /sys/kernel/security/
> > > [root@ima-ns-dev rootfs]# cd /sys/kernel/security/ima/
> > > [root@ima-ns-dev ima]# ls -l
> > > total 0
> > > -r--r-----. 1 nobody nobody 0 Dec  1 15:20
> > > ascii_runtime_measurements
> > > -r--r-----. 1 nobody nobody 0 Dec  1 15:20
> > > binary_runtime_measurements
> > > -rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
> > > -r--r-----. 1 nobody nobody 0 Dec  1 15:20
> > > runtime_measurements_count
> > > -r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
> > > [root@ima-ns-dev ima]# cat /proc/self/uid_map
> > >            0       1000          1
> > > [root@ima-ns-dev ima]# cat /proc/self/gid_map
> > >            0       1000          1
> > > 
> > > The initialization of securityfs and setup of files and
> > > directories
> > > happens at the same time as the IMA namespace is created. At this
> > > time there are no user mappings available, so that's why I need
> > > to
> > > make the adjustments 'late'.
> > There is one other possible difference:  To get the correct
> > s_user_ns
> 
> I am currently wondering why I cannot re-create your setup while 
> disabling the remapping...

OK, I think I figured it out.  When I applied your patches, it was on
top of my existing ones, so I had to massage them a bit.

Your problem is the securityfs inode creation is triggered inside
create_user_ns, which means it happens *before* unshare writes to the
/proc/self/uid_map file, so the securityfs inodes are always created on
an empty mapping and i_uid_write always sets the inode uid to -1.

I don't see this because my setup for everything is triggered off the
first use of the IMA namespace.  You'd need to have some type of lazy
setup of the inodes as well to give unshare time to install the uid/gid
mappings.
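
The effect can be modelled in user space.  What follows is only a
minimal sketch, not kernel code: struct id_map, map_id_down() and
map_id_up() are hypothetical names chosen for illustration, loosely
following the three-column layout of /proc/<pid>/uid_map, and 65534 is
assumed to be the overflow uid that ls shows as "nobody".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID  ((uint32_t)-1)  /* what an unmapped id resolves to */
#define OVERFLOW_ID 65534u          /* assumed overflow uid, shown as "nobody" */

/* One uid_map line: <first id in ns> <first id in parent> <count> */
struct id_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

struct id_map {
	size_t nr;                  /* stays 0 until uid_map has been written */
	struct id_extent extent[4];
};

/* Translate an id used inside the namespace into the parent's id. */
uint32_t map_id_down(const struct id_map *map, uint32_t id)
{
	assert(map != NULL);
	for (size_t i = 0; i < map->nr; i++) {
		const struct id_extent *e = &map->extent[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return INVALID_ID;          /* empty mapping: nothing matches */
}

/* Translate a stored parent id back for display inside the namespace. */
uint32_t map_id_up(const struct id_map *map, uint32_t lower)
{
	assert(map != NULL);
	for (size_t i = 0; i < map->nr; i++) {
		const struct id_extent *e = &map->extent[i];

		if (lower >= e->lower_first && lower - e->lower_first < e->count)
			return e->first + (lower - e->lower_first);
	}
	return OVERFLOW_ID;         /* unmapped owner shows up as "nobody" */
}
```

In this model an owner resolved with map_id_down() while nr is still 0
is stored as INVALID_ID and keeps displaying as 65534 even after the
"0 1000 1" line has been installed, whereas an owner resolved after the
write is stored as 1000 and displays as 0.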

James



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 22:01               ` James Bottomley
@ 2021-12-01 22:09                 ` Stefan Berger
  2021-12-01 22:19                   ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-01 22:09 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/1/21 17:01, James Bottomley wrote:
> On Wed, 2021-12-01 at 16:34 -0500, Stefan Berger wrote:
>> On 12/1/21 16:11, James Bottomley wrote:
>>> On Wed, 2021-12-01 at 15:25 -0500, Stefan Berger wrote:
>>>> On 12/1/21 14:21, James Bottomley wrote:
>>>>> On Wed, 2021-12-01 at 13:11 -0500, Stefan Berger wrote:
>>>>>> On 12/1/21 12:56, James Bottomley wrote:
>>>>> [...]
>>>>>> I tried this with runc and a user namespace active mapping
>>>>>> uid
>>>>>> 1000 on the host to uid 0 in the container. There I run into
>>>>>> the
>>>>>> problem that  all of the files and directories without the
>>>>>> above
>>>>>> work-around are mapped to 'nobody', just like all the files
>>>>>> in
>>>>>> sysfs in this case are also mapped to nobody. This code
>>>>>> resolved
>>>>>> the issue.
>>>>> So I applied your patches with the permission shift commented
>>>>> out
>>>>> and instrumented inode_alloc() to see where it might be failing
>>>>> and
>>>>> I actually find it all works as expected for me:
>>>>>
>>>>> ejb@testdeb:~> unshare -r --user --mount --ima
>>>>> root@testdeb:~# mount -t securityfs_ns none
>>>>> /sys/kernel/security
>>>>> root@testdeb:~# ls -l /sys/kernel/security/ima/
>>>>> total 0
>>>>> -r--r----- 1 root root 0 Dec  1 19:11
>>>>> ascii_runtime_measurements
>>>>> -r--r----- 1 root root 0 Dec  1 19:11
>>>>> binary_runtime_measurements
>>>>> -rw------- 1 root root 0 Dec  1 19:11 policy
>>>>> -r--r----- 1 root root 0 Dec  1 19:11
>>>>> runtime_measurements_count
>>>>> -r--r----- 1 root root 0 Dec  1 19:11 violations
>>>>>
>>>>> I think your problem is something to do with how runc is
>>>>> installing
>>>>> the uid/gid mappings.  If it's installing them after the
>>>>> security_ns inodes are created then they get the -1 value
>>>>> (because
>>>>> no mappings exist in s_user_ns).  I can even demonstrate this
>>>>> by
>>>>> forcing unshare to enter the IMA namespace before writing the
>>>>> mapping values and I'll see "nobody nogroup" above like you do.
>>>> I am surprised you get this mapping even after commenting the
>>>> permission adjustments... it doesn't work for me when I comment
>>>> them
>>>> out:
>>>>
>>>> [stefanb@ima-ns-dev rootfs]$ unshare -r --user --mount
>>>> [root@ima-ns-dev rootfs]# mount -t securityfs_ns none
>>>> /sys/kernel/security/
>>>> [root@ima-ns-dev rootfs]# cd /sys/kernel/security/ima/
>>>> [root@ima-ns-dev ima]# ls -l
>>>> total 0
>>>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20
>>>> ascii_runtime_measurements
>>>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20
>>>> binary_runtime_measurements
>>>> -rw-------. 1 nobody nobody 0 Dec  1 15:20 policy
>>>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20
>>>> runtime_measurements_count
>>>> -r--r-----. 1 nobody nobody 0 Dec  1 15:20 violations
>>>> [root@ima-ns-dev ima]# cat /proc/self/uid_map
>>>>             0       1000          1
>>>> [root@ima-ns-dev ima]# cat /proc/self/gid_map
>>>>             0       1000          1
>>>>
>>>> The initialization of securityfs and setup of files and
>>>> directories
>>>> happens at the same time as the IMA namespace is created. At this
>>>> time there are no user mappings available, so that's why I need
>>>> to
>>>> make the adjustments 'late'.
>>> There is one other possible difference:  To get the correct
>>> s_user_ns
>> I am currently wondering why I cannot re-create your setup while
>> disabling the remapping...
> OK, I think I figured it out.  When I applied your patches, it was on
> top of my existing ones, so I had to massage them a bit.
>
> Your problem is the securityfs inode creation is triggered inside
> create_user_ns, which means it happens *before* ushare writes to the
> proc/self/uid_map file, so the securityfs_inodes are always created on
> an empty mapping and i_write_uid always sets the inode uid to -1.

Right, the initialization of the filesystem is quite early.


>
> I don't see this because my setup for everything is triggered off the
> first use of the IMA namespace.  You'd need to have some type of lazy
> setup of the inodes as well to give unshare time to install the uid/gid
> mappings.

What could trigger that? A callback while mounting - but I am not sure
where to hook into then. What is your mechanism for triggering on the
'first use of the IMA namespace'? What is 'use' here?

    Stefan



>
> James
>
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 22:09                 ` Stefan Berger
@ 2021-12-01 22:19                   ` James Bottomley
  2021-12-02  0:02                     ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-01 22:19 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Wed, 2021-12-01 at 17:09 -0500, Stefan Berger wrote:
> On 12/1/21 17:01, James Bottomley wrote:
[...]
> > OK, I think I figured it out.  When I applied your patches, it was
> > on top of my existing ones, so I had to massage them a bit.
> > 
> > Your problem is the securityfs inode creation is triggered inside
> > create_user_ns, which means it happens *before* ushare writes to
> > the proc/self/uid_map file, so the securityfs_inodes are always
> > created on an empty mapping and i_write_uid always sets the inode
> > uid to -1.
> 
> Right, the initialization of the filesystem is quite early.
> 
> 
> > I don't see this because my setup for everything is triggered off
> > the first use of the IMA namespace.  You'd need to have some type
> > of lazy setup of the inodes as well to give unshare time to install
> > the uid/gid mappings.
> 
> What could trigger that? A callback while mounting - but I am not
> sure where to hook into then. What is your mechanisms to trigger as
> the 'first use of the IMA namespace'? What is 'use' here?

'Use' for me is the first event that gets logged in the new namespace.

However, I don't think this is a good trigger; it's just a random thing
I was playing with.  Perhaps triggering on mount is a good one ... that
could be done from securityfs_ns_init_fs_context?

James





^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-01 22:19                   ` James Bottomley
@ 2021-12-02  0:02                     ` Stefan Berger
  0 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-12-02  0:02 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/1/21 17:19, James Bottomley wrote:
> On Wed, 2021-12-01 at 17:09 -0500, Stefan Berger wrote:
>> On 12/1/21 17:01, James Bottomley wrote:
>>
>>
>>
>>> I don't see this because my setup for everything is triggered off
>>> the first use of the IMA namespace.  You'd need to have some type
>>> of lazy setup of the inodes as well to give unshare time to install
>>> the uid/gid mappings.
>> What could trigger that? A callback while mounting - but I am not
>> sure where to hook into then. What is your mechanisms to trigger as
>> the 'first use of the IMA namespace'? What is 'use' here?
> use for me is first event that gets logged in the new namespace.
>
> However, I don't think this is a good trigger, it's just a random thing
> I was playing with.  Perhaps trigger on mount is a good one ... that
> could be done from securityfs_ns_init_fs_context?

Yes, the following now does the trick for late init, with runc as well.
The late uid adjustments are gone.

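static int securityfs_ns_init_fs_context(struct fs_context *fc)
{
	int rc;

	/*
	 * Set up the namespace's securityfs files lazily at mount time,
	 * by which point the uid/gid mappings are expected to be installed.
	 */
	if (fc->user_ns->ima_ns->late_fs_init) {
		rc = fc->user_ns->ima_ns->late_fs_init(fc->user_ns);
		if (rc)
			return rc;
	}
	fc->ops = &securityfs_ns_context_ops;
	return 0;
}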
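
For readers who want to play with the ordering outside the kernel, here
is a rough user-space model of the same control flow.  It is a sketch
under assumptions, not the patch: struct lazy_user_ns, struct
lazy_ima_ns, model_late_fs_init() and model_init_fs_context() are
made-up names, and both the run-once flag and the refusal to initialise
before the mappings exist are illustrative choices that the function
above does not show.

```c
#include <assert.h>
#include <stddef.h>

struct lazy_user_ns;

/* Hypothetical stand-in for the per-namespace IMA state. */
struct lazy_ima_ns {
	int (*late_fs_init)(struct lazy_user_ns *ns); /* may be NULL */
	int fs_ready;     /* set once the files have been created */
	int init_calls;   /* counts real initialisations, for demonstration */
};

/* Hypothetical stand-in for the user namespace. */
struct lazy_user_ns {
	int mappings_written;        /* has uid_map/gid_map been written? */
	struct lazy_ima_ns ima_ns;
};

/* Create the files on the first successful call; later calls are no-ops. */
int model_late_fs_init(struct lazy_user_ns *ns)
{
	assert(ns != NULL);
	if (ns->ima_ns.fs_ready)
		return 0;
	if (!ns->mappings_written)
		return -1;           /* assumption: refuse rather than own files as "nobody" */
	ns->ima_ns.init_calls++;
	ns->ima_ns.fs_ready = 1;
	return 0;
}

/* Same shape as the mount-time hook: call late init if one is set. */
int model_init_fs_context(struct lazy_user_ns *ns)
{
	int rc;

	assert(ns != NULL);
	if (ns->ima_ns.late_fs_init) {
		rc = ns->ima_ns.late_fs_init(ns);
		if (rc)
			return rc;
	}
	return 0;
}
```

Calling model_init_fs_context() twice after setting mappings_written
leaves init_calls at 1, which is the property a mount-triggered setup
would need if the filesystem can be mounted more than once.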


    Stefan


>
> James
>
>
>
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-01 19:29       ` James Bottomley
@ 2021-12-02  7:16         ` Denis Semakin
  2021-12-02 12:33           ` James Bottomley
  2021-12-02 17:54           ` Stefan Berger
  2021-12-02 12:59         ` Christian Brauner
  1 sibling, 2 replies; 54+ messages in thread
From: Denis Semakin @ 2021-12-02  7:16 UTC (permalink / raw)
  To: jejb, Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, Krzysztof Struczynski, Roberto Sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

Obviously, the main goal of adding a new capability was to avoid using CAP_SYS_ADMIN (IOW superuser)
to manage IMA stuff; it was also about security granularity.  It's good if CAP_MAC_ADMIN is enough for doing IMA-related things (writing policies and extended attributes).
But it's a little unclear to me how to deal with unprivileged users: assuming there's no CAP_INTEGRITY_ADMIN but CAP_MAC_ADMIN is set, the user can control any LSM (SELinux, SMACK, etc.) and IMA (policies, xattrs). What if some systems have requirements that allow touching the LSMs but not changing any IMA (integrity) things? Or the reverse: a user may set up any IMA policy (it's about the system integrity) and modify IMA-related xattrs, but is forbidden to change SELinux policies and, e.g., SMACK labels... Maybe it's an unrealistic scenario, of course... but I guess it's not 100% impossible.

Best regards,
Denis


-----Original Message-----
From: James Bottomley [mailto:jejb@linux.ibm.com] 
Sent: Wednesday, December 1, 2021 10:29 PM
To: Stefan Berger <stefanb@linux.ibm.com>; linux-integrity@vger.kernel.org
Cc: zohar@linux.ibm.com; serge@hallyn.com; christian.brauner@ubuntu.com; containers@lists.linux.dev; dmitry.kasatkin@gmail.com; ebiederm@xmission.com; Krzysztof Struczynski <krzysztof.struczynski@huawei.com>; Roberto Sassu <roberto.sassu@huawei.com>; mpeters@redhat.com; lhinds@redhat.com; lsturman@redhat.com; puiterwi@redhat.com; jamjoom@us.ibm.com; linux-kernel@vger.kernel.org; paul@paul-moore.com; rgb@redhat.com; linux-security-module@vger.kernel.org; jmorris@namei.org; Denis Semakin <denis.semakin@huawei.com>
Subject: Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability

On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
> On 12/1/21 11:58, James Bottomley wrote:
> > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > From: Denis Semakin <denis.semakin@huawei.com>
> > > 
> > > Use integrity_admin_ns_capable() to check corresponding capability 
> > > to allow read/write IMA policy without CAP_SYS_ADMIN but with 
> > > CAP_INTEGRITY_ADMIN.
> > > 
> > > Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> > > ---
> > >   security/integrity/ima/ima_fs.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/security/integrity/ima/ima_fs.c 
> > > b/security/integrity/ima/ima_fs.c index fd2798f2d224..6766bb8262f2 
> > > 100644
> > > --- a/security/integrity/ima/ima_fs.c
> > > +++ b/security/integrity/ima/ima_fs.c
> > > @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode 
> > > *inode, struct file *filp)
> > >   #else
> > >   		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
> > >   			return -EACCES;
> > > -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> > > +		if (!integrity_admin_ns_capable(ns->user_ns))
> > so this one is basically replacing what you did in RFC 16/20, which 
> > seems a little redundant.
> > 
> > The question I'd like to ask is: is there still a reason for needing 
> > CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty much 
> > tied to requiring a user (and a mount, because of
> > securityfs_ns) namespace, there might not be a pressing need for an 
> > admin capability separated from CAP_SYS_ADMIN because the owner of 
> > the user namespace passes the ns_capable(..., CAP_SYS_ADMIN) check.  
> > The rationale in
> 
> Casey suggested using CAP_MAC_ADMIN, which I think would also work.
> 
>      CAP_MAC_ADMIN (since Linux 2.6.25)
>                Allow MAC configuration or state changes. Implemented 
> for
>                the Smack Linux Security Module (LSM).
> 
> 
> Down the road I think we should cover setting file extended attributes 
> with the same capability as well for when a user signs files or 
> installs packages with file signatures.  A container runtime could 
> hold CAP_SYS_ADMIN while setting up a container and mounting 
> filesystems and drop it for the first process started there. Since we 
> are using the user namespace to spawn an IMA namespace, we would then 
> require CAP_SYS_ADMIN to be left available so that the user can do 
> IMA related stuff in the container (set or append to the policy, write 
> file signatures). I am not sure whether that should be the case or 
> rather give the user something finer grained, such as CAP_MAC_ADMIN. 
> So, it's about granularity...

It's possible ... any orchestration system that doesn't enter a user
namespace has to strictly regulate capabilities.   I'm probably biased
because I always use a user_ns so I never really had to mess with capabilities.

> > https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
> > 
> > Is effectively "because CAP_SYS_ADMIN is too powerful" but that's no 
> > longer true of the user namespace owner.  It only passes the
> > ns_capable() check not the capable() one, so while it does get 
> > CAP_SYS_ADMIN, it can only use it in a few situations which 
> > represent quite a power reduction already.
> 
> At least docker containers drop CAP_SYS_ADMIN.

Well docker doesn't use the user_ns.  But even given that,
CAP_SYS_ADMIN is always dropped for most container systems.  What
happens when you enter a user namespace is the ns_capable(...,
CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
in the same way it would for root.  So effectively entering a user
namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
responds only in the places in the kernel that have a ns_capable()
check instead of a capable() one (most of the places you list below).
This is the principle of how unprivileged containers actually work ...
and the source of some of our security problems if you get back an
ability to do something you shouldn't be allowed to do as an
unprivileged user.

>  I am not sure what the decision was based on but probably they don't 
> want to give the user what is not absolutely necessary, but usage of 
> user namespaces (with IMA namespaces) would kind of force it to be 
> available then to do IMA-related stuff ...
> 
> Following this man page here
> https://man7.org/linux/man-pages/man7/user_namespaces.7.html
> 
> CAP_SYS_ADMIN in a user namespace is about
> 
> - bind-mounting filesystems
> 
> - mounting /proc filesystems
> 
> - creating nested user namespaces
> 
> - configuring UTS namespace
> 
> - configuring whether setgroups() can be used
> 
> - usage of setns()
> 
> 
> Do we want to add '- only way of *setting up* IMA related stuff' to 
> this list?

I don't see why not, but other container people should weigh in
because, as I said, I mostly use the user namespace and unprivileged
containers and don't bother with capabilities.

James



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-02  7:16         ` Denis Semakin
@ 2021-12-02 12:33           ` James Bottomley
  2021-12-02 17:54           ` Stefan Berger
  1 sibling, 0 replies; 54+ messages in thread
From: James Bottomley @ 2021-12-02 12:33 UTC (permalink / raw)
  To: Denis Semakin, Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, Krzysztof Struczynski, Roberto Sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Thu, 2021-12-02 at 07:16 +0000, Denis Semakin wrote:
> Obviously the main goal by adding new capability was to avoid the
> using CAP_SYS_ADMIN (IOW superuser)

OK, but as I've said a couple of times now: the check for CAP_SYS_ADMIN
doesn't have to be monolithic like this.  We have two sets of checks in
the kernel: capable(CAP_SYS_ADMIN) which is for the global monolithic
root-like capability and ns_capable(..., CAP_SYS_ADMIN) which is for
the owner (possibly unprivileged) of the user namespace.

This gives us a way of parsing out admin capabilities into the small
subset that the user namespace needs.  Patch 16 changed the check from
capable to ns_capable, meaning it's no longer the monolithic
CAP_SYS_ADMIN.
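
To make the difference between the two checks concrete, here is a small
user-space model.  It is only a sketch: the struct and function names
are invented for illustration, uids are all expressed as host uids for
simplicity, and the walk up the namespace tree is my reading of how the
commoncap check behaves, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CAP_SYS_ADMIN 21  /* numeric value of CAP_SYS_ADMIN */

struct user_ns_model {
	struct user_ns_model *parent; /* NULL for the initial namespace */
	uint32_t owner;               /* uid of the creator (host uid here) */
	int level;                    /* 0 for the initial namespace */
};

struct cred_model {
	struct user_ns_model *user_ns; /* namespace the task lives in */
	uint32_t euid;                 /* host uid, a simplification */
	uint64_t cap_effective;        /* bitmask, meaningful in user_ns */
};

/* Does the task hold 'cap' with respect to the 'target' namespace? */
bool model_ns_capable(const struct cred_model *cred,
		      const struct user_ns_model *target, int cap)
{
	const struct user_ns_model *ns = target;

	assert(cred != NULL && target != NULL);
	for (;;) {
		/* Same namespace: the effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1;
		/* Target is not below the task's namespace: no privilege. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The creator of a child namespace holds every cap in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;
		ns = ns->parent;
	}
}

/* The "monolithic" check is the same test against the initial namespace. */
bool model_capable(const struct cred_model *cred,
		   const struct user_ns_model *init_ns, int cap)
{
	return model_ns_capable(cred, init_ns, cap);
}
```

With an initial namespace and a child owned by uid 1000, a task running
as uid 1000 with an empty effective set fails model_capable() but
passes model_ns_capable() for the child, and a task living inside the
child with a full effective set still fails model_capable().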

> to manage IMA stuff, that was also about security granularity.  It's
> good if CAP_MAC_ADMIN will be enough for doing IMA related things
> (write policies and extended attributes).

To be honest, as long as the check resolves to ns_capable(...,
CAP_SYS_<SOMETHING>) I'm not that bothered because the owner of the
user namespace will still pass the check.

> But for me it's a little bit unclear how to deal with unprivileged
> users: assuming there's no CAP_INTEGRITY_ADMIN but CAP_MAC_ADMIN was
> set up, so in this case user can control any LSM (seLinux, SMACK,
> etc) and IMA (policies, xattrs). What if .. for some systems there
> would be some requirements that will allow to touch LSM but do not
> change any IMA (integrity) things? A user can set up any IMA policy
> (it's about the system integrity), modify IMA related xattrs but it's
> forbidden to change seLinux policies and e.g. SMACK labels... May be
> it's unreal scenario of course... but I guess it's not 100%
> impossible.

This is why looking at it as a switch from capable to ns_capable is
useful: an ordinary user can assume ns_capable(..., CAP_SYS_ADMIN)
powers arbitrarily, so it's a significant check on where you can make
the switch.

James



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-11-30 16:06 ` [RFC 08/20] ima: Move measurement list related variables " Stefan Berger
@ 2021-12-02 12:46   ` James Bottomley
  2021-12-02 13:41     ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-02 12:46 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> Move measurement list related variables into the ima_namespace. This
> way a
> front-end like SecurityFS can show the measurement list inside an IMA
> namespace.
> 
> Implement ima_free_measurements() to free a list of measurements
> and call it when an IMA namespace is deleted.

This one worries me quite a lot.  What seems to be happening in this
code:

> @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct
> ima_namespace *ns,
>         qe->entry = entry;
>  
>         INIT_LIST_HEAD(&qe->later);
> -       list_add_tail_rcu(&qe->later, &ima_measurements);
> +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
>  
>         atomic_long_inc(&ns->ima_htable.len);
>         if (update_htable) {
> 

is that we now only add the measurements to the namespace list, but
that list is freed when the namespace dies.  However, the measurement
is still extended through the PCRs meaning we have incomplete
information for a replay after the namespace dies?

I tend to think the way this should work is that until we have a way of
attesting inside the namespace, all measurements should go into the
physical log, so that replay is always complete for the PCRs, so
effectively the visible log of the namespace would always have to be a
subset of the physical log.
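
The replay concern can be illustrated with a toy model.  This is a
sketch only: toy_extend() is a deliberately simple, non-cryptographic
stand-in for the TPM's PCR_new = H(PCR_old || digest) operation, and
the digests are plain integers invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a PCR extend: order-sensitive fold of one digest. */
uint64_t toy_extend(uint64_t pcr, uint64_t digest)
{
	return pcr * 1000003u + digest;
}

/* Replay a measurement log from a zeroed PCR, as a verifier would. */
uint64_t replay(const uint64_t *log, size_t n)
{
	uint64_t pcr = 0;

	assert(log != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		pcr = toy_extend(pcr, log[i]);
	return pcr;
}
```

If the PCR was extended with the digests 11, 22 and 33, and 22 lived
only in a namespace list that has since been freed, replaying the
surviving entries 11 and 33 no longer reproduces the PCR value, so the
remaining log cannot be verified against the hardware.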

James



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-01 19:29       ` James Bottomley
  2021-12-02  7:16         ` Denis Semakin
@ 2021-12-02 12:59         ` Christian Brauner
  2021-12-02 13:01           ` Christian Brauner
  1 sibling, 1 reply; 54+ messages in thread
From: Christian Brauner @ 2021-12-02 12:59 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Berger, linux-integrity, zohar, serge, containers,
	dmitry.kasatkin, ebiederm, krzysztof.struczynski, roberto.sassu,
	mpeters, lhinds, lsturman, puiterwi, jamjoom, linux-kernel, paul,
	rgb, linux-security-module, jmorris, Denis Semakin

On Wed, Dec 01, 2021 at 02:29:09PM -0500, James Bottomley wrote:
> On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
> > On 12/1/21 11:58, James Bottomley wrote:
> > > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > > From: Denis Semakin <denis.semakin@huawei.com>
> > > > 
> > > > Use integrity_admin_ns_capable() to check corresponding
> > > > capability to allow read/write IMA policy without CAP_SYS_ADMIN
> > > > but with CAP_INTEGRITY_ADMIN.
> > > > 
> > > > Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> > > > ---
> > > >   security/integrity/ima/ima_fs.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/security/integrity/ima/ima_fs.c
> > > > b/security/integrity/ima/ima_fs.c
> > > > index fd2798f2d224..6766bb8262f2 100644
> > > > --- a/security/integrity/ima/ima_fs.c
> > > > +++ b/security/integrity/ima/ima_fs.c
> > > > @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode
> > > > *inode,
> > > > struct file *filp)
> > > >   #else
> > > >   		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
> > > >   			return -EACCES;
> > > > -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> > > > +		if (!integrity_admin_ns_capable(ns->user_ns))
> > > so this one is basically replacing what you did in RFC 16/20, which
> > > seems a little redundant.
> > > 
> > > The question I'd like to ask is: is there still a reason for
> > > needing CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty
> > > much tied to requiring a user (and a mount, because of
> > > securityfs_ns) namespace, there might not be a pressing need for an
> > > admin capability separated from CAP_SYS_ADMIN because the owner of
> > > the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
> > > check.  The rationale in
> > 
> > Casey suggested using CAP_MAC_ADMIN, which I think would also work.
> > 
> >      CAP_MAC_ADMIN (since Linux 2.6.25)
> >                Allow MAC configuration or state changes. Implemented
> > for
> >                the Smack Linux Security Module (LSM).
> > 
> > 
> > Down the road I think we should cover setting file extended
> > attributes with the same capability as well for when a user signs
> > files or installs packages with file signatures.  A container runtime
> > could hold CAP_SYS_ADMIN while setting up a container and mounting
> > filesystems and drop it for the first process started there. Since we
> > are using the user namespace to spawn an IMA namespace, we would then
> > > require CAP_SYS_ADMIN to be left available so that the user can do
> > IMA related stuff in the container (set or append to the policy,
> > write file signatures). I am not sure whether that should be the case
> > or rather give the user something finer grained, such as
> > CAP_MAC_ADMIN. So, it's about granularity...
> 
> It's possible ... any orchestration system that doesn't enter a user
> namespace has to strictly regulate capabilities.   I'm probably biased
> because I always use a user_ns so I never really had to mess with
> capabilities.
> 
> > > https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
> > > 
> > > Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
> > > no longer true of the user namespace owner.  It only passes the
> > > ns_capable() check not the capable() one, so while it does get
> > > CAP_SYS_ADMIN, it can only use it in a few situations which
> > > represent quite a power reduction already.
> > 
> > At least docker containers drop CAP_SYS_ADMIN.
> 
> Well docker doesn't use the user_ns.  But even given that,
> CAP_SYS_ADMIN is always dropped for most container systems.  What
> happens when you enter a user namespace is the ns_capable( ...,
> CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
> in the same way it would for root.  So effectively entering a user
> namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
> unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
> responds only in the places in the kernel that have a ns_capable()
> check instead of a capable() one (most of the places you list below). 
> This is the principle of how unprivileged containers actually work ...
> and the source of some of our security problems if you get back an
> ability to do something you shouldn't be allowed to do as an
> unprivileged user.
> 
> >  I am not sure what the decision was based on but probably they don't
> > want to give the user what is not absolutely necessary, but usage of
> > user namespaces (with IMA namespaces) would kind of force it to be
> > available then to do IMA-related stuff ...
> > 
> > Following this man page here 
> > https://man7.org/linux/man-pages/man7/user_namespaces.7.html
> > 
> > CAP_SYS_ADMIN in a user namespace is about
> > 
> > - bind-mounting filesystems
> > 
> > - mounting /proc filesystems
> > 
> > - creating nested user namespaces
> > 
> > - configuring UTS namespace
> > 
> > - configuring whether setgroups() can be used
> > 
> > - usage of setns()
> > 
> > 
> > Do we want to add '- only way of *setting up* IMA related stuff' to
> > this list?
> 
> I don't see why not, but other container people should weigh in
> because, as I said, I mostly use the user namespace and unprivileged
> containers and don't bother with capabilities.

There are very few scenarios where dropping capabilities in an
unprivileged container makes sense. In a lot of other scenarios it is
just a misunderstanding of the meaning of capabilities and their
relationship to user namespaces. Usually, granting a full set of
capabilities to the payload of an unprivileged container is the right
thing to do. All things that are properly namespaced will check
capabilities in the relevant user namespace. Those that aren't will
check them against the initial user namespace.
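
To make that distinction concrete, here is a minimal userspace sketch of
a namespaced check versus a check against the initial user namespace.
All model_* names and structs are made up for illustration; they only
mimic the upward walk the kernel's capability check performs and are not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structs. */
struct model_userns {
	const struct model_userns *parent; /* NULL for the initial namespace */
	int level;                         /* 0 for the initial namespace */
	unsigned int owner_uid;            /* uid in the parent that created it */
};

struct model_cred {
	const struct model_userns *user_ns; /* namespace the task lives in */
	unsigned int euid;
	uint64_t effective;                 /* bitmask of held capabilities */
};

#define MODEL_CAP_SYS_ADMIN 21

static const struct model_userns model_init_userns = { NULL, 0, 0 };

/* Model of a namespaced check: does @cred hold @cap over @target? */
bool model_ns_capable(const struct model_cred *cred,
		      const struct model_userns *target, int cap)
{
	const struct model_userns *ns = target;

	assert(cred && target);
	for (;;) {
		/* Same namespace: the task's own capability set decides. */
		if (ns == cred->user_ns)
			return (cred->effective >> cap) & 1;
		/* Target is not a descendant of the task's namespace. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The creator of a namespace holds all capabilities in it. */
		if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
			return true;
		ns = ns->parent;
	}
}

/* Model of a non-namespaced check: always against the initial namespace. */
bool model_capable(const struct model_cred *cred, int cap)
{
	return model_ns_capable(cred, &model_init_userns, cap);
}
```

In this model a task with a full capability set inside a child namespace
passes model_ns_capable() for that namespace but still fails
model_capable(), which is why granting the full set to the payload does
not hand out host-level privilege.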

But I do think the question of whether or not ima should go into
cap_sys_admin is more a question of capability semantics than of how
exactly ima is namespaced. We have agreed before that overloading
cap_sys_admin further isn't ideal. Often we end up rectifying that
mistake later, for example when we moved stuff like criu, bpf, and perf
to their own capabilities. Now we're left with stuff like:

static inline bool perfmon_capable(void)
{
	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
}

static inline bool bpf_capable(void)
{
	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
}

static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
{
	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
		ns_capable(ns, CAP_SYS_ADMIN);
}

for the sake of adhering to legacy behavior. I think we can skip over
that mistake and introduce cap_sys_integrity.
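
As a rough userspace sketch of what skipping the legacy fallback would
look like: the helper for a brand-new capability checks only that
capability. The model_* names are hypothetical, and the number used for
the new capability is an assumption picked for illustration (one past
CAP_CHECKPOINT_RESTORE), not something this thread has settled on.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAP_SYS_ADMIN        21
#define MODEL_CAP_PERFMON          38
#define MODEL_CAP_INTEGRITY_ADMIN  41 /* hypothetical number, illustration only */

/* Test one bit in a model effective-capability mask. */
static inline bool model_has_cap(uint64_t effective, int cap)
{
	return (effective >> cap) & 1;
}

/* Legacy pattern: the split-out capability OR the old catch-all. */
bool model_perfmon_capable(uint64_t effective)
{
	return model_has_cap(effective, MODEL_CAP_PERFMON) ||
	       model_has_cap(effective, MODEL_CAP_SYS_ADMIN);
}

/* New capability from day one: no CAP_SYS_ADMIN fallback is needed. */
bool model_integrity_admin_capable(uint64_t effective)
{
	return model_has_cap(effective, MODEL_CAP_INTEGRITY_ADMIN);
}
```

With this shape a task holding only CAP_SYS_ADMIN does not pass the
integrity check, so there is no legacy behavior to carry around later.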

Christian

* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-02 12:59         ` Christian Brauner
@ 2021-12-02 13:01           ` Christian Brauner
  2021-12-02 15:58             ` Casey Schaufler
  0 siblings, 1 reply; 54+ messages in thread
From: Christian Brauner @ 2021-12-02 13:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Berger, linux-integrity, zohar, serge, containers,
	dmitry.kasatkin, ebiederm, krzysztof.struczynski, roberto.sassu,
	mpeters, lhinds, lsturman, puiterwi, jamjoom, linux-kernel, paul,
	rgb, linux-security-module, jmorris, Denis Semakin

On Thu, Dec 02, 2021 at 01:59:55PM +0100, Christian Brauner wrote:
> On Wed, Dec 01, 2021 at 02:29:09PM -0500, James Bottomley wrote:
> > On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
> > > On 12/1/21 11:58, James Bottomley wrote:
> > > > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > > > From: Denis Semakin <denis.semakin@huawei.com>
> > > > > 
> > > > > Use integrity_admin_ns_capable() to check corresponding
> > > > > capability to allow read/write IMA policy without CAP_SYS_ADMIN
> > > > > but with CAP_INTEGRITY_ADMIN.
> > > > > 
> > > > > Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
> > > > > ---
> > > > >   security/integrity/ima/ima_fs.c | 2 +-
> > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/security/integrity/ima/ima_fs.c
> > > > > b/security/integrity/ima/ima_fs.c
> > > > > index fd2798f2d224..6766bb8262f2 100644
> > > > > --- a/security/integrity/ima/ima_fs.c
> > > > > +++ b/security/integrity/ima/ima_fs.c
> > > > > @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode
> > > > > *inode,
> > > > > struct file *filp)
> > > > >   #else
> > > > >   		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
> > > > >   			return -EACCES;
> > > > > -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> > > > > +		if (!integrity_admin_ns_capable(ns->user_ns))
> > > > so this one is basically replacing what you did in RFC 16/20, which
> > > > seems a little redundant.
> > > > 
> > > > The question I'd like to ask is: is there still a reason for
> > > > needing CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty
> > > > much tied to requiring a user (and a mount, because of
> > > > securityfs_ns) namespace, there might not be a pressing need for an
> > > > admin capability separated from CAP_SYS_ADMIN because the owner of
> > > > the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
> > > > check.  The rationale in
> > > 
> > > Casey suggested using CAP_MAC_ADMIN, which I think would also work.
> > > 
> > >      CAP_MAC_ADMIN (since Linux 2.6.25)
> > >                Allow MAC configuration or state changes. Implemented
> > > for
> > >                the Smack Linux Security Module (LSM).
> > > 
> > > 
> > > Down the road I think we should cover setting file extended
> > > attributes with the same capability as well for when a user signs
> > > files or installs packages with file signatures.  A container runtime
> > > could hold CAP_SYS_ADMIN while setting up a container and mounting
> > > filesystems and drop it for the first process started there. Since we
> > > are using the user namespace to spawn an IMA namespace, we would then
> > > > require CAP_SYS_ADMIN to be left available so that the user can do
> > > IMA related stuff in the container (set or append to the policy,
> > > write file signatures). I am not sure whether that should be the case
> > > or rather give the user something finer grained, such as
> > > CAP_MAC_ADMIN. So, it's about granularity...
> > 
> > It's possible ... any orchestration system that doesn't enter a user
> > namespace has to strictly regulate capabilities.   I'm probably biased
> > because I always use a user_ns so I never really had to mess with
> > capabilities.
> > 
> > > > https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
> > > > 
> > > > Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
> > > > no longer true of the user namespace owner.  It only passes the
> > > > ns_capable() check not the capable() one, so while it does get
> > > > CAP_SYS_ADMIN, it can only use it in a few situations which
> > > > represent quite a power reduction already.
> > > 
> > > At least docker containers drop CAP_SYS_ADMIN.
> > 
> > Well docker doesn't use the user_ns.  But even given that,
> > CAP_SYS_ADMIN is always dropped for most container systems.  What
> > happens when you enter a user namespace is the ns_capable( ...,
> > CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
> > in the same way it would for root.  So effectively entering a user
> > namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
> > unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
> > responds only in the places in the kernel that have a ns_capable()
> > check instead of a capable() one (most of the places you list below). 
> > This is the principle of how unprivileged containers actually work ...
> > and the source of some of our security problems if you get back an
> > ability to do something you shouldn't be allowed to do as an
> > unprivileged user.
> > 
> > >  I am not sure what the decision was based on but probably they don't
> > > want to give the user what is not absolutely necessary, but usage of
> > > user namespaces (with IMA namespaces) would kind of force it to be
> > > available then to do IMA-related stuff ...
> > > 
> > > Following this man page here 
> > > https://man7.org/linux/man-pages/man7/user_namespaces.7.html
> > > 
> > > CAP_SYS_ADMIN in a user namespace is about
> > > 
> > > - bind-mounting filesystems
> > > 
> > > - mounting /proc filesystems
> > > 
> > > - creating nested user namespaces
> > > 
> > > - configuring UTS namespace
> > > 
> > > - configuring whether setgroups() can be used
> > > 
> > > - usage of setns()
> > > 
> > > 
> > > Do we want to add '- only way of *setting up* IMA related stuff' to
> > > this list?
> > 
> > I don't see why not, but other container people should weigh in
> > because, as I said, I mostly use the user namespace and unprivileged
> > containers and don't bother with capabilities.
> 
> There are very few scenarios where dropping capabilities in an
> unprivileged container makes sense. In a lot of other scenarios it is
> just a misunderstanding of the meaning of capabilities and their
> relationship to user namespaces. Usually, granting a full set of
> > capabilities to the payload of an unprivileged container is the right
> > thing to do. All things that are properly namespaced will check
> > capabilities in the relevant user namespace. Those that aren't will
> > check them against the initial user namespace.
> 
> But I do think the question of whether or not ima should go into
> > cap_sys_admin is more a question of capability semantics than of how
> > exactly ima is namespaced. We have agreed before that overloading
> cap_sys_admin further isn't ideal. Often we end up rectifying that
> mistake later. For example, how we moved stuff like criu, bpf, and perf
> to their own capability. Now we're left with stuff like:
> 
> static inline bool perfmon_capable(void)
> {
> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
> }
> 
> static inline bool bpf_capable(void)
> {
> 	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
> }
> 
> static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
> {
> 	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
> 		ns_capable(ns, CAP_SYS_ADMIN);
> }
> 
> for the sake of adhering to legacy behavior. I think we can skip over
> that mistake and introduce cap_sys_integrity.

(Or under CAP_MAC_ADMIN as suggested elsewhere in the thread as I saw
just now.)

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-11-30 16:06 ` [RFC 20/20] ima: Setup securityfs_ns " Stefan Berger
  2021-12-01 17:56   ` James Bottomley
@ 2021-12-02 13:18   ` Christian Brauner
  2021-12-02 13:52     ` Stefan Berger
  1 sibling, 1 reply; 54+ messages in thread
From: Christian Brauner @ 2021-12-02 13:18 UTC (permalink / raw)
  To: Stefan Berger
  Cc: linux-integrity, zohar, serge, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Tue, Nov 30, 2021 at 11:06:54AM -0500, Stefan Berger wrote:
> Setup securityfs_ns with symlinks, directories, and files for IMA
> namespacing support. The same directory structure that IMA uses on the
> host is also created for the namespacing case.
> 
> Increment the user namespace's refcount_teardown value by '1' once
> securityfs_ns has been successfully setup since the initialization of the
> filesystem causes an additional reference to the user namespace to be
> taken. The early teardown function will delete the file system and release
> the additional reference.
> 
> The securityfs_ns file and directory ownerships cannot be set when the
> filesystem is setup since at this point the user namespace has not been
> configured yet by the user and therefore the ownership mappings are not
> available, yet. Therefore, adjust the file and directory ownerships when
> an inode's function for determining the permissions of a file or directory
> is accessed.
> 
> This filesystem can now be mounted as follows:
> 
> mount -t securityfs_ns /sys/kernel/security/ /sys/kernel/security/
> 
> The following directories, symlinks, and files are then available.
> 
> $ ls -l sys/kernel/security/
> total 0
> lr--r--r--. 1 nobody nobody 0 Nov 27 06:44 ima -> integrity/ima
> drwxr-xr-x. 3 nobody nobody 0 Nov 27 06:44 integrity
> 
> $ ls -l sys/kernel/security/ima/
> total 0
> -r--r-----. 1 root root 0 Nov 27 06:44 ascii_runtime_measurements
> -r--r-----. 1 root root 0 Nov 27 06:44 binary_runtime_measurements
> -rw-------. 1 root root 0 Nov 27 06:44 policy
> -r--r-----. 1 root root 0 Nov 27 06:44 runtime_measurements_count
> -r--r-----. 1 root root 0 Nov 27 06:44 violations
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>  include/linux/ima.h                      |  17 +++
>  security/integrity/ima/ima.h             |   2 +
>  security/integrity/ima/ima_fs.c          | 178 ++++++++++++++++++++++-
>  security/integrity/ima/ima_init_ima_ns.c |   6 +-
>  security/integrity/ima/ima_ns.c          |   4 +-
>  5 files changed, 203 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/ima.h b/include/linux/ima.h
> index fe08919df326..a2c5e516f706 100644
> --- a/include/linux/ima.h
> +++ b/include/linux/ima.h
> @@ -221,6 +221,18 @@ struct ima_h_table {
>  	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
>  };
>  
> +enum {
> +	IMAFS_DENTRY_INTEGRITY_DIR = 0,
> +	IMAFS_DENTRY_DIR,
> +	IMAFS_DENTRY_SYMLINK,
> +	IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS,
> +	IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS,
> +	IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT,
> +	IMAFS_DENTRY_VIOLATIONS,
> +	IMAFS_DENTRY_IMA_POLICY,
> +	IMAFS_DENTRY_LAST
> +};
> +
>  struct ima_namespace {
>  	struct kref kref;
>  	struct user_namespace *user_ns;
> @@ -267,6 +279,11 @@ struct ima_namespace {
>  	struct mutex ima_write_mutex;
>  	unsigned long ima_fs_flags;
>  	int valid_policy;
> +
> +	struct dentry *dentry[IMAFS_DENTRY_LAST];
> +	struct vfsmount *mount;
> +	int mount_count;
> +	bool file_ownership_fixes_done;
>  };
>  
>  extern struct ima_namespace init_ima_ns;
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index bb9763cd5fb1..9bcd71bb716c 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -139,6 +139,8 @@ struct ns_status {
>  /* Internal IMA function definitions */
>  int ima_init(void);
>  int ima_fs_init(void);
> +int ima_fs_ns_init(struct ima_namespace *ns);
> +void ima_fs_ns_free(struct ima_namespace *ns);
>  int ima_add_template_entry(struct ima_namespace *ns,
>  			   struct ima_template_entry *entry, int violation,
>  			   const char *op, struct inode *inode,
> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> index 6766bb8262f2..9a14be520268 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -22,6 +22,7 @@
>  #include <linux/parser.h>
>  #include <linux/vmalloc.h>
>  #include <linux/ima.h>
> +#include <linux/namei.h>
>  
>  #include "ima.h"
>  
> @@ -436,8 +437,13 @@ static int ima_release_policy(struct inode *inode, struct file *file)
>  
>  	ima_update_policy(ns);
>  #if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
> -	securityfs_remove(ima_policy);
> -	ima_policy = NULL;
> +	if (ns == &init_ima_ns) {
> +		securityfs_remove(ima_policy);
> +		ima_policy = NULL;
> +	} else {
> +		securityfs_ns_remove(ns->dentry[IMAFS_DENTRY_POLICY]);
> +		ns->dentry[IMAFS_DENTRY_POLICY] = NULL;
> +	}
>  #elif defined(CONFIG_IMA_WRITE_POLICY)
>  	clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
>  #elif defined(CONFIG_IMA_READ_POLICY)
> @@ -509,3 +515,171 @@ int __init ima_fs_init(void)
>  	securityfs_remove(ima_policy);
>  	return -1;
>  }
> +
> +/*
> + * Fix the ownership (uid/gid) of the dentry's that couldn't be set at the
> + * time of their creation because the user namespace wasn't configured, yet.
> + */
> +static void ima_fs_ns_fixup_uid_gid(struct ima_namespace *ns)
> +{
> +	struct inode *inode;
> +	size_t i;
> +
> +	if (ns->file_ownership_fixes_done ||
> +	    ns->user_ns->uid_map.nr_extents == 0)
> +		return;
> +
> +	ns->file_ownership_fixes_done = true;
> +	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
> +		if (!ns->dentry[i])
> +			continue;
> +		inode = ns->dentry[i]->d_inode;
> +		inode->i_uid = make_kuid(ns->user_ns, 0);
> +		inode->i_gid = make_kgid(ns->user_ns, 0);
> +	}
> +}
> +
> +/* Fix the permissions when a file is opened */
> +int ima_fs_ns_permission(struct user_namespace *mnt_userns, struct inode *inode,
> +			 int mask)
> +{
> +	ima_fs_ns_fixup_uid_gid(get_current_ns());

As noted later in the thread, if this is required it means something is
buggy in the current code. This fixup shouldn't be needed.

I think there's a more fundamental issue here. The correct way to do all
this would be to restructure securityfs, at least in how it works inside
of user namespaces. Currently, securityfs works like debugfs: a single
shared superblock that is pinned when an inode is created and unpinned
when it is removed, via:

	simple_pin_fs(&fs_type, &mount, &mount_count);
	simple_release_fs(&mount, &mount_count);

and each mount surfaces the same superblock. Ideally, making securityfs
mountable inside of user namespaces should get you a new superblock.
Functions that create files for the ima ns would then be called inside
->fill_super etc.
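
For readers less familiar with the pin/unpin pattern, here is a small
userspace model of it. The model_* names are invented for this sketch and
it leaves out the locking the real helpers do; it only shows that the
first pin creates the one shared mount, later pins reuse it, and the last
release tears it down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vfsmount; only its identity matters. */
struct model_mount {
	int unused;
};

/* Pin the shared mount: create it on first use, then bump the pin count. */
int model_pin_fs(struct model_mount **mount, int *count)
{
	assert(mount && count);
	if (!*mount) {
		*mount = calloc(1, sizeof(**mount));
		if (!*mount)
			return -1;
	}
	++*count;
	return 0;
}

/* Drop one pin; the last release frees the shared mount. */
void model_release_fs(struct model_mount **mount, int *count)
{
	assert(mount && count && *count > 0);
	if (--*count == 0) {
		free(*mount);
		*mount = NULL;
	}
}
```

Every caller ends up looking at the same mount object, which is the
property that does not fit a per-user-namespace view of the filesystem.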

* Re: [RFC 13/20] securityfs: Build securityfs_ns for namespacing support
  2021-11-30 16:06 ` [RFC 13/20] securityfs: Build securityfs_ns for namespacing support Stefan Berger
@ 2021-12-02 13:35   ` Christian Brauner
  2021-12-02 13:47     ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: Christian Brauner @ 2021-12-02 13:35 UTC (permalink / raw)
  To: Stefan Berger
  Cc: linux-integrity, zohar, serge, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Tue, Nov 30, 2021 at 11:06:47AM -0500, Stefan Berger wrote:
> Implement 'securityfs_ns' for support of IMA namespacing so that each
> IMA (user) namespace can have its own front-end for showing the currently
> active policy, the measurement list, number of violations and so on. This
> filesystem shares much of the existing code of SecurityFS but requires a
> new API call securityfs_ns_create_mount() for creating a new instance.
> 
> The API calls of securityfs_ns have the prefix securityfs_ns_ and take
> additional parameters struct vfsmount * and mount_count that allow for
> multiple instances of this filesystem to exist.
> 
> The filesystem can be mounted to the usual securityfs mount point like
> this:
> 
> mount -t securityfs_ns /sys/kernel/security /sys/kernel/security
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>  include/linux/security.h   |  18 ++++
>  include/uapi/linux/magic.h |   1 +
>  security/inode.c           | 197 +++++++++++++++++++++++++++++++++++--
>  3 files changed, 210 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 7e0ba63b5dde..8e479266f544 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1929,6 +1929,24 @@ struct dentry *securityfs_create_symlink(const char *name,
>  					 const struct inode_operations *iops);
>  extern void securityfs_remove(struct dentry *dentry);
>  
> +extern struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
> +						struct dentry *parent, void *data,
> +						const struct file_operations *fops,
> +						const struct inode_operations *iops,
> +						struct vfsmount **mount, int *mount_count);
> +extern struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
> +					       const struct inode_operations *iops,
> +					       struct vfsmount **mount, int *mount_count);
> +struct dentry *securityfs_ns_create_symlink(const char *name,
> +					    struct dentry *parent,
> +					    const char *target,
> +					    const struct inode_operations *iops,
> +					    struct vfsmount **mount, int *mount_count);
> +extern void securityfs_ns_remove(struct dentry *dentry,
> +				 struct vfsmount **mount, int *mount_count);
> +struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns);
> +extern struct vfsmount *securityfs_ns_mount;
> +
>  #else /* CONFIG_SECURITYFS */
>  
>  static inline struct dentry *securityfs_create_dir(const char *name,
> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> index 35687dcb1a42..5c1cc6088dd2 100644
> --- a/include/uapi/linux/magic.h
> +++ b/include/uapi/linux/magic.h
> @@ -11,6 +11,7 @@
>  #define CRAMFS_MAGIC_WEND	0x453dcd28	/* magic number with the wrong endianess */
>  #define DEBUGFS_MAGIC          0x64626720
>  #define SECURITYFS_MAGIC	0x73636673
> +#define SECURITYFS_NS_MAGIC	0x73334473
>  #define SELINUX_MAGIC		0xf97cff8c
>  #define SMACK_MAGIC		0x43415d53	/* "SMAC" */
>  #define RAMFS_MAGIC		0x858458f6	/* some random number */
> diff --git a/security/inode.c b/security/inode.c
> index 429744ff4ab3..8077d1f31489 100644
> --- a/security/inode.c
> +++ b/security/inode.c
> @@ -21,6 +21,7 @@
>  #include <linux/security.h>
>  #include <linux/lsm_hooks.h>
>  #include <linux/magic.h>
> +#include <linux/user_namespace.h>
>  
>  static struct vfsmount *securityfs_mount;
>  static int securityfs_mount_count;
> @@ -73,6 +74,61 @@ static struct file_system_type securityfs_type = {
>  	.kill_sb =	kill_litter_super,
>  };
>  
> +static int securityfs_ns_fill_super(struct super_block *sb, struct fs_context *fc)
> +{
> +	static const struct tree_descr files[] = {{""}};
> +	int error;
> +
> +	error = simple_fill_super(sb, SECURITYFS_NS_MAGIC, files);
> +	if (error)
> +		return error;
> +
> +	sb->s_op = &securityfs_super_operations;
> +
> +	return 0;
> +}
> +
> +static int securityfs_ns_get_tree(struct fs_context *fc)
> +{
> +	return get_tree_keyed(fc, securityfs_ns_fill_super, fc->user_ns);
> +}
> +
> +static const struct fs_context_operations securityfs_ns_context_ops = {
> +	.get_tree	= securityfs_ns_get_tree,
> +};
> +
> +static int securityfs_ns_init_fs_context(struct fs_context *fc)
> +{
> +	fc->ops = &securityfs_ns_context_ops;
> +	return 0;
> +}
> +
> +static struct file_system_type securityfs_ns_type = {
> +	.owner			= THIS_MODULE,
> +	.name			= "securityfs_ns",
> +	.init_fs_context	= securityfs_ns_init_fs_context,
> +	.kill_sb		= kill_litter_super,
> +	.fs_flags		= FS_USERNS_MOUNT,
> +};
> +
> +struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns)
> +{
> +	struct fs_context *fc;
> +	struct vfsmount *mnt;
> +
> +	fc = fs_context_for_mount(&securityfs_ns_type, SB_KERNMOUNT);
> +	if (IS_ERR(fc))
> +		return ERR_CAST(fc);
> +
> +	put_user_ns(fc->user_ns);
> +	fc->user_ns = get_user_ns(user_ns);
> +
> +	mnt = fc_mount(fc);
> +	put_fs_context(fc);
> +	return mnt;
> +}
> +
> +
>  /**
>   * securityfs_create_dentry - create a dentry in the securityfs filesystem
>   *
> @@ -155,8 +211,8 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
>  	inode->i_private = data;
>  	if (S_ISDIR(mode)) {
> -		inode->i_op = &simple_dir_inode_operations;
> -		inode->i_fop = &simple_dir_operations;
> +		inode->i_op = iops ? iops : &simple_dir_inode_operations;
> +		inode->i_fop = fops ? fops : &simple_dir_operations;
>  		inc_nlink(inode);
>  		inc_nlink(dir);
>  	} else if (S_ISLNK(mode)) {
> @@ -214,6 +270,41 @@ struct dentry *securityfs_create_file(const char *name, umode_t mode,
>  }
>  EXPORT_SYMBOL_GPL(securityfs_create_file);
>  
> +/**
> + * securityfs_ns_create_file - create a file in the securityfs_ns filesystem
> + *
> + * @name: a pointer to a string containing the name of the file to create.
> + * @mode: the permission that the file should have
> + * @parent: a pointer to the parent dentry for this file.  This should be a
> + *          directory dentry if set.  If this parameter is %NULL, then the
> + *          file will be created in the root of the securityfs_ns filesystem.
> + * @data: a pointer to something that the caller will want to get to later
> + *        on.  The inode.i_private pointer will point to this value on
> + *        the open() call.
> + * @fops: a pointer to a struct file_operations that should be used for
> + *        this file.
> + * @mount: Pointer to a pointer to an existing vfsmount
> + * @mount_count: The mount_count that goes along with the @mount
> + *
> + * This function creates a file in securityfs_ns with the given @name.
> + *
> + * This function returns a pointer to a dentry if it succeeds.  This
> + * pointer must be passed to the securityfs_ns_remove() function when the file
> + * is to be removed (no automatic cleanup happens if your module is unloaded,
> + * you are responsible here).  If an error occurs, the function will return
> + * the error value (via ERR_PTR).
> + */
> +struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
> +					 struct dentry *parent, void *data,
> +					 const struct file_operations *fops,
> +					 const struct inode_operations *iops,
> +					 struct vfsmount **mount, int *mount_count)
> +{
> +	return securityfs_create_dentry(name, mode, parent, data, fops, iops,
> +					&securityfs_ns_type, mount, mount_count);
> +}
> +EXPORT_SYMBOL_GPL(securityfs_ns_create_file);
> +
>  /**
>   * securityfs_create_dir - create a directory in the securityfs filesystem
>   *
> @@ -240,6 +331,34 @@ struct dentry *securityfs_create_dir(const char *name, struct dentry *parent)
>  }
>  EXPORT_SYMBOL_GPL(securityfs_create_dir);
>  
> +/**
> + * securityfs_ns_create_dir - create a directory in the securityfs_ns filesystem
> + *
> + * @name: a pointer to a string containing the name of the directory to
> + *        create.
> + * @parent: a pointer to the parent dentry for this file.  This should be a
> + *          directory dentry if set.  If this parameter is %NULL, then the
> + *          directory will be created in the root of the securityfs_ns filesystem.
> + * @mount: Pointer to a pointer to an existing vfsmount
> + * @mount_count: The mount_count that goes along with the @mount
> + *
> + * This function creates a directory in securityfs_ns with the given @name.
> + *
> + * This function returns a pointer to a dentry if it succeeds.  This
> + * pointer must be passed to the securityfs_ns_remove() function when the file
> + * is to be removed (no automatic cleanup happens if your module is unloaded,
> + * you are responsible here).  If an error occurs, the function will return
> + * the error value (via ERR_PTR).
> + */
> +struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
> +					const struct inode_operations *iops,
> +					struct vfsmount **mount, int *mount_count)
> +{
> +	return securityfs_ns_create_file(name, S_IFDIR | 0755, parent, NULL, NULL,
> +					 iops, mount, mount_count);
> +}
> +EXPORT_SYMBOL_GPL(securityfs_ns_create_dir);
> +
>  struct dentry *_securityfs_create_symlink(const char *name,
>  					  struct dentry *parent,
>  					  const char *target,
> @@ -263,6 +382,7 @@ struct dentry *_securityfs_create_symlink(const char *name,
>  
>  	return dent;
>  }
> +
>  /**
>   * securityfs_create_symlink - create a symlink in the securityfs filesystem
>   *
> @@ -300,6 +420,42 @@ struct dentry *securityfs_create_symlink(const char *name,
>  }
>  EXPORT_SYMBOL_GPL(securityfs_create_symlink);
>  
> +/**
> + * securityfs_ns_create_symlink - create a symlink in the securityfs_ns filesystem
> + *
> + * @name: a pointer to a string containing the name of the symlink to
> + *        create.
> + * @parent: a pointer to the parent dentry for the symlink.  This should be a
> + *          directory dentry if set.  If this parameter is %NULL, then the
> + *          directory will be created in the root of the securityfs_ns filesystem.
> + * @target: a pointer to a string containing the name of the symlink's target.
> + *          If this parameter is %NULL, then the @iops parameter needs to be
> + *          setup to handle .readlink and .get_link inode_operations.
> + * @iops: a pointer to the struct inode_operations to use for the symlink. If
> + *        this parameter is %NULL, then the default simple_symlink_inode
> + *        operations will be used.
> + * @mount: Pointer to a pointer to an existing vfsmount
> + * @mount_count: The mount_count that goes along with the @mount
> + *
> + * This function creates a symlink in securityfs_ns with the given @name.
> + *
> + * This function returns a pointer to a dentry if it succeeds.  This
> + * pointer must be passed to the securityfs_ns_remove() function when the file
> + * is to be removed (no automatic cleanup happens if your module is unloaded,
> + * you are responsible here).  If an error occurs, the function will return
> + * the error value (via ERR_PTR).
> + */
> +struct dentry *securityfs_ns_create_symlink(const char *name,
> +					    struct dentry *parent,
> +					    const char *target,
> +					    const struct inode_operations *iops,
> +					    struct vfsmount **mount, int *mount_count)
> +{
> +	return _securityfs_create_symlink(name, parent, target, iops,
> +					  &securityfs_ns_type, mount, mount_count);
> +}
> +EXPORT_SYMBOL_GPL(securityfs_ns_create_symlink);
> +
>  void _securityfs_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
>  {
>  	struct inode *dir;
> @@ -340,6 +496,27 @@ void securityfs_remove(struct dentry *dentry)
>  
>  EXPORT_SYMBOL_GPL(securityfs_remove);
>  
> +/**
> + * securityfs_ns_remove - removes a file or directory from the securityfs_ns filesystem
> + *
> + * @dentry: a pointer to the dentry of the file or directory to be removed.
> + * @mount: Pointer to a pointer to an existing vfsmount
> + * @mount_count: The mount_count that goes along with the @mount
> + *
> + * This function removes a file or directory in securityfs_ns that was previously
> + * created with a call to another securityfs_ns function (like
> + * securityfs_ns_create_file() or variants thereof.)
> + *
> + * This function is required to be called in order for the file to be
> + * removed. No automatic cleanup of files will happen when a module is
> + * removed; you are responsible here.
> + */
> +void securityfs_ns_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
> +{
> +	_securityfs_remove(dentry, mount, mount_count);
> +}
> +EXPORT_SYMBOL_GPL(securityfs_ns_remove);
> +
>  #ifdef CONFIG_SECURITY
>  static struct dentry *lsm_dentry;
>  static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
> @@ -364,14 +541,22 @@ static int __init securityfs_init(void)
>  		return retval;
>  
>  	retval = register_filesystem(&securityfs_type);
> -	if (retval) {
> -		sysfs_remove_mount_point(kernel_kobj, "security");
> -		return retval;
> -	}
> +	if (retval)
> +		goto remove_mount;
> +	retval = register_filesystem(&securityfs_ns_type);
> +	if (retval)
> +		goto unregister_filesystem;

So you're introducing a new filesystem type securityfs_ns. I think that's
simply wrong and feels like a hack. What issues did you run into when
trying to convert the existing securityfs itself?

I see no immediate reason why a get_tree_keyed() conversion for
securityfs wouldn't work, even if the debugfs-style pin/unpin logic is
kept for the securityfs instance mounted in the initial userns.
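
As a rough illustration of the keyed behavior being relied on, here is a
userspace model. The model_* names, the fixed-size table, and the use of
a plain pointer as the key are assumptions of this sketch, not the VFS
implementation. The point it shows is one superblock per key (here, per
user namespace), with fill_super run only when a new one is created.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SB 8

/* Hypothetical stand-in for a superblock. */
struct model_sb {
	const void *key;  /* plays the role of the per-superblock key */
	int fill_calls;   /* how often fill_super ran for this superblock */
};

static struct model_sb model_sbs[MODEL_MAX_SB];
static size_t model_nr_sbs;

/* Example fill_super: this is where per-namespace files would be created. */
int model_fill_super(struct model_sb *sb)
{
	sb->fill_calls++;
	return 0;
}

/* Return the existing superblock for @key, or create and fill a new one. */
struct model_sb *model_get_tree_keyed(const void *key,
				      int (*fill_super)(struct model_sb *sb))
{
	struct model_sb *sb;
	size_t i;

	assert(key && fill_super);
	for (i = 0; i < model_nr_sbs; i++)
		if (model_sbs[i].key == key)
			return &model_sbs[i];

	if (model_nr_sbs == MODEL_MAX_SB)
		return NULL;

	sb = &model_sbs[model_nr_sbs];
	sb->key = key;
	sb->fill_calls = 0;
	if (fill_super(sb) != 0)
		return NULL;
	model_nr_sbs++;
	return sb;
}
```

Mounting twice with the same key hands back the same superblock, while a
different key gets a fresh one, which is the per-userns behavior a
converted securityfs would want.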

* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 12:46   ` James Bottomley
@ 2021-12-02 13:41     ` Stefan Berger
  2021-12-02 16:29       ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 13:41 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 07:46, James Bottomley wrote:
> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>> Move measurement list related variables into the ima_namespace. This way
>> a front-end like SecurityFS can show the measurement list inside an IMA
>> namespace.
>>
>> Implement ima_free_measurements() to free a list of measurements
>> and call it when an IMA namespace is deleted.
> This one worries me quite a lot.  What seems to be happening in this
> code:
>
>> @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
>>          qe->entry = entry;
>>   
>>          INIT_LIST_HEAD(&qe->later);
>> -       list_add_tail_rcu(&qe->later, &ima_measurements);
>> +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
>>   
>>          atomic_long_inc(&ns->ima_htable.len);
>>          if (update_htable) {
>>
> is that we now only add the measurements to the namespace list, but
> that list is freed when the namespace dies.  However, the measurement
> is still extended through the PCRs meaning we have incomplete
> information for a replay after the namespace dies?

*Not at all.* The measurement list of the namespace is independent of 
the host.

The cover letter states:

"The following lines added to a suitable IMA policy on the host would
cause the execution of the commands inside the container (by uid 1000)
to be measured and audited as well on the host, thus leading to two
auditing messages for the 'busybox cat' above and log entries in IMA's
system log.

echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
         "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
     > /sys/kernel/security/ima/policy   "

So even now, with only auditing support in the namespace, you would get 
measurements in the host log with an appropriately written IMA policy. 
The measurements in the host log won't go away when the namespace dies.

The intention is to provide flexibility that allows for writing the IMA 
policy of the host in such a way

- that file accesses occurring in namespaces get measured on the host

- that file accesses occurring in the namespace do NOT get measured on
the host, which protects the host log from growing without bound and
from being grown intentionally by actions in namespaces

There would be a namespace policy that would allow for logging inside 
the namespace. Combine this with the policy on the host and you can have 
no measurements of the namespace file access, measurements in either the 
host log or the namespace log or both. What I would be worried about is 
if the flexibility wasn't there.
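
The combinations described above can be sketched as a small userspace
model in C. This is only an illustration under stated assumptions: the
names toy_policy, toy_log_target and toy_measure_targets() are
hypothetical, each policy is reduced to a single "measure executions"
flag, and measurement logging inside the namespace is not part of this
series (only auditing is). The sketch shows that the host policy and the
namespace policy are evaluated independently, which yields the four
outcomes: no log, host log only, namespace log only, or both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit flags naming the logs a measurement may land in. */
enum toy_log_target {
	TOY_LOG_NONE = 0,
	TOY_LOG_HOST = 1 << 0,
	TOY_LOG_NS   = 1 << 1,
};

/* A policy reduced to one rule: measure file executions or not. */
struct toy_policy {
	bool measure_exec;
};

/*
 * Decide where an execution inside a namespace is measured. The two
 * policies do not influence each other in this model.
 */
unsigned int toy_measure_targets(const struct toy_policy *host,
				 const struct toy_policy *ns)
{
	unsigned int targets = TOY_LOG_NONE;

	assert(host != NULL && ns != NULL);

	if (host->measure_exec)
		targets |= TOY_LOG_HOST;
	if (ns->measure_exec)
		targets |= TOY_LOG_NS;

	return targets;
}
```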


>
> I tend to think the way this should work is that until we have a way of
> attesting inside the namespace, all measurements should go into the
> physical log, so that replay is always complete for the PCRs, so
> effectively the visible log of the namespace would always have to be a
> subset of the physical log.

Per the cover letter description this is already possible today.

    Stefan


>
> James
>
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 13/20] securityfs: Build securityfs_ns for namespacing support
  2021-12-02 13:35   ` Christian Brauner
@ 2021-12-02 13:47     ` Stefan Berger
  0 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 13:47 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-integrity, zohar, serge, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 08:35, Christian Brauner wrote:
> On Tue, Nov 30, 2021 at 11:06:47AM -0500, Stefan Berger wrote:
>> Implement 'securityfs_ns' for support of IMA namespacing so that each
>> IMA (user) namespace can have its own front-end for showing the currently
>> active policy, the measurement list, number of violations and so on. This
>> filesystem shares much of the existing code of SecurityFS but requires a
>> new API call securityfs_ns_create_mount() for creating a new instance.
>>
>> The API calls of securityfs_ns have the prefix securityfs_ns_ and take
>> additional parameters struct vfsmount * and mount_count that allow for
>> multiple instances of this filesystem to exist.
>>
>> The filesystem can be mounted to the usual securityfs mount point like
>> this:
>>
>> mount -t securityfs_ns /sys/kernel/security /sys/kernel/security
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>> ---
>>   include/linux/security.h   |  18 ++++
>>   include/uapi/linux/magic.h |   1 +
>>   security/inode.c           | 197 +++++++++++++++++++++++++++++++++++--
>>   3 files changed, 210 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index 7e0ba63b5dde..8e479266f544 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -1929,6 +1929,24 @@ struct dentry *securityfs_create_symlink(const char *name,
>>   					 const struct inode_operations *iops);
>>   extern void securityfs_remove(struct dentry *dentry);
>>   
>> +extern struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
>> +						struct dentry *parent, void *data,
>> +						const struct file_operations *fops,
>> +						const struct inode_operations *iops,
>> +						struct vfsmount **mount, int *mount_count);
>> +extern struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
>> +					       const struct inode_operations *iops,
>> +					       struct vfsmount **mount, int *mount_count);
>> +struct dentry *securityfs_ns_create_symlink(const char *name,
>> +					    struct dentry *parent,
>> +					    const char *target,
>> +					    const struct inode_operations *iops,
>> +					    struct vfsmount **mount, int *mount_count);
>> +extern void securityfs_ns_remove(struct dentry *dentry,
>> +				 struct vfsmount **mount, int *mount_count);
>> +struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns);
>> +extern struct vfsmount *securityfs_ns_mount;
>> +
>>   #else /* CONFIG_SECURITYFS */
>>   
>>   static inline struct dentry *securityfs_create_dir(const char *name,
>> diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
>> index 35687dcb1a42..5c1cc6088dd2 100644
>> --- a/include/uapi/linux/magic.h
>> +++ b/include/uapi/linux/magic.h
>> @@ -11,6 +11,7 @@
>>   #define CRAMFS_MAGIC_WEND	0x453dcd28	/* magic number with the wrong endianess */
>>   #define DEBUGFS_MAGIC          0x64626720
>>   #define SECURITYFS_MAGIC	0x73636673
>> +#define SECURITYFS_NS_MAGIC	0x73334473
>>   #define SELINUX_MAGIC		0xf97cff8c
>>   #define SMACK_MAGIC		0x43415d53	/* "SMAC" */
>>   #define RAMFS_MAGIC		0x858458f6	/* some random number */
>> diff --git a/security/inode.c b/security/inode.c
>> index 429744ff4ab3..8077d1f31489 100644
>> --- a/security/inode.c
>> +++ b/security/inode.c
>> @@ -21,6 +21,7 @@
>>   #include <linux/security.h>
>>   #include <linux/lsm_hooks.h>
>>   #include <linux/magic.h>
>> +#include <linux/user_namespace.h>
>>   
>>   static struct vfsmount *securityfs_mount;
>>   static int securityfs_mount_count;
>> @@ -73,6 +74,61 @@ static struct file_system_type securityfs_type = {
>>   	.kill_sb =	kill_litter_super,
>>   };
>>   
>> +static int securityfs_ns_fill_super(struct super_block *sb, struct fs_context *fc)
>> +{
>> +	static const struct tree_descr files[] = {{""}};
>> +	int error;
>> +
>> +	error = simple_fill_super(sb, SECURITYFS_NS_MAGIC, files);
>> +	if (error)
>> +		return error;
>> +
>> +	sb->s_op = &securityfs_super_operations;
>> +
>> +	return 0;
>> +}
>> +
>> +static int securityfs_ns_get_tree(struct fs_context *fc)
>> +{
>> +	return get_tree_keyed(fc, securityfs_ns_fill_super, fc->user_ns);
>> +}
>> +
>> +static const struct fs_context_operations securityfs_ns_context_ops = {
>> +	.get_tree	= securityfs_ns_get_tree,
>> +};
>> +
>> +static int securityfs_ns_init_fs_context(struct fs_context *fc)
>> +{
>> +	fc->ops = &securityfs_ns_context_ops;
>> +	return 0;
>> +}
>> +
>> +static struct file_system_type securityfs_ns_type = {
>> +	.owner			= THIS_MODULE,
>> +	.name			= "securityfs_ns",
>> +	.init_fs_context	= securityfs_ns_init_fs_context,
>> +	.kill_sb		= kill_litter_super,
>> +	.fs_flags		= FS_USERNS_MOUNT,
>> +};
>> +
>> +struct vfsmount *securityfs_ns_create_mount(struct user_namespace *user_ns)
>> +{
>> +	struct fs_context *fc;
>> +	struct vfsmount *mnt;
>> +
>> +	fc = fs_context_for_mount(&securityfs_ns_type, SB_KERNMOUNT);
>> +	if (IS_ERR(fc))
>> +		return ERR_CAST(fc);
>> +
>> +	put_user_ns(fc->user_ns);
>> +	fc->user_ns = get_user_ns(user_ns);
>> +
>> +	mnt = fc_mount(fc);
>> +	put_fs_context(fc);
>> +	return mnt;
>> +}
>> +
>> +
>>   /**
>>    * securityfs_create_dentry - create a dentry in the securityfs filesystem
>>    *
>> @@ -155,8 +211,8 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>>   	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
>>   	inode->i_private = data;
>>   	if (S_ISDIR(mode)) {
>> -		inode->i_op = &simple_dir_inode_operations;
>> -		inode->i_fop = &simple_dir_operations;
>> +		inode->i_op = iops ? iops : &simple_dir_inode_operations;
>> +		inode->i_fop = fops ? fops : &simple_dir_operations;
>>   		inc_nlink(inode);
>>   		inc_nlink(dir);
>>   	} else if (S_ISLNK(mode)) {
>> @@ -214,6 +270,41 @@ struct dentry *securityfs_create_file(const char *name, umode_t mode,
>>   }
>>   EXPORT_SYMBOL_GPL(securityfs_create_file);
>>   
>> +/**
>> + * securityfs_ns_create_file - create a file in the securityfs_ns filesystem
>> + *
>> + * @name: a pointer to a string containing the name of the file to create.
>> + * @mode: the permission that the file should have
>> + * @parent: a pointer to the parent dentry for this file.  This should be a
>> + *          directory dentry if set.  If this parameter is %NULL, then the
>> + *          file will be created in the root of the securityfs_ns filesystem.
>> + * @data: a pointer to something that the caller will want to get to later
>> + *        on.  The inode.i_private pointer will point to this value on
>> + *        the open() call.
>> + * @fops: a pointer to a struct file_operations that should be used for
>> + *        this file.
>> + * @mount: Pointer to a pointer to an existing vfsmount
>> + * @mount_count: The mount_count that goes along with the @mount
>> + *
>> + * This function creates a file in securityfs_ns with the given @name.
>> + *
>> + * This function returns a pointer to a dentry if it succeeds.  This
>> + * pointer must be passed to the securityfs_ns_remove() function when the file
>> + * is to be removed (no automatic cleanup happens if your module is unloaded,
>> + * you are responsible here).  If an error occurs, the function will return
>> + * the error value (via ERR_PTR).
>> + */
>> +struct dentry *securityfs_ns_create_file(const char *name, umode_t mode,
>> +					 struct dentry *parent, void *data,
>> +					 const struct file_operations *fops,
>> +					 const struct inode_operations *iops,
>> +					 struct vfsmount **mount, int *mount_count)
>> +{
>> +	return securityfs_create_dentry(name, mode, parent, data, fops, iops,
>> +					&securityfs_ns_type, mount, mount_count);
>> +}
>> +EXPORT_SYMBOL_GPL(securityfs_ns_create_file);
>> +
>>   /**
>>    * securityfs_create_dir - create a directory in the securityfs filesystem
>>    *
>> @@ -240,6 +331,34 @@ struct dentry *securityfs_create_dir(const char *name, struct dentry *parent)
>>   }
>>   EXPORT_SYMBOL_GPL(securityfs_create_dir);
>>   
>> +/**
>> + * securityfs_ns_create_dir - create a directory in the securityfs_ns filesystem
>> + *
>> + * @name: a pointer to a string containing the name of the directory to
>> + *        create.
>> + * @parent: a pointer to the parent dentry for this file.  This should be a
>> + *          directory dentry if set.  If this parameter is %NULL, then the
>> + *          directory will be created in the root of the securityfs_ns filesystem.
>> + * @mount: Pointer to a pointer to an existing vfsmount
>> + * @mount_count: The mount_count that goes along with the @mount
>> + *
>> + * This function creates a directory in securityfs_ns with the given @name.
>> + *
>> + * This function returns a pointer to a dentry if it succeeds.  This
>> + * pointer must be passed to the securityfs_ns_remove() function when the file
>> + * is to be removed (no automatic cleanup happens if your module is unloaded,
>> + * you are responsible here).  If an error occurs, the function will return
>> + * the error value (via ERR_PTR).
>> + */
>> +struct dentry *securityfs_ns_create_dir(const char *name, struct dentry *parent,
>> +					const struct inode_operations *iops,
>> +					struct vfsmount **mount, int *mount_count)
>> +{
>> +	return securityfs_ns_create_file(name, S_IFDIR | 0755, parent, NULL, NULL,
>> +					 iops, mount, mount_count);
>> +}
>> +EXPORT_SYMBOL_GPL(securityfs_ns_create_dir);
>> +
>>   struct dentry *_securityfs_create_symlink(const char *name,
>>   					  struct dentry *parent,
>>   					  const char *target,
>> @@ -263,6 +382,7 @@ struct dentry *_securityfs_create_symlink(const char *name,
>>   
>>   	return dent;
>>   }
>> +
>>   /**
>>    * securityfs_create_symlink - create a symlink in the securityfs filesystem
>>    *
>> @@ -300,6 +420,42 @@ struct dentry *securityfs_create_symlink(const char *name,
>>   }
>>   EXPORT_SYMBOL_GPL(securityfs_create_symlink);
>>   
>> +/**
>> + * securityfs_ns_create_symlink - create a symlink in the securityfs_ns filesystem
>> + *
>> + * @name: a pointer to a string containing the name of the symlink to
>> + *        create.
>> + * @parent: a pointer to the parent dentry for the symlink.  This should be a
>> + *          directory dentry if set.  If this parameter is %NULL, then the
>> + *          directory will be created in the root of the securityfs_ns filesystem.
>> + * @target: a pointer to a string containing the name of the symlink's target.
>> + *          If this parameter is %NULL, then the @iops parameter needs to be
>> + *          setup to handle .readlink and .get_link inode_operations.
>> + * @iops: a pointer to the struct inode_operations to use for the symlink. If
>> + *        this parameter is %NULL, then the default simple_symlink_inode
>> + *        operations will be used.
>> + * @mount: Pointer to a pointer to an existing vfsmount
>> + * @mount_count: The mount_count that goes along with the @mount
>> + *
>> + * This function creates a symlink in securityfs_ns with the given @name.
>> + *
>> + * This function returns a pointer to a dentry if it succeeds.  This
>> + * pointer must be passed to the securityfs_ns_remove() function when the file
>> + * is to be removed (no automatic cleanup happens if your module is unloaded,
>> + * you are responsible here).  If an error occurs, the function will return
>> + * the error value (via ERR_PTR).
>> + */
>> +struct dentry *securityfs_ns_create_symlink(const char *name,
>> +					    struct dentry *parent,
>> +					    const char *target,
>> +					    const struct inode_operations *iops,
>> +					    struct vfsmount **mount, int *mount_count)
>> +{
>> +	return _securityfs_create_symlink(name, parent, target, iops,
>> +					  &securityfs_ns_type, mount, mount_count);
>> +}
>> +EXPORT_SYMBOL_GPL(securityfs_ns_create_symlink);
>> +
>>   void _securityfs_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
>>   {
>>   	struct inode *dir;
>> @@ -340,6 +496,27 @@ void securityfs_remove(struct dentry *dentry)
>>   
>>   EXPORT_SYMBOL_GPL(securityfs_remove);
>>   
>> +/**
>> + * securityfs_ns_remove - removes a file or directory from the securityfs_ns filesystem
>> + *
>> + * @dentry: a pointer to the dentry of the file or directory to be removed.
>> + * @mount: Pointer to a pointer to an existing vfsmount
>> + * @mount_count: The mount_count that goes along with the @mount
>> + *
>> + * This function removes a file or directory in securityfs_ns that was previously
>> + * created with a call to another securityfs_ns function (like
>> + * securityfs_ns_create_file() or variants thereof.)
>> + *
>> + * This function is required to be called in order for the file to be
>> + * removed. No automatic cleanup of files will happen when a module is
>> + * removed; you are responsible here.
>> + */
>> +void securityfs_ns_remove(struct dentry *dentry, struct vfsmount **mount, int *mount_count)
>> +{
>> +	_securityfs_remove(dentry, mount, mount_count);
>> +}
>> +EXPORT_SYMBOL_GPL(securityfs_ns_remove);
>> +
>>   #ifdef CONFIG_SECURITY
>>   static struct dentry *lsm_dentry;
>>   static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
>> @@ -364,14 +541,22 @@ static int __init securityfs_init(void)
>>   		return retval;
>>   
>>   	retval = register_filesystem(&securityfs_type);
>> -	if (retval) {
>> -		sysfs_remove_mount_point(kernel_kobj, "security");
>> -		return retval;
>> -	}
>> +	if (retval)
>> +		goto remove_mount;
>> +	retval = register_filesystem(&securityfs_ns_type);
>> +	if (retval)
>> +		goto unregister_filesystem;
> So you're introducing a new filesystem type securityfs_ns. I think that's
> simply wrong and feels like a hack. What issues did you run into when
> trying to convert the existing securityfs itself?

I primarily didn't want to touch the existing securityfs, with its
existing users and its being a single-instance filesystem. So I thought
I'd create something with a new API just for namespaces that is
multi-instance capable.


>
> I see no immediate reason why a get_tree_keyed() conversion for
> securityfs wouldn't work even with the debugfs pin/unpin logic in there
> kept for the securityfs mounted in the initial userns.
Ok, let me try to convert securityfs then.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 20/20] ima: Setup securityfs_ns for IMA namespace
  2021-12-02 13:18   ` Christian Brauner
@ 2021-12-02 13:52     ` Stefan Berger
  0 siblings, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 13:52 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-integrity, zohar, serge, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jejb, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 08:18, Christian Brauner wrote:
> On Tue, Nov 30, 2021 at 11:06:54AM -0500, Stefan Berger wrote:
>> Setup securityfs_ns with symlinks, directories, and files for IMA
>> namespacing support. The same directory structure that IMA uses on the
>> host is also created for the namespacing case.
>>
>> Increment the user namespace's refcount_teardown value by '1' once
>> securityfs_ns has been successfully setup since the initialization of the
>> filesystem causes an additional reference to the user namespace to be
>> taken. The early teardown function will delete the file system and release
>> the additional reference.
>>
>> The securityfs_ns file and directory ownerships cannot be set when the
>> filesystem is setup since at this point the user namespace has not been
>> configured yet by the user and therefore the ownership mappings are not
>> available, yet. Therefore, adjust the file and directory ownerships when
>> an inode's function for determining the permissions of a file or directory
>> is accessed.
>>
>> This filesystem can now be mounted as follows:
>>
>> mount -t securityfs_ns /sys/kernel/security/ /sys/kernel/security/
>>
>> The following directories, symlinks, and files are then available.
>>
>> $ ls -l sys/kernel/security/
>> total 0
>> lr--r--r--. 1 nobody nobody 0 Nov 27 06:44 ima -> integrity/ima
>> drwxr-xr-x. 3 nobody nobody 0 Nov 27 06:44 integrity
>>
>> $ ls -l sys/kernel/security/ima/
>> total 0
>> -r--r-----. 1 root root 0 Nov 27 06:44 ascii_runtime_measurements
>> -r--r-----. 1 root root 0 Nov 27 06:44 binary_runtime_measurements
>> -rw-------. 1 root root 0 Nov 27 06:44 policy
>> -r--r-----. 1 root root 0 Nov 27 06:44 runtime_measurements_count
>> -r--r-----. 1 root root 0 Nov 27 06:44 violations
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>> ---
>>   include/linux/ima.h                      |  17 +++
>>   security/integrity/ima/ima.h             |   2 +
>>   security/integrity/ima/ima_fs.c          | 178 ++++++++++++++++++++++-
>>   security/integrity/ima/ima_init_ima_ns.c |   6 +-
>>   security/integrity/ima/ima_ns.c          |   4 +-
>>   5 files changed, 203 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/ima.h b/include/linux/ima.h
>> index fe08919df326..a2c5e516f706 100644
>> --- a/include/linux/ima.h
>> +++ b/include/linux/ima.h
>> @@ -221,6 +221,18 @@ struct ima_h_table {
>>   	struct hlist_head queue[IMA_MEASURE_HTABLE_SIZE];
>>   };
>>   
>> +enum {
>> +	IMAFS_DENTRY_INTEGRITY_DIR = 0,
>> +	IMAFS_DENTRY_DIR,
>> +	IMAFS_DENTRY_SYMLINK,
>> +	IMAFS_DENTRY_BINARY_RUNTIME_MEASUREMENTS,
>> +	IMAFS_DENTRY_ASCII_RUNTIME_MEASUREMENTS,
>> +	IMAFS_DENTRY_RUNTIME_MEASUREMENTS_COUNT,
>> +	IMAFS_DENTRY_VIOLATIONS,
>> +	IMAFS_DENTRY_IMA_POLICY,
>> +	IMAFS_DENTRY_LAST
>> +};
>> +
>>   struct ima_namespace {
>>   	struct kref kref;
>>   	struct user_namespace *user_ns;
>> @@ -267,6 +279,11 @@ struct ima_namespace {
>>   	struct mutex ima_write_mutex;
>>   	unsigned long ima_fs_flags;
>>   	int valid_policy;
>> +
>> +	struct dentry *dentry[IMAFS_DENTRY_LAST];
>> +	struct vfsmount *mount;
>> +	int mount_count;
>> +	bool file_ownership_fixes_done;
>>   };
>>   
>>   extern struct ima_namespace init_ima_ns;
>> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
>> index bb9763cd5fb1..9bcd71bb716c 100644
>> --- a/security/integrity/ima/ima.h
>> +++ b/security/integrity/ima/ima.h
>> @@ -139,6 +139,8 @@ struct ns_status {
>>   /* Internal IMA function definitions */
>>   int ima_init(void);
>>   int ima_fs_init(void);
>> +int ima_fs_ns_init(struct ima_namespace *ns);
>> +void ima_fs_ns_free(struct ima_namespace *ns);
>>   int ima_add_template_entry(struct ima_namespace *ns,
>>   			   struct ima_template_entry *entry, int violation,
>>   			   const char *op, struct inode *inode,
>> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
>> index 6766bb8262f2..9a14be520268 100644
>> --- a/security/integrity/ima/ima_fs.c
>> +++ b/security/integrity/ima/ima_fs.c
>> @@ -22,6 +22,7 @@
>>   #include <linux/parser.h>
>>   #include <linux/vmalloc.h>
>>   #include <linux/ima.h>
>> +#include <linux/namei.h>
>>   
>>   #include "ima.h"
>>   
>> @@ -436,8 +437,13 @@ static int ima_release_policy(struct inode *inode, struct file *file)
>>   
>>   	ima_update_policy(ns);
>>   #if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
>> -	securityfs_remove(ima_policy);
>> -	ima_policy = NULL;
>> +	if (ns == &init_ima_ns) {
>> +		securityfs_remove(ima_policy);
>> +		ima_policy = NULL;
>> +	} else {
>> +		securityfs_ns_remove(ns->dentry[IMAFS_DENTRY_IMA_POLICY],
>> +				     &ns->mount, &ns->mount_count);
>> +		ns->dentry[IMAFS_DENTRY_IMA_POLICY] = NULL;
>> +	}
>>   #elif defined(CONFIG_IMA_WRITE_POLICY)
>>   	clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
>>   #elif defined(CONFIG_IMA_READ_POLICY)
>> @@ -509,3 +515,171 @@ int __init ima_fs_init(void)
>>   	securityfs_remove(ima_policy);
>>   	return -1;
>>   }
>> +
>> +/*
>> + * Fix the ownership (uid/gid) of the dentry's that couldn't be set at the
>> + * time of their creation because the user namespace wasn't configured, yet.
>> + */
>> +static void ima_fs_ns_fixup_uid_gid(struct ima_namespace *ns)
>> +{
>> +	struct inode *inode;
>> +	size_t i;
>> +
>> +	if (ns->file_ownership_fixes_done ||
>> +	    ns->user_ns->uid_map.nr_extents == 0)
>> +		return;
>> +
>> +	ns->file_ownership_fixes_done = true;
>> +	for (i = 0; i < IMAFS_DENTRY_LAST; i++) {
>> +		if (!ns->dentry[i])
>> +			continue;
>> +		inode = ns->dentry[i]->d_inode;
>> +		inode->i_uid = make_kuid(ns->user_ns, 0);
>> +		inode->i_gid = make_kgid(ns->user_ns, 0);
>> +	}
>> +}
>> +
>> +/* Fix the permissions when a file is opened */
>> +int ima_fs_ns_permission(struct user_namespace *mnt_userns, struct inode *inode,
>> +			 int mask)
>> +{
>> +	ima_fs_ns_fixup_uid_gid(get_current_ns());
> As noted later in the thread if this is required it means something is
> buggy in the current code. That shouldn't be needed.
I fixed this yesterday with late initialization: 
https://lkml.org/lkml/2021/12/1/1181
>
> I think there's a more fundamental issue here. The correct way to do all
> this would be to restructure securityfs at least how it works inside of
> user namespaces. Currently, securityfs works like debugfs: a single
> shared superblock that is pinned by each new inode that is created via:
>
> 	simple_pin_fs(&fs_type, &mount, &mount_count);
> 	simple_release_fs(&mount, &mount_count);
>
> and each mount surfaces the same superblock. Ideally making securityfs
> mountable inside of user namespaces should get you a new superblock.
> Functions that create files for the ima ns would then be called inside
> ->fill_super etc.

So this would be the wrong place to do it? I moved it there because this 
is called late (upon mounting) when the configuration of the user 
namespace has completed.

static int securityfs_ns_init_fs_context(struct fs_context *fc)
{
          int rc;

          if (fc->user_ns->ima_ns->late_fs_init) {
                  rc = fc->user_ns->ima_ns->late_fs_init(fc->user_ns);
                  if (rc)
                          return rc;
          }
          fc->ops = &securityfs_ns_context_ops;
          return 0;
}


Stefan




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-02 13:01           ` Christian Brauner
@ 2021-12-02 15:58             ` Casey Schaufler
  0 siblings, 0 replies; 54+ messages in thread
From: Casey Schaufler @ 2021-12-02 15:58 UTC (permalink / raw)
  To: Christian Brauner, James Bottomley
  Cc: Stefan Berger, linux-integrity, zohar, serge, containers,
	dmitry.kasatkin, ebiederm, krzysztof.struczynski, roberto.sassu,
	mpeters, lhinds, lsturman, puiterwi, jamjoom, linux-kernel, paul,
	rgb, linux-security-module, jmorris, Denis Semakin,
	Casey Schaufler

On 12/2/2021 5:01 AM, Christian Brauner wrote:
> On Thu, Dec 02, 2021 at 01:59:55PM +0100, Christian Brauner wrote:
>> On Wed, Dec 01, 2021 at 02:29:09PM -0500, James Bottomley wrote:
>>> On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
>>>> On 12/1/21 11:58, James Bottomley wrote:
>>>>> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>>>>>> From: Denis Semakin <denis.semakin@huawei.com>
>>>>>>
>>>>>> Use integrity_admin_ns_capable() to check corresponding
>>>>>> capability to allow read/write IMA policy without CAP_SYS_ADMIN
>>>>>> but with CAP_INTEGRITY_ADMIN.
>>>>>>
>>>>>> Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
>>>>>> ---
>>>>>>    security/integrity/ima/ima_fs.c | 2 +-
>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
>>>>>> index fd2798f2d224..6766bb8262f2 100644
>>>>>> --- a/security/integrity/ima/ima_fs.c
>>>>>> +++ b/security/integrity/ima/ima_fs.c
>>>>>> @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
>>>>>>    #else
>>>>>>    		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
>>>>>>    			return -EACCES;
>>>>>> -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
>>>>>> +		if (!integrity_admin_ns_capable(ns->user_ns))
>>>>> so this one is basically replacing what you did in RFC 16/20, which
>>>>> seems a little redundant.
>>>>>
>>>>> The question I'd like to ask is: is there still a reason for
>>>>> needing CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty
>>>>> much tied to requiring a user (and a mount, because of
>>>>> securityfs_ns) namespace, there might not be a pressing need for an
>>>>> admin capability separated from CAP_SYS_ADMIN because the owner of
>>>>> the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
>>>>> check.  The rationale in
>>>> Casey suggested using CAP_MAC_ADMIN, which I think would also work.
>>>>
>>>>       CAP_MAC_ADMIN (since Linux 2.6.25)
>>>>                 Allow MAC configuration or state changes. Implemented for
>>>>                 the Smack Linux Security Module (LSM).
>>>>
>>>>
>>>> Down the road I think we should cover setting file extended
>>>> attributes with the same capability as well for when a user signs
>>>> files or installs packages with file signatures.  A container runtime
>>>> could hold CAP_SYS_ADMIN while setting up a container and mounting
>>>> filesystems and drop it for the first process started there. Since we
>>>> are using the user namespace to spawn an IMA namespace, we would then
>>>> require CAP_SYS_ADMIN to be left available so that the user can do
>>>> IMA related stuff in the container (set or append to the policy,
>>>> write file signatures). I am not sure whether that should be the case
>>>> or rather give the user something finer grained, such as
>>>> CAP_MAC_ADMIN. So, it's about granularity...

The important rationale for capabilities is separation
of privilege from user id. Granularity has always been a
contentious issue. Whether you use CAP_SYS_ADMIN or CAP_MAC_ADMIN
you are using privilege, and need to be diligent.

>>> It's possible ... any orchestration system that doesn't enter a user
>>> namespace has to strictly regulate capabilities.   I'm probably biased
>>> because I always use a user_ns so I never really had to mess with
>>> capabilities.
>>>
>>>>> https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
>>>>>
>>>>> Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
>>>>> no longer true of the user namespace owner.  It only passes the
>>>>> ns_capable() check not the capable() one, so while it does get
>>>>> CAP_SYS_ADMIN, it can only use it in a few situations which
>>>>> represent quite a power reduction already.
>>>> At least docker containers drop CAP_SYS_ADMIN.
>>> Well docker doesn't use the user_ns.  But even given that,
>>> CAP_SYS_ADMIN is always dropped for most container systems.  What
>>> happens when you enter a user namespace is the ns_capable( ...,
>>> CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
>>> in the same way it would for root.  So effectively entering a user
>>> namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
>>> unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
>>> responds only in the places in the kernel that have a ns_capable()
>>> check instead of a capable() one (most of the places you list below).
>>> This is the principle of how unprivileged containers actually work ...
>>> and the source of some of our security problems if you get back an
>>> ability to do something you shouldn't be allowed to do as an
>>> unprivileged user.
>>>
>>>>   I am not sure what the decision was based on but probably they don't
>>>> want to give the user what is not absolutely necessary, but usage of
>>>> user namespaces (with IMA namespaces) would kind of force it to be
>>>> available then to do IMA-related stuff ...
>>>>
>>>> Following this man page here
>>>> https://man7.org/linux/man-pages/man7/user_namespaces.7.html
>>>>
>>>> CAP_SYS_ADMIN in a user namespace is about
>>>>
>>>> - bind-mounting filesystems
>>>>
>>>> - mounting /proc filesystems
>>>>
>>>> - creating nested user namespaces
>>>>
>>>> - configuring UTS namespace
>>>>
>>>> - configuring whether setgroups() can be used
>>>>
>>>> - usage of setns()
>>>>
>>>>
>>>> Do we want to add '- only way of *setting up* IMA related stuff' to
>>>> this list?
>>> I don't see why not, but other container people should weigh in
>>> because, as I said, I mostly use the user namespace and unprivileged
>>> containers and don't bother with capabilities.
>> There are very few scenarios where dropping capabilities in an
>> unprivileged container makes sense. In a lot of other scenarios it is
>> just a misunderstanding of the meaning of capabilities and their
>> relationship to user namespaces. Usually, granting a full set of
>> capabilities to the payload of an unprivileged container is the right
>> thing to do. All things that are properly namespaced will check
>> capabilities in the relevant user namespace. Those that aren't will
>> check them against the initial user namespaces.
>>
>> But I do think the question of whether or not ima should go into
>> cap_sys_admin is more a question of capability semantics than it is of
>> how exactly ima is namespaced. We have agreed before that overloading
>> cap_sys_admin further isn't ideal. Often we end up rectifying that
>> mistake later. For example, how we moved stuff like criu, bpf, and perf
>> to their own capability. Now we're left with stuff like:
>>
>> static inline bool perfmon_capable(void)
>> {
>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> static inline bool bpf_capable(void)
>> {
>> 	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
>> }
>>
>> static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
>> {
>> 	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
>> 		ns_capable(ns, CAP_SYS_ADMIN);
>> }
>>
>> for the sake of adhering to legacy behavior. I think we can skip over
>> that mistake and introduce cap_sys_integrity.
> (Or under CAP_MAC_ADMIN as suggested elsewhere in the thread as I saw
> just now.)
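
For readers following the capability discussion, here is a minimal
userspace sketch of the two helper shapes being weighed: one that keeps
the CAP_SYS_ADMIN fallback (like perfmon_capable() above) and one that
skips it. Everything here is a toy model and not the code from the
series: struct ns_caps_model, model_ns_capable(), the two
integrity_admin_*() helpers and the numeric value chosen for
CAP_INTEGRITY_ADMIN are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/capability.h. */
#define CAP_SYS_ADMIN        21
#define CAP_MAC_ADMIN        33
/* Hypothetical: the number of a new capability is an assumption here. */
#define CAP_INTEGRITY_ADMIN  41

/* Toy stand-in for "capabilities a task holds relative to one user namespace". */
struct ns_caps_model {
	uint64_t effective;
};

static inline uint64_t cap_bit(int cap)
{
	assert(cap >= 0 && cap < 64);
	return UINT64_C(1) << cap;
}

/* Models ns_capable(ns, cap) as a plain bit test; the real check does more. */
static inline bool model_ns_capable(const struct ns_caps_model *ns, int cap)
{
	return (ns->effective & cap_bit(cap)) != 0;
}

/* Legacy-compatible shape: the dedicated capability OR the old catch-all. */
static inline bool integrity_admin_with_fallback(const struct ns_caps_model *ns)
{
	return model_ns_capable(ns, CAP_INTEGRITY_ADMIN) ||
	       model_ns_capable(ns, CAP_SYS_ADMIN);
}

/* "Skip over that mistake" shape: only the dedicated capability counts. */
static inline bool integrity_admin_strict(const struct ns_caps_model *ns)
{
	return model_ns_capable(ns, CAP_INTEGRITY_ADMIN);
}
```

With the strict shape, a task that holds only CAP_SYS_ADMIN in the
namespace would not pass the check, which is the behavioral difference
the fallback exists to paper over.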


* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 13:41     ` Stefan Berger
@ 2021-12-02 16:29       ` James Bottomley
  2021-12-02 16:45         ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-02 16:29 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Thu, 2021-12-02 at 08:41 -0500, Stefan Berger wrote:
> On 12/2/21 07:46, James Bottomley wrote:
> > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > Move measurement list related variables into the ima_namespace.
> > > This way a front-end like SecurityFS can show the measurement list
> > > inside an IMA namespace.
> > > 
> > > Implement ima_free_measurements() to free a list of measurements
> > > and call it when an IMA namespace is deleted.
> > This one worries me quite a lot.  What seems to be happening in this
> > code:
> > 
> > > @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
> > >          qe->entry = entry;
> > >   
> > >          INIT_LIST_HEAD(&qe->later);
> > > -       list_add_tail_rcu(&qe->later, &ima_measurements);
> > > +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
> > >   
> > >          atomic_long_inc(&ns->ima_htable.len);
> > >          if (update_htable) {
> > > 
> > is that we now only add the measurements to the namespace list, but
> > that list is freed when the namespace dies.  However, the measurement
> > is still extended through the PCRs meaning we have incomplete
> > information for a replay after the namespace dies?
> 
> *Not at all.* The measurement list of the namespace is independent of
> the host.
> 
> The cover letter states:

I get that the host can set up a policy to log everything in the
namespace, but that wasn't my question.  My question is can the guest
set up a policy to log something that doesn't go into the host log
(because the host hasn't asked for it to be logged) but extends a PCR
anyway, thus destroying the ability of the host to do log replay.

James
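
To make the replay concern concrete, here is a minimal userspace sketch
of why a PCR extend without a matching host log entry breaks log
replay. It is a toy model under stated assumptions: toy_extend() uses a
64-bit FNV-1a hash as a stand-in for the TPM's SHA-based extend
(PCR_new = H(PCR_old || digest)), and replay_log() and
log_matches_pcr() are hypothetical helper names, not IMA or TPM APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy stand-in for a PCR extend: 64-bit FNV-1a over (old PCR || digest).
 * A real TPM uses SHA-1/SHA-256 over the same concatenation.
 */
static uint64_t toy_extend(uint64_t pcr, uint64_t digest)
{
	uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
	uint64_t in[2] = { pcr, digest };

	for (size_t i = 0; i < 2; i++) {
		for (int b = 0; b < 8; b++) {
			h ^= (in[i] >> (8 * b)) & 0xff;
			h *= UINT64_C(0x100000001b3);    /* FNV prime */
		}
	}
	return h;
}

/* Replay: start from the reset value (0) and extend each logged digest in order. */
static uint64_t replay_log(const uint64_t *log, size_t n)
{
	uint64_t pcr = 0;

	for (size_t i = 0; i < n; i++)
		pcr = toy_extend(pcr, log[i]);
	return pcr;
}

/* A verifier accepts the log only if replay reproduces the quoted PCR value. */
static bool log_matches_pcr(const uint64_t *log, size_t n, uint64_t quoted_pcr)
{
	assert(log != NULL || n == 0);
	return replay_log(log, n) == quoted_pcr;
}
```

If a namespace extends the PCR with a digest that only lands in its own
list, the host's log no longer replays to the quoted value; the host log
only verifies again once that entry is part of it.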




* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 16:29       ` James Bottomley
@ 2021-12-02 16:45         ` Stefan Berger
  2021-12-02 17:44           ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 16:45 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 11:29, James Bottomley wrote:
> On Thu, 2021-12-02 at 08:41 -0500, Stefan Berger wrote:
>> On 12/2/21 07:46, James Bottomley wrote:
>>> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>>>> Move measurement list related variables into the ima_namespace.
>>>> This
>>>> way a
>>>> front-end like SecurityFS can show the measurement list inside an
>>>> IMA
>>>> namespace.
>>>>
>>>> Implement ima_free_measurements() to free a list of measurements
>>>> and call it when an IMA namespace is deleted.
>>> This one worries me quite a lot.  What seems to be happening in
>>> this
>>> code:
>>>
>>>> @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct
>>>> ima_namespace *ns,
>>>>           qe->entry = entry;
>>>>    
>>>>           INIT_LIST_HEAD(&qe->later);
>>>> -       list_add_tail_rcu(&qe->later, &ima_measurements);
>>>> +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
>>>>    
>>>>           atomic_long_inc(&ns->ima_htable.len);
>>>>           if (update_htable) {
>>>>
>>> is that we now only add the measurements to the namespace list, but
>>> that list is freed when the namespace dies.  However, the
>>> measurement
>>> is still extended through the PCRs meaning we have incomplete
>>> information for a replay after the namespace dies?
>> *Not at all.* The measurement list of the namespace is independent
>> of
>> the host.
>>
>> The cover letter states:
> I get that the host can set up a policy to log everything in the
> namespace, but that wasn't my question.  My question is can the guest
> set up a policy to log something that doesn't go into the host log
> (because the host hasn't asked for it to be logged) but extends a PCR
> anyway, thus destroying the ability of the host to do log replay.


host log goes with host TPM and vice versa

guest log goes with (optional) vTPM and vice versa

Extending the PCR of the host's TPM would require the data to be logged 
in the host log as well. So, no, it's not possible.


    Stefan


>
> James
>
>


* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 16:45         ` Stefan Berger
@ 2021-12-02 17:44           ` James Bottomley
  2021-12-02 18:03             ` Stefan Berger
  0 siblings, 1 reply; 54+ messages in thread
From: James Bottomley @ 2021-12-02 17:44 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Thu, 2021-12-02 at 11:45 -0500, Stefan Berger wrote:
> On 12/2/21 11:29, James Bottomley wrote:
> > On Thu, 2021-12-02 at 08:41 -0500, Stefan Berger wrote:
> > > On 12/2/21 07:46, James Bottomley wrote:
> > > > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > > > Move measurement list related variables into the
> > > > > ima_namespace.
> > > > > This
> > > > > way a
> > > > > front-end like SecurityFS can show the measurement list
> > > > > inside an
> > > > > IMA
> > > > > namespace.
> > > > > 
> > > > > Implement ima_free_measurements() to free a list of
> > > > > measurements
> > > > > and call it when an IMA namespace is deleted.
> > > > This one worries me quite a lot.  What seems to be happening in
> > > > this
> > > > code:
> > > > 
> > > > > @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct
> > > > > ima_namespace *ns,
> > > > >           qe->entry = entry;
> > > > >    
> > > > >           INIT_LIST_HEAD(&qe->later);
> > > > > -       list_add_tail_rcu(&qe->later, &ima_measurements);
> > > > > +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
> > > > >    
> > > > >           atomic_long_inc(&ns->ima_htable.len);
> > > > >           if (update_htable) {
> > > > > 
> > > > is that we now only add the measurements to the namespace list,
> > > > but
> > > > that list is freed when the namespace dies.  However, the
> > > > measurement
> > > > is still extended through the PCRs meaning we have incomplete
> > > > information for a replay after the namespace dies?
> > > *Not at all.* The measurement list of the namespace is
> > > independent
> > > of
> > > the host.
> > > 
> > > The cover letter states:
> > I get that the host can set up a policy to log everything in the
> > namespace, but that wasn't my question.  My question is can the
> > guest
> > set up a policy to log something that doesn't go into the host log
> > (because the host hasn't asked for it to be logged) but extends a
> > PCR
> > anyway, thus destroying the ability of the host to do log replay.
> 
> host log goes with host TPM and vice versa
> 
> guest log goes with (optional) vTPM and vice versa

But that's what doesn't seem to happen ... ima_pcr_extend isn't
virtualized and it's always called from ima_add_template_entry()
meaning the physical TPM is always extended even for a namespace only
entry.

> Extending the PCR of the host's TPM would require the data to be
> logged in the host log as well. So, no, it's not possible.

Well, exactly: if you don't have or want a vTPM per container the only
way to attest is via the physical TPM which means all entries in the
namespace must be in the host log, so the host owner can quote and
reply and they can split the attested log and give assurance to the
namespaces that their entries are correct.

James




* Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
  2021-12-02  7:16         ` Denis Semakin
  2021-12-02 12:33           ` James Bottomley
@ 2021-12-02 17:54           ` Stefan Berger
  1 sibling, 0 replies; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 17:54 UTC (permalink / raw)
  To: Denis Semakin, jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, Krzysztof Struczynski, Roberto Sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 02:16, Denis Semakin wrote:
> Obviously the main goal of adding a new capability was to avoid using
> CAP_SYS_ADMIN (IOW superuser) to manage IMA stuff; it was also about
> security granularity.  It's good if CAP_MAC_ADMIN is enough for doing
> IMA-related things (writing policies and extended attributes).
> But for me it's a little unclear how to deal with unprivileged users:
> assuming there's no CAP_INTEGRITY_ADMIN but CAP_MAC_ADMIN is set, the
> user can control any LSM (SELinux, Smack, etc.) and IMA (policies,
> xattrs). What if some systems require that a user may touch the LSMs
> but not change any IMA (integrity) things? Or that a user may set up any
> IMA policy (it's about the system integrity) and modify IMA-related
> xattrs, but is forbidden to change SELinux policies and e.g. Smack
> labels? Maybe that's an unrealistic scenario, of course, but I guess
> it's not 100% impossible.

If we can introduce a new capability I would use CAP_INTEGRITY_ADMIN, if 
not CAP_MAC_ADMIN.


    Stefan

>
> Best regards,
> Denis
>
>
> -----Original Message-----
> From: James Bottomley [mailto:jejb@linux.ibm.com]
> Sent: Wednesday, December 1, 2021 10:29 PM
> To: Stefan Berger <stefanb@linux.ibm.com>; linux-integrity@vger.kernel.org
> Cc: zohar@linux.ibm.com; serge@hallyn.com; christian.brauner@ubuntu.com; containers@lists.linux.dev; dmitry.kasatkin@gmail.com; ebiederm@xmission.com; Krzysztof Struczynski <krzysztof.struczynski@huawei.com>; Roberto Sassu <roberto.sassu@huawei.com>; mpeters@redhat.com; lhinds@redhat.com; lsturman@redhat.com; puiterwi@redhat.com; jamjoom@us.ibm.com; linux-kernel@vger.kernel.org; paul@paul-moore.com; rgb@redhat.com; linux-security-module@vger.kernel.org; jmorris@namei.org; Denis Semakin <denis.semakin@huawei.com>
> Subject: Re: [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability
>
> On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
>> On 12/1/21 11:58, James Bottomley wrote:
>>> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>>>> From: Denis Semakin <denis.semakin@huawei.com>
>>>>
>>>> Use integrity_admin_ns_capable() to check corresponding capability
>>>> to allow read/write IMA policy without CAP_SYS_ADMIN but with
>>>> CAP_INTEGRITY_ADMIN.
>>>>
>>>> Signed-off-by: Denis Semakin <denis.semakin@huawei.com>
>>>> ---
>>>>    security/integrity/ima/ima_fs.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/security/integrity/ima/ima_fs.c
>>>> b/security/integrity/ima/ima_fs.c index fd2798f2d224..6766bb8262f2
>>>> 100644
>>>> --- a/security/integrity/ima/ima_fs.c
>>>> +++ b/security/integrity/ima/ima_fs.c
>>>> @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode
>>>> *inode, struct file *filp)
>>>>    #else
>>>>    		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
>>>>    			return -EACCES;
>>>> -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
>>>> +		if (!integrity_admin_ns_capable(ns->user_ns))
>>> so this one is basically replacing what you did in RFC 16/20, which
>>> seems a little redundant.
>>>
>>> The question I'd like to ask is: is there still a reason for needing
>>> CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty much
>>> tied to requiring a user (and a mount, because of
>>> securityfs_ns) namespace, there might not be a pressing need for an
>>> admin capability separated from CAP_SYS_ADMIN because the owner of
>>> the user namespace passes the ns_capable(..., CAP_SYS_ADMIN) check.
>>> The rationale in
>> Casey suggested using CAP_MAC_ADMIN, which I think would also work.
>>
>>       CAP_MAC_ADMIN (since Linux 2.6.25)
>>                 Allow MAC configuration or state changes. Implemented
>> for
>>                 the Smack Linux Security Module (LSM).
>>
>>
>> Down the road I think we should cover setting file extended attributes
>> with the same capability as well for when a user signs files or
>> installs packages with file signatures.  A container runtime could
>> hold CAP_SYS_ADMIN while setting up a container and mounting
>> filesystems and drop it for the first process started there. Since we
>> are using the user namespace to spawn an IMA namespace, we would then
>> require CAP_SYS_ADMIN to be left available so that the user can do
>> IMA related stuff in the container (set or append to the policy, write
>> file signatures). I am not sure whether that should be the case or
>> rather give the user something finer grained, such as CAP_MAC_ADMIN.
>> So, it's about granularity...
> It's possible ... any orchestration system that doesn't enter a user
> namespace has to strictly regulate capabilities.   I'm probably biased
> because I always use a user_ns so I never really had to mess with capabilities.
>
>>> https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
>>>
>>> Is effectively "because CAP_SYS_ADMIN is too powerful" but that's no
>>> longer true of the user namespace owner.  It only passes the
>>> ns_capable() check not the capable() one, so while it does get
>>> CAP_SYS_ADMIN, it can only use it in a few situations which
>>> represent quite a power reduction already.
>> At least docker containers drop CAP_SYS_ADMIN.
> Well docker doesn't use the user_ns.  But even given that, CAP_SYS_ADMIN is always dropped for most container systems.  What happens when you enter a user namespace is the ns_capable( ...,
> CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns, in the same way it would for root.  So effectively entering a user namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that responds only in the places in the kernel that have a ns_capable() check instead of a capable() one (most of the places you list below).
> This is the principle of how unprivileged containers actually work ...
> and the source of some of our security problems if you get back an ability to do something you shouldn't be allowed to do as an unprivileged user.
>
>>   I am not sure what the decision was based on but probably they don't
>> want to give the user what is not absolutely necessary, but usage of
>> user namespaces (with IMA namespaces) would kind of force it to be
>> available then to do IMA-related stuff ...
>>
>> Following this man page here
>> https://man7.org/linux/man-pages/man7/user_namespaces.7.html
>>
>> CAP_SYS_ADMIN in a user namespace is about
>>
>> - bind-mounting filesystems
>>
>> - mounting /proc filesystems
>>
>> - creating nested user namespaces
>>
>> - configuring UTS namespace
>>
>> - configuring whether setgroups() can be used
>>
>> - usage of setns()
>>
>>
>> Do we want to add '- only way of *setting up* IMA related stuff' to
>> this list?
> I don't see why not, but other container people should weigh in because, as I said, I mostly use the user namespace and unprivileged containers and don't bother with capabilities.
>
> James
>
>
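
The ns_capable() versus capable() distinction quoted above can be
sketched in userspace as a walk up the user namespace hierarchy. This
is a simplified model of the logic in the kernel's cap_capable(), not
the kernel code itself; struct userns_model, struct cred_model,
model_ns_capable() and model_capable() are hypothetical names, and uids
are treated as kernel-global values for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_SYS_ADMIN 21

struct userns_model {
	struct userns_model *parent;  /* NULL for the initial namespace */
	int level;                    /* 0 for the initial namespace */
	unsigned int owner;           /* kernel-global uid of the creator */
};

struct cred_model {
	struct userns_model *user_ns; /* namespace the task lives in */
	unsigned int euid;            /* kernel-global effective uid */
	uint64_t cap_effective;       /* capabilities held within user_ns */
};

/* Does cred hold cap relative to target? Assumes a well-formed namespace tree. */
static bool model_ns_capable(const struct cred_model *cred,
			     const struct userns_model *target, int cap)
{
	const struct userns_model *ns = target;

	assert(cred && cred->user_ns && target);
	for (;;) {
		/* Same namespace: the task's own effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1;
		/* Target is not a descendant of the task's namespace: deny. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The owner of a child namespace holds all capabilities in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;
		ns = ns->parent;
	}
}

/* capable(cap) is the same check against the initial user namespace. */
static bool model_capable(const struct cred_model *cred,
			  const struct userns_model *init_ns, int cap)
{
	return model_ns_capable(cred, init_ns, cap);
}
```

In this model a task inside a child user namespace with a full
capability set passes the namespaced check for that namespace but fails
the check against the initial namespace, which is the power reduction
described above.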


* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 17:44           ` James Bottomley
@ 2021-12-02 18:03             ` Stefan Berger
  2021-12-02 20:03               ` James Bottomley
  0 siblings, 1 reply; 54+ messages in thread
From: Stefan Berger @ 2021-12-02 18:03 UTC (permalink / raw)
  To: jejb, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris


On 12/2/21 12:44, James Bottomley wrote:
> On Thu, 2021-12-02 at 11:45 -0500, Stefan Berger wrote:
>> On 12/2/21 11:29, James Bottomley wrote:
>>> On Thu, 2021-12-02 at 08:41 -0500, Stefan Berger wrote:
>>>> On 12/2/21 07:46, James Bottomley wrote:
>>>>> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>>>>>> Move measurement list related variables into the
>>>>>> ima_namespace.
>>>>>> This
>>>>>> way a
>>>>>> front-end like SecurityFS can show the measurement list
>>>>>> inside an
>>>>>> IMA
>>>>>> namespace.
>>>>>>
>>>>>> Implement ima_free_measurements() to free a list of
>>>>>> measurements
>>>>>> and call it when an IMA namespace is deleted.
>>>>> This one worries me quite a lot.  What seems to be happening in
>>>>> this
>>>>> code:
>>>>>
>>>>>> @@ -107,7 +100,7 @@ static int ima_add_digest_entry(struct
>>>>>> ima_namespace *ns,
>>>>>>            qe->entry = entry;
>>>>>>     
>>>>>>            INIT_LIST_HEAD(&qe->later);
>>>>>> -       list_add_tail_rcu(&qe->later, &ima_measurements);
>>>>>> +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
>>>>>>     
>>>>>>            atomic_long_inc(&ns->ima_htable.len);
>>>>>>            if (update_htable) {
>>>>>>
>>>>> is that we now only add the measurements to the namespace list,
>>>>> but
>>>>> that list is freed when the namespace dies.  However, the
>>>>> measurement
>>>>> is still extended through the PCRs meaning we have incomplete
>>>>> information for a replay after the namespace dies?
>>>> *Not at all.* The measurement list of the namespace is
>>>> independent
>>>> of
>>>> the host.
>>>>
>>>> The cover letter states:
>>> I get that the host can set up a policy to log everything in the
>>> namespace, but that wasn't my question.  My question is can the
>>> guest
>>> set up a policy to log something that doesn't go into the host log
>>> (because the host hasn't asked for it to be logged) but extends a
>>> PCR
>>> anyway, thus destroying the ability of the host to do log replay.
>> host log goes with host TPM and vice versa
>>
>> guest log goes with (optional) vTPM and vice versa
> But that's what doesn't seem to happen ... ima_pcr_extend isn't
> virtualized and it's always called from ima_add_template_entry()
> meaning the physical TPM is always extended even for a namespace only
> entry.

You cannot set a measurement rule in the namespace. That is prevented 
per 9/20: ima: Only accept AUDIT rules for IMA non-init_ima_ns 
namespaces for now.

Also, with the tests that I have done with IMA namespaces I have not 
seen any 'evmctl ima_measurement ...' failures.

Have you been able to cause the IMA namespace to do measurements? It 
would be an easy thing to move the tpm_chip into the ima_namespace as 
well, but per 9/20 this shouldn't be necessary at this point.
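
As an illustration of the restriction described for 9/20, the
following userspace sketch accepts only rules whose action is "audit"
when the namespace is not the initial one. It is not the code from the
patch: rule_action_is() and model_rule_allowed() are hypothetical
helpers, and the real policy parser is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the first whitespace-delimited token of rule is exactly action. */
static bool rule_action_is(const char *rule, const char *action)
{
	size_t len = strlen(action);

	assert(rule != NULL);
	while (*rule == ' ' || *rule == '\t')
		rule++;
	if (strncmp(rule, action, len) != 0)
		return false;
	return rule[len] == ' ' || rule[len] == '\t' || rule[len] == '\0';
}

/* Non-init namespaces may only install audit rules in this sketch. */
static bool model_rule_allowed(const char *rule, bool is_init_ns)
{
	if (is_init_ns)
		return true;   /* host namespace: no extra restriction modeled */
	return rule_action_is(rule, "audit");
}
```

Under this model the rule from the cover letter, 'audit func=BPRM_CHECK
mask=MAY_EXEC', is accepted inside a namespace while the same rule with
a 'measure' action is rejected there.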

>   
>
>> Extending the PCR of the host's TPM would require the data to be
>> logged in the host log as well. So, no, it's not possible.
> Well, exactly: if you don't have or want a vTPM per container the only
> way to attest is via the physical TPM which means all entries in the
> namespace must be in the host log, so the host owner can quote and
> reply and they can split the attested log and give assurance to the
> namespaces that their entries are correct.

Yes, this series allows you to log into the system log and along with 
this extend the TPM PCR.


>
> James
>
>


* Re: [RFC 08/20] ima: Move measurement list related variables into ima_namespace
  2021-12-02 18:03             ` Stefan Berger
@ 2021-12-02 20:03               ` James Bottomley
  0 siblings, 0 replies; 54+ messages in thread
From: James Bottomley @ 2021-12-02 20:03 UTC (permalink / raw)
  To: Stefan Berger, linux-integrity
  Cc: zohar, serge, christian.brauner, containers, dmitry.kasatkin,
	ebiederm, krzysztof.struczynski, roberto.sassu, mpeters, lhinds,
	lsturman, puiterwi, jamjoom, linux-kernel, paul, rgb,
	linux-security-module, jmorris

On Thu, 2021-12-02 at 13:03 -0500, Stefan Berger wrote:
> On 12/2/21 12:44, James Bottomley wrote:
> > On Thu, 2021-12-02 at 11:45 -0500, Stefan Berger wrote:
> > > On 12/2/21 11:29, James Bottomley wrote:
> > > > On Thu, 2021-12-02 at 08:41 -0500, Stefan Berger wrote:
> > > > > On 12/2/21 07:46, James Bottomley wrote:
> > > > > > On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
> > > > > > > Move measurement list related variables into the
> > > > > > > ima_namespace. This way a front-end like SecurityFS can
> > > > > > > show the measurement list inside an IMA
> > > > > > > namespace.
> > > > > > > 
> > > > > > > Implement ima_free_measurements() to free a list of
> > > > > > > measurements and call it when an IMA namespace is
> > > > > > > deleted.
> > > > > > This one worries me quite a lot.  What seems to be
> > > > > > happening in this code:
> > > > > > 
> > > > > > > @@ -107,7 +100,7 @@ static int
> > > > > > > ima_add_digest_entry(struct
> > > > > > > ima_namespace *ns,
> > > > > > >            qe->entry = entry;
> > > > > > >     
> > > > > > >            INIT_LIST_HEAD(&qe->later);
> > > > > > > -       list_add_tail_rcu(&qe->later, &ima_measurements);
> > > > > > > > +       list_add_tail_rcu(&qe->later, &ns->ima_measurements);
> > > > > > >     
> > > > > > >            atomic_long_inc(&ns->ima_htable.len);
> > > > > > >            if (update_htable) {
> > > > > > > 
> > > > > > is that we now only add the measurements to the namespace
> > > > > > list, but that list is freed when the namespace
> > > > > > dies.  However, the measurement is still extended through
> > > > > > the PCRs meaning we have incomplete information for a
> > > > > > replay after the namespace dies?
> > > > > *Not at all.* The measurement list of the namespace is
> > > > > independent of the host.
> > > > > 
> > > > > The cover letter states:
> > > > I get that the host can set up a policy to log everything in
> > > > the namespace, but that wasn't my question.  My question is can
> > > > the guest set up a policy to log something that doesn't go into
> > > > the host log (because the host hasn't asked for it to be
> > > > logged) but extends a PCR anyway, thus destroying the ability
> > > > of the host to do log replay.
> > > host log goes with host TPM and vice versa
> > > 
> > > guest log goes with (optional) vTPM and vice versa
> > But that's what doesn't seem to happen ... ima_pcr_extend isn't
> > virtualized and it's always called from ima_add_template_entry()
> > meaning the physical TPM is always extended even for a namespace
> > only entry.
> 
> You cannot set a measurement rule in the namespace. That is prevented
> per 9/20: ima: Only accept AUDIT rules for IMA non-init_ima_ns
> namespaces for now.

Ah, OK, so the answer is nothing ever traverses this code for the
non-root namespace, so no measurements ever get logged inside a
namespace.  Got it.

James




end of thread, other threads:[~2021-12-02 20:04 UTC | newest]

Thread overview: 54+ messages
2021-11-30 16:06 [RFC 00/20] ima: Namespace IMA with audit support in IMA-ns Stefan Berger
2021-11-30 16:06 ` [RFC 01/20] ima: Add IMA namespace support Stefan Berger
2021-11-30 16:06 ` [RFC 02/20] ima: Define ns_status for storing namespaced iint data Stefan Berger
2021-11-30 16:06 ` [RFC 03/20] ima: Namespace audit status flags Stefan Berger
2021-11-30 16:06 ` [RFC 04/20] ima: Move delayed work queue and variables into ima_namespace Stefan Berger
2021-11-30 16:06 ` [RFC 05/20] ima: Move IMA's keys queue related " Stefan Berger
2021-11-30 16:06 ` [RFC 06/20] ima: Move policy " Stefan Berger
2021-11-30 16:06 ` [RFC 07/20] ima: Move ima_htable " Stefan Berger
2021-11-30 16:06 ` [RFC 08/20] ima: Move measurement list related variables " Stefan Berger
2021-12-02 12:46   ` James Bottomley
2021-12-02 13:41     ` Stefan Berger
2021-12-02 16:29       ` James Bottomley
2021-12-02 16:45         ` Stefan Berger
2021-12-02 17:44           ` James Bottomley
2021-12-02 18:03             ` Stefan Berger
2021-12-02 20:03               ` James Bottomley
2021-11-30 16:06 ` [RFC 09/20] ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for now Stefan Berger
2021-11-30 16:06 ` [RFC 10/20] ima: Implement hierarchical processing of file accesses Stefan Berger
2021-11-30 16:06 ` [RFC 11/20] securityfs: Prefix global variables with securityfs_ Stefan Berger
2021-11-30 16:06 ` [RFC 12/20] securityfs: Pass static variables as parameters from top level functions Stefan Berger
2021-11-30 16:06 ` [RFC 13/20] securityfs: Build securityfs_ns for namespacing support Stefan Berger
2021-12-02 13:35   ` Christian Brauner
2021-12-02 13:47     ` Stefan Berger
2021-11-30 16:06 ` [RFC 14/20] ima: Move some IMA policy and filesystem related variables into ima_namespace Stefan Berger
2021-11-30 16:06 ` [RFC 15/20] capabilities: Introduce CAP_INTEGRITY_ADMIN Stefan Berger
2021-11-30 17:27   ` Casey Schaufler
2021-11-30 17:41     ` Stefan Berger
2021-11-30 17:50       ` Casey Schaufler
2021-11-30 16:06 ` [RFC 16/20] ima: Use ns_capable() for namespace policy access Stefan Berger
2021-11-30 16:06 ` [RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability Stefan Berger
2021-12-01 16:58   ` James Bottomley
2021-12-01 17:35     ` Stefan Berger
2021-12-01 19:29       ` James Bottomley
2021-12-02  7:16         ` Denis Semakin
2021-12-02 12:33           ` James Bottomley
2021-12-02 17:54           ` Stefan Berger
2021-12-02 12:59         ` Christian Brauner
2021-12-02 13:01           ` Christian Brauner
2021-12-02 15:58             ` Casey Schaufler
2021-11-30 16:06 ` [RFC 18/20] userns: Introduce a refcount variable for calling early teardown function Stefan Berger
2021-11-30 16:06 ` [RFC 19/20] ima/userns: Define early teardown function for IMA namespace Stefan Berger
2021-11-30 16:06 ` [RFC 20/20] ima: Setup securityfs_ns " Stefan Berger
2021-12-01 17:56   ` James Bottomley
2021-12-01 18:11     ` Stefan Berger
2021-12-01 19:21       ` James Bottomley
2021-12-01 20:25         ` Stefan Berger
2021-12-01 21:11           ` James Bottomley
2021-12-01 21:34             ` Stefan Berger
2021-12-01 22:01               ` James Bottomley
2021-12-01 22:09                 ` Stefan Berger
2021-12-01 22:19                   ` James Bottomley
2021-12-02  0:02                     ` Stefan Berger
2021-12-02 13:18   ` Christian Brauner
2021-12-02 13:52     ` Stefan Berger
