All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/3] Namespace IMA
@ 2021-11-27 16:45 James Bottomley
  2021-11-27 16:45 ` [RFC 1/3] userns: add uuid field James Bottomley
                   ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-27 16:45 UTC (permalink / raw)
  To: linux-integrity
  Cc: containers, Mimi Zohar, Dmitry Kasatkin, Stefan Berger,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Serge E . Hallyn, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

Over the past five years there have been several attempts to namespace
IMA [1,2,3].  All of them were eventually fairly huge patch series,
which try to solve every issue and potential issue all at once, making
them incredibly difficult to review and highly dependent on an array
of non-IMA features which causes huge upporting difficulty as the
patch progresses.  Given this, I thought we'd try a different way:
introduce a minimal namespacing of IMA and try to build on it in
subsequent patches.

This first patch set namespaces IMA by tying it to the user namespace.
The reason for this is that IMA, at some point, is going to need to do
appraisal in containers and thus ima namespaces are going to require
.ima or _ima keyrings.  Since they keyrings are already namespaced
inside user_ns, adding keyring functionality to a different namespace
is going to be problematic, so I elected to fix the problem by using
the same namespace for IMA and the keyrings.  This decision could
obviously change, but I think given the keyring dependence, we'd need
a good reason to have a different namespace for IMA.

All this patch set does is add a new template 'ima-ns' which includes
the user namespace uuid (added by the first patch) in the IMA log.
Using uuids gives us probabalistically unique identifiers for all
namespaces without having to check them for uniqueness.

Once we have the container being logged, it stands to reason that the
ima inode cache needs to record one event per namespace per inode
instead of the one global event per inode, so if I enter the ima
namespace and execute the same measured command, it will log again
with the new namespace uuid even if the hash is the same:

 > ls
 > grep ls /sys/kernel/security/integrity/ima/ascii_runtime_measurements
10 c70c7b851f83c8c71ee7b508c8468383c0d2c154 ima-ns sha256:1f7f27ef1052e33731c9ff56a36ac3af4437e3f95ad55f6813c320b087b5d356 /usr/bin/ls 6582e360-1354-42b9-a6ef-ee1993d982da
 > unshare --user -r
 # ls
 # exit
 > grep ls /sys/kernel/security/integrity/ima/ascii_runtime_measurements
 10 c70c7b851f83c8c71ee7b508c8468383c0d2c154 ima-ns sha256:1f7f27ef1052e33731c9ff56a36ac3af4437e3f95ad55f6813c320b087b5d356 /usr/bin/ls 6582e360-1354-42b9-a6ef-ee1993d982da
10 144a73d85e9cf999c4abbc99f3c41e9422c8016e ima-ns sha256:1f7f27ef1052e33731c9ff56a36ac3af4437e3f95ad55f6813c320b087b5d356 /usr/bin/ls e496e384-4133-4d57-b93a-1812b83badf2

Note that this namespacing if the iint cache is in the third patch and
could be dropped if there's huge opposition.

Some things to note are that the IMA securityfs entries aren't
virtualized.  This is planned for a follow up patch (so currently the
admin can't even view the ima log in the container).  Everything
that's logged goes through the main IMA log and the physical TPM.
This means that the admin of the physical system can attest to the
log, but the containers would have to trust the admins attestation of
their log pieces.  The initial IMA policy is also inherited from the
physical system and can't currently be changed.

The rough plan of action for follow up patches is

1. Namespace securityfs so container admin can read the IMA files like
   log which would only show entries related to the container (so
   likely only entries of the current and all child user namespaces)
   and policy.

2. Add per namespace policies by writing to the policy file in the
   container.  Obviously implementation of this would have to preserve
   the security of the system, so the new namespace couldn't stop
   logging something the physical host required to be logged, but it
   could add additional classes of things to log.

3. Add the ima keyrings and the ability to appraise inside the container.

There could be other phases beyond this, including possibly optionally
attaching a vtpm to the container to provide local quotes but this
should be need driven.

Some non problems of this approach are:

* The continuous growth of the IMA log.  This is already a problem
  with non-namespaced IMA.  One can argue that the above
  implementation makes the problem worse, but it is unarguable that if
  the problem were solved generally it would no logner be an issue for
  containers.

* attesting to the in-container IMA log.  Given it's being logged
  through the physical TPM, the physical system owner will have to
  publish a mechanism for attesting to particular container entries of
  the log.

[1] https://lore.kernel.org/all/20200818152037.11869-1-krzysztof.struczynski@huawei.com
[2] https://lore.kernel.org/all/20180511144230.75384-1-stefanb@linux.vnet.ibm.com
[3] https://lore.kernel.org/all/1494511203-8397-1-git-send-email-guilherme.magalhaes@hpe.com

James

---

James Bottomley (3):
  userns: add uuid field
  ima: Namespace IMA
  ima: make the integrity inode cache per namespace

 include/linux/ima.h                       |  14 ++-
 include/linux/user_namespace.h            |   7 ++
 kernel/user.c                             |   2 +
 kernel/user_namespace.c                   |   9 ++
 security/integrity/iint.c                 |   4 +-
 security/integrity/ima/Kconfig            |   6 +-
 security/integrity/ima/Makefile           |   2 +-
 security/integrity/ima/ima.h              |  21 +++-
 security/integrity/ima/ima_api.c          |   7 +-
 security/integrity/ima/ima_main.c         |  21 ++--
 security/integrity/ima/ima_ns.c           | 115 ++++++++++++++++++++++
 security/integrity/ima/ima_policy.c       |   2 +-
 security/integrity/ima/ima_template.c     |   6 +-
 security/integrity/ima/ima_template_lib.c |  26 ++++-
 security/integrity/ima/ima_template_lib.h |   4 +
 security/integrity/integrity.h            |  11 ++-
 16 files changed, 236 insertions(+), 21 deletions(-)
 create mode 100644 security/integrity/ima/ima_ns.c

-- 
2.33.0


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [RFC 1/3] userns: add uuid field
  2021-11-27 16:45 [RFC 0/3] Namespace IMA James Bottomley
@ 2021-11-27 16:45 ` James Bottomley
  2021-11-28  4:45   ` Serge E. Hallyn
  2021-11-27 16:45 ` [RFC 2/3] ima: Namespace IMA James Bottomley
  2021-11-27 16:45 ` [RFC 3/3] ima: make the integrity inode cache per namespace James Bottomley
  2 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-27 16:45 UTC (permalink / raw)
  To: linux-integrity
  Cc: containers, Mimi Zohar, Dmitry Kasatkin, Stefan Berger,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Serge E . Hallyn, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

As a precursor to namespacing IMA a way of uniquely identifying the
namespace to appear in the IMA log is needed.  This log may be
transported away from the running system and may be analyzed even
after the system has been rebooted.  Thus we need a way of identifying
namespaces in the log which is unique.  UUID, being designed
probabilistically never to repeat, fits this bill so add it to the
user_namespace which we'll also use for namespacing IMA.

uuid_gen() is used to create each uuid uniquely.  It feeds off the
pseudo random number generator, but this should be as unique as we
need for probabilistic non repeats without depleting the entropy
pool.  Since there is no random initializer for a uuid, this is done
in user_namespaces_init().  This should be safe because IMA is a late
initcall.

This patch contains no exposure mechanisms, and the subsequent patches
only add uuid entries in the IMA log.  However, it is not unlikely
that eventually orchestration systems will want to know what the uuid
is (to tie their container ID to the one in the IMA log), so
additional patches exposing this via NSIO and /proc/<pid>/ns could be
added.

For checkpoint/restore, the uuid should not be a property that
transports because otherwise we'll have to have a set mechanism with a
uniqueness check.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 include/linux/user_namespace.h | 2 ++
 kernel/user.c                  | 1 +
 kernel/user_namespace.c        | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 33a4240e6a6f..d155788abdc1 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -10,6 +10,7 @@
 #include <linux/rwsem.h>
 #include <linux/sysctl.h>
 #include <linux/err.h>
+#include <linux/uuid.h>
 
 #define UID_GID_MAP_MAX_BASE_EXTENTS 5
 #define UID_GID_MAP_MAX_EXTENTS 340
@@ -99,6 +100,7 @@ struct user_namespace {
 #endif
 	struct ucounts		*ucounts;
 	long ucount_max[UCOUNT_COUNTS];
+	uuid_t			uuid;
 } __randomize_layout;
 
 struct ucounts {
diff --git a/kernel/user.c b/kernel/user.c
index e2cf8c22b539..bf9ae1d0b670 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -67,6 +67,7 @@ struct user_namespace init_user_ns = {
 	.keyring_name_list = LIST_HEAD_INIT(init_user_ns.keyring_name_list),
 	.keyring_sem = __RWSEM_INITIALIZER(init_user_ns.keyring_sem),
 #endif
+	/* .uuid is initialized in user_namespaces_init() */
 };
 EXPORT_SYMBOL_GPL(init_user_ns);
 
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 6b2e3ca7ee99..8ce57c16ddd3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -141,6 +141,8 @@ int create_user_ns(struct cred *new)
 	if (!setup_userns_sysctls(ns))
 		goto fail_keyring;
 
+	uuid_gen(&ns->uuid);
+
 	set_cred_user_ns(new, ns);
 	return 0;
 fail_keyring:
@@ -1386,6 +1388,7 @@ const struct proc_ns_operations userns_operations = {
 static __init int user_namespaces_init(void)
 {
 	user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC | SLAB_ACCOUNT);
+	uuid_gen(&init_user_ns.uuid);
 	return 0;
 }
 subsys_initcall(user_namespaces_init);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [RFC 2/3] ima: Namespace IMA
  2021-11-27 16:45 [RFC 0/3] Namespace IMA James Bottomley
  2021-11-27 16:45 ` [RFC 1/3] userns: add uuid field James Bottomley
@ 2021-11-27 16:45 ` James Bottomley
  2021-11-29  2:52   ` Serge E. Hallyn
  2021-11-27 16:45 ` [RFC 3/3] ima: make the integrity inode cache per namespace James Bottomley
  2 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-27 16:45 UTC (permalink / raw)
  To: linux-integrity
  Cc: containers, Mimi Zohar, Dmitry Kasatkin, Stefan Berger,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Serge E . Hallyn, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

Use the user namespace as the basis for namespacing IMA.  The
implementation is very simple: add a new template called 'ima-ns'
which exports the uuid of the user namespace into the IMA log.  This
can be used to uniquely separate every container in the IMA log.

Note that the admin of the user namespace still cannot read back the
IMA log because the IMA securityfs entries are not yet namespace
aware.  However, root in the init_user_ns can read the log and see the
containers.

You can get the uuid of init_user_ns from the boot_aggregate entry
which is always the first one recorded in the log.  Any execution with
a different uuid is in a new IMA namespace.

A sample of the log showing entry into a container is:

10 7766c926c9db8dd4c923f96be5635b04593029c1 ima-ns sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate 6582e360-1354-42b9-a6ef-ee1993d982da
[...]
10 e0355132472d4d0ae1cc044412b4033bd5e1a48a ima-ns sha256:353e4d6b807056757fb5df31bafe7df80605bec20b445d5e9afd949ca4147d49 /usr/bin/unshare 6582e360-1354-42b9-a6ef-ee1993d982da
10 f257f5a12fd6e28d32b367a2e453c3badd0e8774 ima-ns sha256:2a7c66fc7e19acc100ee2b777b71179043fade8b81968828522cf31e6a96eaa7 /usr/bin/bash e496e384-4133-4d57-b93a-1812b83badf2
10 1bb206dbdf18f75e4515aeef378ba50e555a9291 ima-ns sha256:795fb52db49b211450c7242dbcad00d782a7b8174f669c259f74a7ccabe03a90 /usr/bin/id e496e384-4133-4d57-b93a-1812b83badf2

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 include/linux/ima.h                       |  1 +
 security/integrity/ima/Kconfig            |  6 +++++-
 security/integrity/ima/ima_template.c     |  6 +++++-
 security/integrity/ima/ima_template_lib.c | 24 ++++++++++++++++++++++-
 security/integrity/ima/ima_template_lib.h |  4 ++++
 5 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index b6ab66a546ae..09b14b73889e 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -11,6 +11,7 @@
 #include <linux/fs.h>
 #include <linux/security.h>
 #include <linux/kexec.h>
+#include <linux/user_namespace.h>
 #include <crypto/hash_info.h>
 struct linux_binprm;
 
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index f3a9cc201c8c..4f0ce241b585 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -69,7 +69,8 @@ choice
 	  hash, defined as 20 bytes, and a null terminated pathname,
 	  limited to 255 characters.  The 'ima-ng' measurement list
 	  template permits both larger hash digests and longer
-	  pathnames.
+	  pathnames.  The 'ima-ns' adds the namespace uuid to the
+	  'ima-ng' template.
 
 	config IMA_TEMPLATE
 		bool "ima"
@@ -77,6 +78,8 @@ choice
 		bool "ima-ng (default)"
 	config IMA_SIG_TEMPLATE
 		bool "ima-sig"
+	config IMA_NS_TEMPLATE
+		bool "ima-ns"
 endchoice
 
 config IMA_DEFAULT_TEMPLATE
@@ -85,6 +88,7 @@ config IMA_DEFAULT_TEMPLATE
 	default "ima" if IMA_TEMPLATE
 	default "ima-ng" if IMA_NG_TEMPLATE
 	default "ima-sig" if IMA_SIG_TEMPLATE
+	default "ima-ns" if IMA_NS_TEMPLATE
 
 choice
 	prompt "Default integrity hash algorithm"
diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
index 694560396be0..14e02eb3d0f3 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -24,6 +24,7 @@ static struct ima_template_desc builtin_templates[] = {
 	{.name = "ima-modsig", .fmt = "d-ng|n-ng|sig|d-modsig|modsig"},
 	{.name = "evm-sig",
 	 .fmt = "d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode"},
+	{.name = "ima-ns", .fmt = "d-ng|n-ng|ns"},
 	{.name = "", .fmt = ""},	/* placeholder for a custom format */
 };
 
@@ -64,6 +65,9 @@ static const struct ima_template_field supported_fields[] = {
 	{.field_id = "xattrvalues",
 	 .field_init = ima_eventinodexattrvalues_init,
 	 .field_show = ima_show_template_sig},
+	{.field_id = "ns",
+	 .field_init = ima_ns_init,
+	 .field_show = ima_show_template_uuid},
 };
 
 /*
@@ -72,7 +76,7 @@ static const struct ima_template_field supported_fields[] = {
  * description as 'd-ng' and 'n-ng' respectively.
  */
 #define MAX_TEMPLATE_NAME_LEN \
-	sizeof("d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode")
+	sizeof("d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode|ns")
 
 static struct ima_template_desc *ima_template;
 static struct ima_template_desc *ima_buf_template;
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index ca017cae73eb..ebd54c1b5206 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -26,7 +26,8 @@ enum data_formats {
 	DATA_FMT_DIGEST_WITH_ALGO,
 	DATA_FMT_STRING,
 	DATA_FMT_HEX,
-	DATA_FMT_UINT
+	DATA_FMT_UINT,
+	DATA_FMT_UUID,
 };
 
 static int ima_write_template_field_data(const void *data, const u32 datalen,
@@ -120,6 +121,9 @@ static void ima_show_template_data_ascii(struct seq_file *m,
 			break;
 		}
 		break;
+	case DATA_FMT_UUID:
+		seq_printf(m, "%pU", buf_ptr);
+		break;
 	default:
 		break;
 	}
@@ -202,6 +206,12 @@ void ima_show_template_uint(struct seq_file *m, enum ima_show_type show,
 	ima_show_template_field_data(m, show, DATA_FMT_UINT, field_data);
 }
 
+void ima_show_template_uuid(struct seq_file *m, enum ima_show_type show,
+			    struct ima_field_data *field_data)
+{
+	ima_show_template_field_data(m, show, DATA_FMT_UUID, field_data);
+}
+
 /**
  * ima_parse_buf() - Parses lengths and data from an input buffer
  * @bufstartp:       Buffer start address.
@@ -685,3 +695,15 @@ int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
 {
 	return ima_eventinodexattrs_init_common(event_data, field_data, 'v');
 }
+
+/*
+ *  ima_ns_init - include the namespace UUID as part of the template
+ *  data
+ */
+int ima_ns_init(struct ima_event_data *event_data,
+		struct ima_field_data *field_data)
+{
+	return ima_write_template_field_data(&current_user_ns()->uuid,
+					     UUID_SIZE, DATA_FMT_UUID,
+					     field_data);
+}
diff --git a/security/integrity/ima/ima_template_lib.h b/security/integrity/ima/ima_template_lib.h
index c71f1de95753..6ea2156271ae 100644
--- a/security/integrity/ima/ima_template_lib.h
+++ b/security/integrity/ima/ima_template_lib.h
@@ -29,6 +29,8 @@ void ima_show_template_buf(struct seq_file *m, enum ima_show_type show,
 			   struct ima_field_data *field_data);
 void ima_show_template_uint(struct seq_file *m, enum ima_show_type show,
 			    struct ima_field_data *field_data);
+void ima_show_template_uuid(struct seq_file *m, enum ima_show_type show,
+			    struct ima_field_data *field_data);
 int ima_parse_buf(void *bufstartp, void *bufendp, void **bufcurp,
 		  int maxfields, struct ima_field_data *fields, int *curfields,
 		  unsigned long *len_mask, int enforce_mask, char *bufname);
@@ -62,4 +64,6 @@ int ima_eventinodexattrlengths_init(struct ima_event_data *event_data,
 				    struct ima_field_data *field_data);
 int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
 				   struct ima_field_data *field_data);
+int ima_ns_init(struct ima_event_data *event_data,
+		struct ima_field_data *field_data);
 #endif /* __LINUX_IMA_TEMPLATE_LIB_H */
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-27 16:45 [RFC 0/3] Namespace IMA James Bottomley
  2021-11-27 16:45 ` [RFC 1/3] userns: add uuid field James Bottomley
  2021-11-27 16:45 ` [RFC 2/3] ima: Namespace IMA James Bottomley
@ 2021-11-27 16:45 ` James Bottomley
  2021-11-29  4:58   ` Serge E. Hallyn
  2 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-27 16:45 UTC (permalink / raw)
  To: linux-integrity
  Cc: containers, Mimi Zohar, Dmitry Kasatkin, Stefan Berger,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Serge E . Hallyn, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

Currently we get one entry in the IMA log per unique file event.  So,
if you have a measurement policy and it measures a particular binary
it will not get measured again if it is subsequently executed. For
Namespaced IMA, the correct behaviour seems to be to log once per
inode per namespace (so every unique execution in a namespace gets a
separate log entry).  Since logging once per inode per namespace is
different from current behaviour, it is only activated if the
namespace appears in the log template (so there's no behaviour change
for any of the original templates).

Expand the iint cache to have a list of namespaces, per iint entry,
the inode has been seen in by moving the action flags and the
measured_pcrs into a per-namespace structure.  The lifetime of these
additional list entries is tied to the lifetime of the iint entry and
the namespace, so if either is deleted, the new entry is.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 include/linux/ima.h                       |  13 ++-
 include/linux/user_namespace.h            |   5 +
 kernel/user.c                             |   1 +
 kernel/user_namespace.c                   |   6 ++
 security/integrity/iint.c                 |   4 +-
 security/integrity/ima/Makefile           |   2 +-
 security/integrity/ima/ima.h              |  21 +++-
 security/integrity/ima/ima_api.c          |   7 +-
 security/integrity/ima/ima_main.c         |  21 ++--
 security/integrity/ima/ima_ns.c           | 115 ++++++++++++++++++++++
 security/integrity/ima/ima_policy.c       |   2 +-
 security/integrity/ima/ima_template_lib.c |   2 +
 security/integrity/integrity.h            |  11 ++-
 13 files changed, 192 insertions(+), 18 deletions(-)
 create mode 100644 security/integrity/ima/ima_ns.c

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 09b14b73889e..dbbc0257d065 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -40,7 +40,8 @@ extern int ima_measure_critical_data(const char *event_label,
 				     const char *event_name,
 				     const void *buf, size_t buf_len,
 				     bool hash, u8 *digest, size_t digest_len);
-
+extern void ima_init_user_ns(struct user_namespace *ns);
+extern void ima_free_user_ns(struct user_namespace *ns);
 #ifdef CONFIG_IMA_APPRAISE_BOOTPARAM
 extern void ima_appraise_parse_cmdline(void);
 #else
@@ -154,6 +155,16 @@ static inline int ima_measure_critical_data(const char *event_label,
 	return -ENOENT;
 }
 
+static inline void ima_init_user_ns(struct user_namespace *ns)
+{
+	return;
+}
+
+static inline void ima_free_user_ns(struct user_namespace *ns)
+{
+	return;
+}
+
 #endif /* CONFIG_IMA */
 
 #ifndef CONFIG_IMA_KEXEC
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index d155788abdc1..52968764b195 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -8,6 +8,7 @@
 #include <linux/sched.h>
 #include <linux/workqueue.h>
 #include <linux/rwsem.h>
+#include <linux/rwlock.h>
 #include <linux/sysctl.h>
 #include <linux/err.h>
 #include <linux/uuid.h>
@@ -101,6 +102,10 @@ struct user_namespace {
 	struct ucounts		*ucounts;
 	long ucount_max[UCOUNT_COUNTS];
 	uuid_t			uuid;
+#ifdef CONFIG_IMA
+	struct list_head	ima_inode_list;
+	rwlock_t		ima_inode_list_lock;
+#endif
 } __randomize_layout;
 
 struct ucounts {
diff --git a/kernel/user.c b/kernel/user.c
index bf9ae1d0b670..670d3c41c817 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -68,6 +68,7 @@ struct user_namespace init_user_ns = {
 	.keyring_sem = __RWSEM_INITIALIZER(init_user_ns.keyring_sem),
 #endif
 	/* .uuid is initialized in user_namespaces_init() */
+	/* all IMA fields are initialized by ima_init_user_ns() */
 };
 EXPORT_SYMBOL_GPL(init_user_ns);
 
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 8ce57c16ddd3..8afa5dc0b992 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include <linux/export.h>
+#include <linux/ima.h>
 #include <linux/nsproxy.h>
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
@@ -144,6 +145,9 @@ int create_user_ns(struct cred *new)
 	uuid_gen(&ns->uuid);
 
 	set_cred_user_ns(new, ns);
+
+	ima_init_user_ns(ns);
+
 	return 0;
 fail_keyring:
 #ifdef CONFIG_PERSISTENT_KEYRINGS
@@ -200,6 +204,7 @@ static void free_user_ns(struct work_struct *work)
 		}
 		retire_userns_sysctls(ns);
 		key_free_user_ns(ns);
+		ima_free_user_ns(ns);
 		ns_free_inum(&ns->ns);
 		kmem_cache_free(user_ns_cachep, ns);
 		dec_user_namespaces(ucounts);
@@ -1389,6 +1394,7 @@ static __init int user_namespaces_init(void)
 {
 	user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC | SLAB_ACCOUNT);
 	uuid_gen(&init_user_ns.uuid);
+	ima_init_user_ns(&init_user_ns);
 	return 0;
 }
 subsys_initcall(user_namespaces_init);
diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index 8638976f7990..f714532feb7d 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -71,6 +71,7 @@ struct integrity_iint_cache *integrity_iint_find(struct inode *inode)
 static void iint_free(struct integrity_iint_cache *iint)
 {
 	kfree(iint->ima_hash);
+	ima_ns_iint_list_free(iint);
 	iint->ima_hash = NULL;
 	iint->version = 0;
 	iint->flags = 0UL;
@@ -81,7 +82,7 @@ static void iint_free(struct integrity_iint_cache *iint)
 	iint->ima_read_status = INTEGRITY_UNKNOWN;
 	iint->ima_creds_status = INTEGRITY_UNKNOWN;
 	iint->evm_status = INTEGRITY_UNKNOWN;
-	iint->measured_pcrs = 0;
+	INIT_LIST_HEAD(&iint->ns_list);
 	kmem_cache_free(iint_cache, iint);
 }
 
@@ -170,6 +171,7 @@ static void init_once(void *foo)
 	iint->ima_creds_status = INTEGRITY_UNKNOWN;
 	iint->evm_status = INTEGRITY_UNKNOWN;
 	mutex_init(&iint->mutex);
+	INIT_LIST_HEAD(&iint->ns_list);
 }
 
 static int __init integrity_iintcache_init(void)
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index 2499f2485c04..1741aa5d97bc 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -7,7 +7,7 @@
 obj-$(CONFIG_IMA) += ima.o
 
 ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
-	 ima_policy.o ima_template.o ima_template_lib.o
+	 ima_policy.o ima_template.o ima_template_lib.o ima_ns.o
 ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
 ima-$(CONFIG_IMA_APPRAISE_MODSIG) += ima_modsig.o
 ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index be965a8715e4..047256be7195 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -119,6 +119,16 @@ struct ima_kexec_hdr {
 	u64 count;
 };
 
+/* IMA cache of per-user-namespace flags */
+struct ima_ns_cache {
+	struct user_namespace *ns;
+	struct integrity_iint_cache *iint;
+	struct list_head ns_list;
+	struct list_head iint_list;
+	unsigned long measured_pcrs;
+	unsigned long flags;
+};
+
 extern const int read_idmap[];
 
 #ifdef CONFIG_HAVE_IMA_KEXEC
@@ -263,7 +273,8 @@ int ima_must_measure(struct inode *inode, int mask, enum ima_hooks func);
 int ima_collect_measurement(struct integrity_iint_cache *iint,
 			    struct file *file, void *buf, loff_t size,
 			    enum hash_algo algo, struct modsig *modsig);
-void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
+void ima_store_measurement(struct integrity_iint_cache *iint,
+			   struct ima_ns_cache *nsc, struct file *file,
 			   const unsigned char *filename,
 			   struct evm_ima_xattr_data *xattr_value,
 			   int xattr_len, const struct modsig *modsig, int pcr,
@@ -301,6 +312,14 @@ void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos);
 void ima_policy_stop(struct seq_file *m, void *v);
 int ima_policy_show(struct seq_file *m, void *v);
 
+/* IMA Namespace related functions */
+extern bool ima_ns_in_template;
+struct ima_ns_cache *ima_ns_cache_get(struct integrity_iint_cache *iint,
+				      struct user_namespace *ns);
+void ima_ns_cache_clear(struct integrity_iint_cache *iint);
+void ima_init_ns(void);
+
+
 /* Appraise integrity measurements */
 #define IMA_APPRAISE_ENFORCE	0x01
 #define IMA_APPRAISE_FIX	0x02
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index a64fb0130b01..d613ea1ee378 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -298,6 +298,7 @@ int ima_collect_measurement(struct integrity_iint_cache *iint,
  * Must be called with iint->mutex held.
  */
 void ima_store_measurement(struct integrity_iint_cache *iint,
+			   struct ima_ns_cache *nsc,
 			   struct file *file, const unsigned char *filename,
 			   struct evm_ima_xattr_data *xattr_value,
 			   int xattr_len, const struct modsig *modsig, int pcr,
@@ -322,7 +323,7 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
 	 * appraisal, but a file measurement from earlier might already exist in
 	 * the measurement list.
 	 */
-	if (iint->measured_pcrs & (0x1 << pcr) && !modsig)
+	if (nsc->measured_pcrs & (0x1 << pcr) && !modsig)
 		return;
 
 	result = ima_alloc_init_template(&event_data, &entry, template_desc);
@@ -334,8 +335,8 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
 
 	result = ima_store_template(entry, violation, inode, filename, pcr);
 	if ((!result || result == -EEXIST) && !(file->f_flags & O_DIRECT)) {
-		iint->flags |= IMA_MEASURED;
-		iint->measured_pcrs |= (0x1 << pcr);
+		nsc->flags |= IMA_MEASURED;
+		nsc->measured_pcrs |= (0x1 << pcr);
 	}
 	if (result < 0)
 		ima_free_template_entry(entry);
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 465865412100..049710203fac 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -168,8 +168,8 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
 		if (!IS_I_VERSION(inode) ||
 		    !inode_eq_iversion(inode, iint->version) ||
 		    (iint->flags & IMA_NEW_FILE)) {
-			iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
-			iint->measured_pcrs = 0;
+			iint->flags &= ~IMA_NEW_FILE;
+			ima_ns_cache_clear(iint);
 			if (update)
 				ima_update_xattr(iint, file);
 		}
@@ -204,6 +204,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 {
 	struct inode *inode = file_inode(file);
 	struct integrity_iint_cache *iint = NULL;
+	struct ima_ns_cache *nsc = NULL;
 	struct ima_template_desc *template_desc = NULL;
 	char *pathbuf = NULL;
 	char filename[NAME_MAX];
@@ -274,20 +275,20 @@ static int process_measurement(struct file *file, const struct cred *cred,
 	    ((inode->i_sb->s_iflags & SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
 	     !(inode->i_sb->s_iflags & SB_I_UNTRUSTED_MOUNTER) &&
 	     !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
-		iint->flags &= ~IMA_DONE_MASK;
-		iint->measured_pcrs = 0;
+		ima_ns_cache_clear(iint);
 	}
+	nsc = ima_ns_cache_get(iint, current_user_ns());
 
 	/* Determine if already appraised/measured based on bitmask
 	 * (IMA_MEASURE, IMA_MEASURED, IMA_XXXX_APPRAISE, IMA_XXXX_APPRAISED,
 	 *  IMA_AUDIT, IMA_AUDITED)
 	 */
-	iint->flags |= action;
+	nsc->flags |= action;
 	action &= IMA_DO_MASK;
-	action &= ~((iint->flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
+	action &= ~((nsc->flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
 
 	/* If target pcr is already measured, unset IMA_MEASURE action */
-	if ((action & IMA_MEASURE) && (iint->measured_pcrs & (0x1 << pcr)))
+	if ((action & IMA_MEASURE) && (nsc->measured_pcrs & (0x1 << pcr)))
 		action ^= IMA_MEASURE;
 
 	/* HASH sets the digital signature and update flags, nothing else */
@@ -297,7 +298,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 		if ((xattr_value && xattr_len > 2) &&
 		    (xattr_value->type == EVM_IMA_XATTR_DIGSIG))
 			set_bit(IMA_DIGSIG, &iint->atomic_flags);
-		iint->flags |= IMA_HASHED;
+		nsc->flags |= IMA_HASHED;
 		action ^= IMA_HASH;
 		set_bit(IMA_UPDATE_XATTR, &iint->atomic_flags);
 	}
@@ -342,7 +343,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
 		pathname = ima_d_path(&file->f_path, &pathbuf, filename);
 
 	if (action & IMA_MEASURE)
-		ima_store_measurement(iint, file, pathname,
+		ima_store_measurement(iint, nsc, file, pathname,
 				      xattr_value, xattr_len, modsig, pcr,
 				      template_desc);
 	if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
@@ -1047,6 +1048,8 @@ static int __init init_ima(void)
 	if (error)
 		return error;
 
+	ima_init_ns();
+
 	error = register_blocking_lsm_notifier(&ima_lsm_policy_notifier);
 	if (error)
 		pr_warn("Couldn't register LSM notifier, error %d\n", error);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
new file mode 100644
index 000000000000..7134ba7f3544
--- /dev/null
+++ b/security/integrity/ima/ima_ns.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * IMA Namespace Support routines
+ */
+
+#include <linux/ima.h>
+#include <linux/user_namespace.h>
+
+#include "ima.h"
+
+static struct kmem_cache *ns_cache __read_mostly;
+bool ima_ns_in_template __read_mostly;
+
+void __init ima_init_ns(void)
+{
+	ns_cache = KMEM_CACHE(ima_ns_cache, SLAB_PANIC);
+}
+
+void ima_init_user_ns(struct user_namespace *ns)
+{
+	INIT_LIST_HEAD(&ns->ima_inode_list);
+}
+
+void ima_free_user_ns(struct user_namespace *ns)
+{
+	struct ima_ns_cache *entry;
+
+	/* no refs to ns left, so no need to lock */
+	while (!list_empty(&ns->ima_inode_list)) {
+		entry = list_entry(ns->ima_inode_list.next, struct ima_ns_cache,
+				   ns_list);
+
+		/* iint cache entry is still active to lock to delete */
+		write_lock(&entry->iint->ns_list_lock);
+		list_del(&entry->iint_list);
+		write_unlock(&entry->iint->ns_list_lock);
+
+		list_del(&entry->ns_list);
+		kmem_cache_free(ns_cache, entry);
+	}
+}
+
+struct ima_ns_cache *ima_ns_cache_get(struct integrity_iint_cache *iint,
+				      struct user_namespace *ns)
+{
+	struct ima_ns_cache *entry = NULL;
+
+	if (!ima_ns_in_template)
+		/*
+		 * if we're not logging the namespace, don't separate the
+		 * iint cache per namespace.  This preserves original
+		 * behaviour for the non-ns case.
+		 */
+		ns = &init_user_ns;
+
+	read_lock(&iint->ns_list_lock);
+	list_for_each_entry(entry, &iint->ns_list, ns_list)
+		if (entry->ns == ns)
+			break;
+	read_unlock(&iint->ns_list_lock);
+
+	if (entry && entry->ns == ns)
+		return entry;
+
+	entry = kmem_cache_zalloc(ns_cache, GFP_NOFS);
+	if (!entry)
+		return NULL;
+
+	/* no refs taken: entry is freed on either ns delete or iint delete */
+	entry->ns = ns;
+	entry->iint = iint;
+
+	write_lock(&iint->ns_list_lock);
+	list_add_tail(&entry->iint_list, &iint->ns_list);
+	write_unlock(&iint->ns_list_lock);
+
+	write_lock(&ns->ima_inode_list_lock);
+	list_add_tail(&entry->ns_list, &ns->ima_inode_list);
+	write_unlock(&ns->ima_inode_list_lock);
+
+	return entry;
+}
+
+/* clear the flags and measured PCR for every entry in the iint  */
+void ima_ns_cache_clear(struct integrity_iint_cache *iint)
+{
+	struct ima_ns_cache *entry;
+
+	read_lock(&iint->ns_list_lock);
+	list_for_each_entry(entry, &iint->ns_list, ns_list) {
+		entry->flags = 0;
+		entry->measured_pcrs = 0;
+	}
+	read_unlock(&iint->ns_list_lock);
+}
+
+void ima_ns_iint_list_free(struct integrity_iint_cache *iint)
+{
+	struct ima_ns_cache *entry;
+
+	/* iint locking unnecessary; no-one should have acces to the list */
+	while (!list_empty(&iint->ns_list)) {
+		entry = list_entry(iint->ns_list.next, struct ima_ns_cache,
+				   iint_list);
+
+		/* namespace is still active to lock to delete */
+		write_lock(&entry->ns->ima_inode_list_lock);
+		list_del(&entry->ns_list);
+		write_unlock(&entry->ns->ima_inode_list_lock);
+
+		list_del(&entry->iint_list);
+		kmem_cache_free(ns_cache, entry);
+	}
+}
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 320ca80aacab..9434a1064da6 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -50,7 +50,7 @@
 #define DONT_HASH	0x0200
 
 #define INVALID_PCR(a) (((a) < 0) || \
-	(a) >= (sizeof_field(struct integrity_iint_cache, measured_pcrs) * 8))
+	(a) >= (sizeof_field(struct ima_ns_cache, measured_pcrs) * 8))
 
 int ima_policy_flag;
 static int temp_ima_appraise;
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index ebd54c1b5206..d1a381add127 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -703,6 +703,8 @@ int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
 int ima_ns_init(struct ima_event_data *event_data,
 		struct ima_field_data *field_data)
 {
+	if (unlikely(!ima_ns_in_template))
+		ima_ns_in_template = true;
 	return ima_write_template_field_data(&current_user_ns()->uuid,
 					     UUID_SIZE, DATA_FMT_UUID,
 					     field_data);
diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
index 547425c20e11..8755635da3ff 100644
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -129,7 +129,6 @@ struct integrity_iint_cache {
 	struct inode *inode;	/* back pointer to inode in question */
 	u64 version;		/* track inode changes */
 	unsigned long flags;
-	unsigned long measured_pcrs;
 	unsigned long atomic_flags;
 	enum integrity_status ima_file_status:4;
 	enum integrity_status ima_mmap_status:4;
@@ -138,6 +137,8 @@ struct integrity_iint_cache {
 	enum integrity_status ima_creds_status:4;
 	enum integrity_status evm_status:4;
 	struct ima_digest_data *ima_hash;
+	struct list_head ns_list; /* list of namespaces inode seen in */
+	rwlock_t ns_list_lock;
 };
 
 /* rbtree tree calls to lookup, insert, delete
@@ -225,6 +226,14 @@ static inline void ima_load_x509(void)
 }
 #endif
 
+#ifdef CONFIG_IMA
+void ima_ns_iint_list_free(struct integrity_iint_cache *iint);
+#else
+void ima_ns_iint_list_free(struct integrity_iint_cache *iint)
+{
+}
+#endif
+
 #ifdef CONFIG_EVM_LOAD_X509
 void __init evm_load_x509(void);
 #else
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-27 16:45 ` [RFC 1/3] userns: add uuid field James Bottomley
@ 2021-11-28  4:45   ` Serge E. Hallyn
  2021-11-28 13:29     ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-28  4:45 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Serge E . Hallyn, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley wrote:
> As a precursor to namespacing IMA a way of uniquely identifying the
> namespace to appear in the IMA log is needed.  This log may be
> transported away from the running system and may be analyzed even
> after the system has been rebooted.  Thus we need a way of identifying
> namespaces in the log which is unique.  UUID, being designed
> probabilistically never to repeat, fits this bill so add it to the
> user_namespace which we'll also use for namespacing IMA.

If the logs run across 5 boots, is it important to you that the
uuid be unique across all 5 boots?  Would it suffice to have a
per-boot unique count and report that plus some indicator of the
current boot (like boot time in jiffies)?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28  4:45   ` Serge E. Hallyn
@ 2021-11-28 13:29     ` James Bottomley
  2021-11-28 15:18       ` Serge E. Hallyn
  0 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-28 13:29 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley wrote:
> > As a precursor to namespacing IMA a way of uniquely identifying the
> > namespace to appear in the IMA log is needed.  This log may be
> > transported away from the running system and may be analyzed even
> > after the system has been rebooted.  Thus we need a way of
> > identifying namespaces in the log which is unique.  UUID, being
> > designed probabilistically never to repeat, fits this bill so add
> > it to the user_namespace which we'll also use for namespacing IMA.
> 
> If the logs run across 5 boots, is it important to you that the
> uuid be unique across all 5 boots?  Would it suffice to have a
> per-boot unique count and report that plus some indicator of the
> current boot (like boot time in jiffies)?

For the purposes of IMA it's only really important to have the uuid be
unique within the particular log ... i.e. unique per boot.  However,
given the prevalence of uuids elsewhere and the fact we have no current
per-boot unique label for the namespace (the inode number could
repeat), it seemed reasonable to employ uuids for this rather than
invent a different identifier.  Plus IMA isn't going to complain if we
have a globally unique identifier ...

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 13:29     ` James Bottomley
@ 2021-11-28 15:18       ` Serge E. Hallyn
  2021-11-28 18:00         ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-28 15:18 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Stefan Berger, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley wrote:
> > > As a precursor to namespacing IMA a way of uniquely identifying the
> > > namespace to appear in the IMA log is needed.  This log may be
> > > transported away from the running system and may be analyzed even
> > > after the system has been rebooted.  Thus we need a way of
> > > identifying namespaces in the log which is unique.  UUID, being
> > > designed probabilistically never to repeat, fits this bill so add
> > > it to the user_namespace which we'll also use for namespacing IMA.
> > 
> > If the logs run across 5 boots, is it important to you that the
> > uuid be unique across all 5 boots?  Would it suffice to have a
> > per-boot unique count and report that plus some indicator of the
> > current boot (like boot time in jiffies)?
> 
> For the purposes of IMA it's only really important to have the uuid be
> unique within the particular log ... i.e. unique per boot.  However,
> given the prevalence of uuids elsewhere and the fact we have no current
> per-boot unique label for the namespace (the inode number could
> repeat), it seemed reasonable to employ uuids for this rather than
> invent a different identifier.  Plus IMA isn't going to complain if we
> have a globally unique identifier ...

Ok - Note I'm not saying I heavily object, but I'm mildly concerned
about users who happen to spin off a lot of user namespaces for
quick jobs being penalized.  I suspect Eric will also worry about the
namespacing implications - i.e. people *will* want to start restoring
user namespaces with a previously used uuid.

So given that 'unique per boot' is sufficient, what would be the problem
with simply adding a simple ever-increasing unique atomix count to the
struct user_namespace?

-serge

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 15:18       ` Serge E. Hallyn
@ 2021-11-28 18:00         ` James Bottomley
  2021-11-28 20:47           ` Serge E. Hallyn
  0 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-28 18:00 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> > On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley wrote:
> > > > As a precursor to namespacing IMA a way of uniquely identifying
> > > > the namespace to appear in the IMA log is needed.  This log may
> > > > be transported away from the running system and may be analyzed
> > > > even after the system has been rebooted.  Thus we need a way of
> > > > identifying namespaces in the log which is unique.  UUID, being
> > > > designed probabilistically never to repeat, fits this bill so
> > > > add it to the user_namespace which we'll also use for
> > > > namespacing IMA.
> > > 
> > > If the logs run across 5 boots, is it important to you that the
> > > uuid be unique across all 5 boots?  Would it suffice to have a
> > > per-boot unique count and report that plus some indicator of the
> > > current boot (like boot time in jiffies)?
> > 
> > For the purposes of IMA it's only really important to have the uuid
> > be unique within the particular log ... i.e. unique per
> > boot.  However, given the prevalence of uuids elsewhere and the
> > fact we have no current per-boot unique label for the namespace
> > (the inode number could repeat), it seemed reasonable to employ
> > uuids for this rather than invent a different identifier.  Plus IMA
> > isn't going to complain if we have a globally unique identifier ...
> 
> Ok - Note I'm not saying I heavily object, but I'm mildly concerned
> about users who happen to spin off a lot of user namespaces for
> quick jobs being penalized.

Well, that's why I use the uuid_gen coupled to prandom ... there
shouldn't be a measurable overhead generating it.

>   I suspect Eric will also worry about the namespacing implications -
> i.e. people *will* want to start restoring user namespaces with a
> previously used uuid.

So this is a problem I tried to address in the last paragraph.  If I
put any marker on a namespace, people are potentially going to want to
save and restore it. The bottom line is that ima logs are add only. 
You can't save and restore them so we're already dealing with something
that can't be CRIU transported.  I had hoped that it would be obvious
that a randomly generated uuid, whose uniqueness depends on random
generation likewise can't be saved and restored because we'd have no
way to prevent a clash.

> So given that 'unique per boot' is sufficient, what would be the
> problem with simply adding a simple ever-increasing unique atomix
> count to the struct user_namespace?

I don't think there is any ... but I equally don't see why people would
want to save and restore the uuid but not the new monotonic identifier
... because it's still just a marker on a namespace.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 18:00         ` James Bottomley
@ 2021-11-28 20:47           ` Serge E. Hallyn
  2021-11-28 21:21             ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-28 20:47 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Stefan Berger, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> > On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> > > On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > > > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley wrote:
> > > > > As a precursor to namespacing IMA a way of uniquely identifying
> > > > > the namespace to appear in the IMA log is needed.  This log may
> > > > > be transported away from the running system and may be analyzed
> > > > > even after the system has been rebooted.  Thus we need a way of
> > > > > identifying namespaces in the log which is unique.  UUID, being
> > > > > designed probabilistically never to repeat, fits this bill so
> > > > > add it to the user_namespace which we'll also use for
> > > > > namespacing IMA.
> > > > 
> > > > If the logs run across 5 boots, is it important to you that the
> > > > uuid be unique across all 5 boots?  Would it suffice to have a
> > > > per-boot unique count and report that plus some indicator of the
> > > > current boot (like boot time in jiffies)?
> > > 
> > > For the purposes of IMA it's only really important to have the uuid
> > > be unique within the particular log ... i.e. unique per
> > > boot.  However, given the prevalence of uuids elsewhere and the
> > > fact we have no current per-boot unique label for the namespace
> > > (the inode number could repeat), it seemed reasonable to employ
> > > uuids for this rather than invent a different identifier.  Plus IMA
> > > isn't going to complain if we have a globally unique identifier ...
> > 
> > Ok - Note I'm not saying I heavily object, but I'm mildly concerned
> > about users who happen to spin off a lot of user namespaces for
> > quick jobs being penalized.
> 
> Well, that's why I use the uuid_gen coupled to prandom ... there
> shouldn't be a measurable overhead generating it.

Does prandom have *no*, or just little effect on the entopy pool?
Tried briefly looking at prandom_u32, not quite getting how it's
using net_rand_state - it reads it and uses it but doesn't make
any changes to it?

> >   I suspect Eric will also worry about the namespacing implications -
> > i.e. people *will* want to start restoring user namespaces with a
> > previously used uuid.
> 
> So this is a problem I tried to address in the last paragraph.  If I
> put any marker on a namespace, people are potentially going to want to
> save and restore it. The bottom line is that ima logs are add only. 
> You can't save and restore them so we're already dealing with something
> that can't be CRIU transported.  I had hoped that it would be obvious
> that a randomly generated uuid, whose uniqueness depends on random
> generation likewise can't be saved and restored because we'd have no
> way to prevent a clash.

Yes but you're making this a general user_namespace struct member.
So once that's there people will want to export it, use it for
things other than ima.

> > So given that 'unique per boot' is sufficient, what would be the
> > problem with simply adding a simple ever-increasing unique atomix
> > count to the struct user_namespace?
> 
> I don't think there is any ... but I equally don't see why people would
> want to save and restore the uuid but not the new monotonic identifier
> ... because it's still just a marker on a namespace.

But you've called it "the namespace uuid".  I'm not even really thinking
of checkpoint/restart, just stopping and restarting a container.  I'm
convinced people will want to start using it because, well, it is a nice
feature.

-serge

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 20:47           ` Serge E. Hallyn
@ 2021-11-28 21:21             ` James Bottomley
  2021-11-28 21:49               ` Serge E. Hallyn
  0 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-28 21:21 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> > > On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> > > > On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > > > > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley
> > > > > wrote:
> > > > > > As a precursor to namespacing IMA a way of uniquely
> > > > > > identifying the namespace to appear in the IMA log is
> > > > > > needed.  This log may be transported away from the running
> > > > > > system and may be analyzed even after the system has been
> > > > > > rebooted.  Thus we need a way of identifying namespaces in
> > > > > > the log which is unique.  UUID, being designed
> > > > > > probabilistically never to repeat, fits this bill
> > > > > > so add it to the user_namespace which we'll also use for
> > > > > > namespacing IMA.
> > > > > 
> > > > > If the logs run across 5 boots, is it important to you that
> > > > > the uuid be unique across all 5 boots?  Would it suffice to
> > > > > have a per-boot unique count and report that plus some
> > > > > indicator of the current boot (like boot time in jiffies)?
> > > > 
> > > > For the purposes of IMA it's only really important to have the
> > > > uuid be unique within the particular log ... i.e. unique per
> > > > boot.  However, given the prevalence of uuids elsewhere and the
> > > > fact we have no current per-boot unique label for the namespace
> > > > (the inode number could repeat), it seemed reasonable to employ
> > > > uuids for this rather than invent a different identifier.  Plus
> > > > IMA isn't going to complain if we have a globally unique
> > > > identifier
> > > > ...
> > > 
> > > Ok - Note I'm not saying I heavily object, but I'm mildly
> > > concerned about users who happen to spin off a lot of user
> > > namespaces for quick jobs being penalized.
> > 
> > Well, that's why I use the uuid_gen coupled to prandom ... there
> > shouldn't be a measurable overhead generating it.
> 
> Does prandom have *no*, or just little effect on the entopy pool?
> Tried briefly looking at prandom_u32, not quite getting how it's
> using net_rand_state - it reads it and uses it but doesn't make
> any changes to it?

It has a first use effect to get the seed but once that happens it has
no further effect on the entropy pool.

> > >   I suspect Eric will also worry about the namespacing
> > > implications - i.e. people *will* want to start restoring user
> > > namespaces with a previously used uuid.
> > 
> > So this is a problem I tried to address in the last paragraph.  If
> > I put any marker on a namespace, people are potentially going to
> > want to save and restore it. The bottom line is that ima logs are
> > add only.  You can't save and restore them so we're already dealing
> > with something that can't be CRIU transported.  I had hoped that it
> > would be obvious that a randomly generated uuid, whose uniqueness
> > depends on random generation likewise can't be saved and restored
> > because we'd have no way to prevent a clash.
> 
> Yes but you're making this a general user_namespace struct member.
> So once that's there people will want to export it, use it for
> things other than ima.

Yes, that's why I did it.  However, the property of uniqueness for all
uuid type things depends on randomness, so ipso facto, they can never
be settable.

> > > So given that 'unique per boot' is sufficient, what would be the
> > > problem with simply adding a simple ever-increasing unique atomix
> > > count to the struct user_namespace?
> > 
> > I don't think there is any ... but I equally don't see why people
> > would want to save and restore the uuid but not the new monotonic
> > identifier ... because it's still just a marker on a namespace.
> 
> But you've called it "the namespace uuid".  I'm not even really
> thinking of checkpoint/restart, just stopping and restarting a
> container.  I'm convinced people will want to start using it because,
> well, it is a nice feature.

Right, but the uniqueness property depends on you not being able to set
it.  If you just want a namespace label, you can have that, but
anything a user can set is either a pain to guarantee uniqueness (have
to check all the other objects) or is simply a non-unique label.

If you want to label a container, which could have many namespaces and
be stopped and restarted many times, it does sound like you want a non-
unique settable label.  However, IMA definitely needs a guaranteed per
namespace unique label.

Is the objection simply you think a UUID sound like it should be
settable and a monotonic counter sounds like it shouldn't?  Because to
me (coming I suppose from dealing with uuids in edk2) neither sounds
like it should be settable.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 21:21             ` James Bottomley
@ 2021-11-28 21:49               ` Serge E. Hallyn
  2021-11-28 22:56                 ` James Bottomley
  2021-11-29 13:12                 ` Christian Brauner
  0 siblings, 2 replies; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-28 21:49 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Stefan Berger, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> > > > On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> > > > > On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > > > > > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley
> > > > > > wrote:
> > > > > > > As a precursor to namespacing IMA a way of uniquely
> > > > > > > identifying the namespace to appear in the IMA log is
> > > > > > > needed.  This log may be transported away from the running
> > > > > > > system and may be analyzed even after the system has been
> > > > > > > rebooted.  Thus we need a way of identifying namespaces in
> > > > > > > the log which is unique.  UUID, being designed
> > > > > > > probabilistically never to repeat, fits this bill
> > > > > > > so add it to the user_namespace which we'll also use for
> > > > > > > namespacing IMA.
> > > > > > 
> > > > > > If the logs run across 5 boots, is it important to you that
> > > > > > the uuid be unique across all 5 boots?  Would it suffice to
> > > > > > have a per-boot unique count and report that plus some
> > > > > > indicator of the current boot (like boot time in jiffies)?
> > > > > 
> > > > > For the purposes of IMA it's only really important to have the
> > > > > uuid be unique within the particular log ... i.e. unique per
> > > > > boot.  However, given the prevalence of uuids elsewhere and the
> > > > > fact we have no current per-boot unique label for the namespace
> > > > > (the inode number could repeat), it seemed reasonable to employ
> > > > > uuids for this rather than invent a different identifier.  Plus
> > > > > IMA isn't going to complain if we have a globally unique
> > > > > identifier
> > > > > ...
> > > > 
> > > > Ok - Note I'm not saying I heavily object, but I'm mildly
> > > > concerned about users who happen to spin off a lot of user
> > > > namespaces for quick jobs being penalized.
> > > 
> > > Well, that's why I use the uuid_gen coupled to prandom ... there
> > > shouldn't be a measurable overhead generating it.
> > 
> > Does prandom have *no*, or just little effect on the entopy pool?
> > Tried briefly looking at prandom_u32, not quite getting how it's
> > using net_rand_state - it reads it and uses it but doesn't make
> > any changes to it?
> 
> It has a first use effect to get the seed but once that happens it has
> no further effect on the entropy pool.

Gotcha - thanks.

> > > >   I suspect Eric will also worry about the namespacing
> > > > implications - i.e. people *will* want to start restoring user
> > > > namespaces with a previously used uuid.
> > > 
> > > So this is a problem I tried to address in the last paragraph.  If
> > > I put any marker on a namespace, people are potentially going to
> > > want to save and restore it. The bottom line is that ima logs are
> > > add only.  You can't save and restore them so we're already dealing
> > > with something that can't be CRIU transported.  I had hoped that it
> > > would be obvious that a randomly generated uuid, whose uniqueness
> > > depends on random generation likewise can't be saved and restored
> > > because we'd have no way to prevent a clash.
> > 
> > Yes but you're making this a general user_namespace struct member.
> > So once that's there people will want to export it, use it for
> > things other than ima.
> 
> Yes, that's why I did it.  However, the property of uniqueness for all
> uuid type things depends on randomness, so ipso facto, they can never
> be settable.
> 
> > > > So given that 'unique per boot' is sufficient, what would be the
> > > > problem with simply adding a simple ever-increasing unique atomix
> > > > count to the struct user_namespace?
> > > 
> > > I don't think there is any ... but I equally don't see why people
> > > would want to save and restore the uuid but not the new monotonic
> > > identifier ... because it's still just a marker on a namespace.
> > 
> > But you've called it "the namespace uuid".  I'm not even really
> > thinking of checkpoint/restart, just stopping and restarting a
> > container.  I'm convinced people will want to start using it because,
> > well, it is a nice feature.
> 
> Right, but the uniqueness property depends on you not being able to set
> it.  If you just want a namespace label, you can have that, but
> anything a user can set is either a pain to guarantee uniqueness (have
> to check all the other objects) or is simply a non-unique label.
> 
> If you want to label a container, which could have many namespaces and
> be stopped and restarted many times, it does sound like you want a non-
> unique settable label.  However, IMA definitely needs a guaranteed per
> namespace unique label.
> 
> Is the objection simply you think a UUID sound like it should be

Objection is too strong.  Concern.

But yes, to me a uuid (a) feels like it should be generally useful
including being settable and (b) not super duper 100% absolutely
guaranteed to always be unique per boot, as an incremented counter
would be.

> settable and a monotonic counter sounds like it shouldn't?  Because to
> me (coming I suppose from dealing with uuids in edk2) neither sounds
> like it should be settable.

Huh - yes, in contrast, for virtualization based tests of secureboot and
fs-y things we do indeed always set the uuids.

-serge

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 21:49               ` Serge E. Hallyn
@ 2021-11-28 22:56                 ` James Bottomley
  2021-11-29  1:59                   ` Serge E. Hallyn
  2021-11-29 13:12                 ` Christian Brauner
  1 sibling, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-28 22:56 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
> On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
[...]
> > > > > So given that 'unique per boot' is sufficient, what would be
> > > > > the problem with simply adding a simple ever-increasing
> > > > > unique atomix count to the struct user_namespace?
> > > > 
> > > > I don't think there is any ... but I equally don't see why
> > > > people would want to save and restore the uuid but not the new
> > > > monotonic identifier ... because it's still just a marker on a
> > > > namespace.
> > > 
> > > But you've called it "the namespace uuid".  I'm not even really
> > > thinking of checkpoint/restart, just stopping and restarting a
> > > container.  I'm convinced people will want to start using it
> > > because, well, it is a nice feature.
> > 
> > Right, but the uniqueness property depends on you not being able to
> > set it.  If you just want a namespace label, you can have that, but
> > anything a user can set is either a pain to guarantee uniqueness
> > (have to check all the other objects) or is simply a non-unique
> > label.
> > 
> > If you want to label a container, which could have many namespaces
> > and be stopped and restarted many times, it does sound like you
> > want a non-unique settable label.  However, IMA definitely needs a
> > guaranteed per namespace unique label.
> > 
> > Is the objection simply you think a UUID sound like it should be
> 
> Objection is too strong.  Concern.
> 
> But yes, to me a uuid (a) feels like it should be generally useful
> including being settable and (b) not super duper 100% absolutely
> guaranteed to always be unique per boot, as an incremented counter
> would be.

OK, but a bunch of cats I found on the Internet agree with me, a UUID
shouldn't be settable:

https://en.wikipedia.org/wiki/Universally_unique_identifier

The key point being, if you can set the id, it can't be unique ... it
doesn't have to be random (some of the versions are time or other
unique object based properties) but it does have to be derived by
something that gives reasonably reliable uniqueness (which is why
humans aren't allowed to set them ... we're bad a choosing unique
labels).

> > settable and a monotonic counter sounds like it shouldn't?  Because
> > to me (coming I suppose from dealing with uuids in edk2) neither
> > sounds like it should be settable.
> 
> Huh - yes, in contrast, for virtualization based tests of secureboot
> and fs-y things we do indeed always set the uuids.

Well, the old msdos label had a settable ID that often caused problems
because it wasn't unique.  The new gpt partition label specifically
doesn't allow you to set the GUID label because it should be unique:

https://en.wikipedia.org/wiki/GUID_Partition_Table

I think some of the tools have emergency modes where you can set the
label "just in case" but they shouldn't

The point is that if you're using a settable uuid for containers,
you're doing the wrong thing ... it should either be a non-unique label
(in which case why not make it more human readable) or only the system
should be allowed to set it using the prescribed algorithm.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 22:56                 ` James Bottomley
@ 2021-11-29  1:59                   ` Serge E. Hallyn
  2021-11-29 13:49                     ` Stefan Berger
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-29  1:59 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Stefan Berger, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sun, Nov 28, 2021 at 05:56:28PM -0500, James Bottomley wrote:
> On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
> > On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> > > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> [...]
> > > > > > So given that 'unique per boot' is sufficient, what would be
> > > > > > the problem with simply adding a simple ever-increasing
> > > > > > unique atomix count to the struct user_namespace?
> > > > > 
> > > > > I don't think there is any ... but I equally don't see why
> > > > > people would want to save and restore the uuid but not the new
> > > > > monotonic identifier ... because it's still just a marker on a
> > > > > namespace.
> > > > 
> > > > But you've called it "the namespace uuid".  I'm not even really
> > > > thinking of checkpoint/restart, just stopping and restarting a
> > > > container.  I'm convinced people will want to start using it
> > > > because, well, it is a nice feature.
> > > 
> > > Right, but the uniqueness property depends on you not being able to
> > > set it.  If you just want a namespace label, you can have that, but
> > > anything a user can set is either a pain to guarantee uniqueness
> > > (have to check all the other objects) or is simply a non-unique
> > > label.
> > > 
> > > If you want to label a container, which could have many namespaces
> > > and be stopped and restarted many times, it does sound like you
> > > want a non-unique settable label.  However, IMA definitely needs a
> > > guaranteed per namespace unique label.
> > > 
> > > Is the objection simply you think a UUID sound like it should be
> > 
> > Objection is too strong.  Concern.
> > 
> > But yes, to me a uuid (a) feels like it should be generally useful
> > including being settable and (b) not super duper 100% absolutely
> > guaranteed to always be unique per boot, as an incremented counter
> > would be.
> 
> OK, but a bunch of cats I found on the Internet agree with me, a UUID
> shouldn't be settable:
> 
> https://en.wikipedia.org/wiki/Universally_unique_identifier
> 
> The key point being, if you can set the id, it can't be unique ... it

Ok, so can you just put a comment above there saying "this must not
be settable from userspace" ?

> doesn't have to be random (some of the versions are time or other
> unique object based properties) but it does have to be derived by
> something that gives reasonably reliable uniqueness (which is why
> humans aren't allowed to set them ... we're bad a choosing unique
> labels).
> 
> > > settable and a monotonic counter sounds like it shouldn't?  Because
> > > to me (coming I suppose from dealing with uuids in edk2) neither
> > > sounds like it should be settable.
> > 
> > Huh - yes, in contrast, for virtualization based tests of secureboot
> > and fs-y things we do indeed always set the uuids.
> 
> Well, the old msdos label had a settable ID that often caused problems
> because it wasn't unique.  The new gpt partition label specifically
> doesn't allow you to set the GUID label because it should be unique:
> 
> https://en.wikipedia.org/wiki/GUID_Partition_Table
> 
> I think some of the tools have emergency modes where you can set the
> label "just in case" but they shouldn't
> 
> The point is that if you're using a settable uuid for containers,
> you're doing the wrong thing ... it should either be a non-unique label
> (in which case why not make it more human readable) or only the system
> should be allowed to set it using the prescribed algorithm.

It seems to me you're confuddling things to make a point.  You want
noone to use things that look like uuids for things where they
choose the uuid - yet there are many places today where that's being
done.  Even the partition types: https://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs
C12A7328-F81F-11D2-BA4B-00A0C93EC93B was chosen for the EFI partition, 
and you must "set" the partition type to that.  So it's set-able.

Anyway I'm still not seeing what using uuids buys you over using
a counter, but I'll stop debating it and look at the rest of the
set.

thanks,
-serge

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 2/3] ima: Namespace IMA
  2021-11-27 16:45 ` [RFC 2/3] ima: Namespace IMA James Bottomley
@ 2021-11-29  2:52   ` Serge E. Hallyn
  0 siblings, 0 replies; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-29  2:52 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Serge E . Hallyn, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sat, Nov 27, 2021 at 04:45:48PM +0000, James Bottomley wrote:
> Use the user namespace as the basis for namespacing IMA.  The
> implementation is very simple: add a new template called 'ima-ns'
> which exports the uuid of the user namespace into the IMA log.  This
> can be used to uniquely separate every container in the IMA log.
> 
> Note that the admin of the user namespace still cannot read back the
> IMA log because the IMA securityfs entries are not yet namespace
> aware.  However, root in the init_user_ns can read the log and see the
> containers.
> 
> You can get the uuid of init_user_ns from the boot_aggregate entry
> which is always the first one recorded in the log.  Any execution with
> a different uuid is in a new IMA namespace.
> 
> A sample of the log showing entry into a container is:
> 
> 10 7766c926c9db8dd4c923f96be5635b04593029c1 ima-ns sha256:0000000000000000000000000000000000000000000000000000000000000000 boot_aggregate 6582e360-1354-42b9-a6ef-ee1993d982da
> [...]
> 10 e0355132472d4d0ae1cc044412b4033bd5e1a48a ima-ns sha256:353e4d6b807056757fb5df31bafe7df80605bec20b445d5e9afd949ca4147d49 /usr/bin/unshare 6582e360-1354-42b9-a6ef-ee1993d982da
> 10 f257f5a12fd6e28d32b367a2e453c3badd0e8774 ima-ns sha256:2a7c66fc7e19acc100ee2b777b71179043fade8b81968828522cf31e6a96eaa7 /usr/bin/bash e496e384-4133-4d57-b93a-1812b83badf2
> 10 1bb206dbdf18f75e4515aeef378ba50e555a9291 ima-ns sha256:795fb52db49b211450c7242dbcad00d782a7b8174f669c259f74a7ccabe03a90 /usr/bin/id e496e384-4133-4d57-b93a-1812b83badf2
> 
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

Acked-by: Serge Hallyn <serge@hallyn.com>

> ---
>  include/linux/ima.h                       |  1 +
>  security/integrity/ima/Kconfig            |  6 +++++-
>  security/integrity/ima/ima_template.c     |  6 +++++-
>  security/integrity/ima/ima_template_lib.c | 24 ++++++++++++++++++++++-
>  security/integrity/ima/ima_template_lib.h |  4 ++++
>  5 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/ima.h b/include/linux/ima.h
> index b6ab66a546ae..09b14b73889e 100644
> --- a/include/linux/ima.h
> +++ b/include/linux/ima.h
> @@ -11,6 +11,7 @@
>  #include <linux/fs.h>
>  #include <linux/security.h>
>  #include <linux/kexec.h>
> +#include <linux/user_namespace.h>
>  #include <crypto/hash_info.h>
>  struct linux_binprm;
>  
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index f3a9cc201c8c..4f0ce241b585 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -69,7 +69,8 @@ choice
>  	  hash, defined as 20 bytes, and a null terminated pathname,
>  	  limited to 255 characters.  The 'ima-ng' measurement list
>  	  template permits both larger hash digests and longer
> -	  pathnames.
> +	  pathnames.  The 'ima-ns' adds the namespace uuid to the
> +	  'ima-ng' template.
>  
>  	config IMA_TEMPLATE
>  		bool "ima"
> @@ -77,6 +78,8 @@ choice
>  		bool "ima-ng (default)"
>  	config IMA_SIG_TEMPLATE
>  		bool "ima-sig"
> +	config IMA_NS_TEMPLATE
> +		bool "ima-ns"
>  endchoice
>  
>  config IMA_DEFAULT_TEMPLATE
> @@ -85,6 +88,7 @@ config IMA_DEFAULT_TEMPLATE
>  	default "ima" if IMA_TEMPLATE
>  	default "ima-ng" if IMA_NG_TEMPLATE
>  	default "ima-sig" if IMA_SIG_TEMPLATE
> +	default "ima-ns" if IMA_NS_TEMPLATE
>  
>  choice
>  	prompt "Default integrity hash algorithm"
> diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
> index 694560396be0..14e02eb3d0f3 100644
> --- a/security/integrity/ima/ima_template.c
> +++ b/security/integrity/ima/ima_template.c
> @@ -24,6 +24,7 @@ static struct ima_template_desc builtin_templates[] = {
>  	{.name = "ima-modsig", .fmt = "d-ng|n-ng|sig|d-modsig|modsig"},
>  	{.name = "evm-sig",
>  	 .fmt = "d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode"},
> +	{.name = "ima-ns", .fmt = "d-ng|n-ng|ns"},
>  	{.name = "", .fmt = ""},	/* placeholder for a custom format */
>  };
>  
> @@ -64,6 +65,9 @@ static const struct ima_template_field supported_fields[] = {
>  	{.field_id = "xattrvalues",
>  	 .field_init = ima_eventinodexattrvalues_init,
>  	 .field_show = ima_show_template_sig},
> +	{.field_id = "ns",
> +	 .field_init = ima_ns_init,
> +	 .field_show = ima_show_template_uuid},
>  };
>  
>  /*
> @@ -72,7 +76,7 @@ static const struct ima_template_field supported_fields[] = {
>   * description as 'd-ng' and 'n-ng' respectively.
>   */
>  #define MAX_TEMPLATE_NAME_LEN \
> -	sizeof("d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode")
> +	sizeof("d-ng|n-ng|evmsig|xattrnames|xattrlengths|xattrvalues|iuid|igid|imode|ns")
>  
>  static struct ima_template_desc *ima_template;
>  static struct ima_template_desc *ima_buf_template;
> diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
> index ca017cae73eb..ebd54c1b5206 100644
> --- a/security/integrity/ima/ima_template_lib.c
> +++ b/security/integrity/ima/ima_template_lib.c
> @@ -26,7 +26,8 @@ enum data_formats {
>  	DATA_FMT_DIGEST_WITH_ALGO,
>  	DATA_FMT_STRING,
>  	DATA_FMT_HEX,
> -	DATA_FMT_UINT
> +	DATA_FMT_UINT,
> +	DATA_FMT_UUID,
>  };
>  
>  static int ima_write_template_field_data(const void *data, const u32 datalen,
> @@ -120,6 +121,9 @@ static void ima_show_template_data_ascii(struct seq_file *m,
>  			break;
>  		}
>  		break;
> +	case DATA_FMT_UUID:
> +		seq_printf(m, "%pU", buf_ptr);
> +		break;
>  	default:
>  		break;
>  	}
> @@ -202,6 +206,12 @@ void ima_show_template_uint(struct seq_file *m, enum ima_show_type show,
>  	ima_show_template_field_data(m, show, DATA_FMT_UINT, field_data);
>  }
>  
> +void ima_show_template_uuid(struct seq_file *m, enum ima_show_type show,
> +			    struct ima_field_data *field_data)
> +{
> +	ima_show_template_field_data(m, show, DATA_FMT_UUID, field_data);
> +}
> +
>  /**
>   * ima_parse_buf() - Parses lengths and data from an input buffer
>   * @bufstartp:       Buffer start address.
> @@ -685,3 +695,15 @@ int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
>  {
>  	return ima_eventinodexattrs_init_common(event_data, field_data, 'v');
>  }
> +
> +/*
> + *  ima_ns_init - include the namespace UUID as part of the template
> + *  data
> + */
> +int ima_ns_init(struct ima_event_data *event_data,
> +		struct ima_field_data *field_data)
> +{
> +	return ima_write_template_field_data(&current_user_ns()->uuid,
> +					     UUID_SIZE, DATA_FMT_UUID,
> +					     field_data);
> +}
> diff --git a/security/integrity/ima/ima_template_lib.h b/security/integrity/ima/ima_template_lib.h
> index c71f1de95753..6ea2156271ae 100644
> --- a/security/integrity/ima/ima_template_lib.h
> +++ b/security/integrity/ima/ima_template_lib.h
> @@ -29,6 +29,8 @@ void ima_show_template_buf(struct seq_file *m, enum ima_show_type show,
>  			   struct ima_field_data *field_data);
>  void ima_show_template_uint(struct seq_file *m, enum ima_show_type show,
>  			    struct ima_field_data *field_data);
> +void ima_show_template_uuid(struct seq_file *m, enum ima_show_type show,
> +			    struct ima_field_data *field_data);
>  int ima_parse_buf(void *bufstartp, void *bufendp, void **bufcurp,
>  		  int maxfields, struct ima_field_data *fields, int *curfields,
>  		  unsigned long *len_mask, int enforce_mask, char *bufname);
> @@ -62,4 +64,6 @@ int ima_eventinodexattrlengths_init(struct ima_event_data *event_data,
>  				    struct ima_field_data *field_data);
>  int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
>  				   struct ima_field_data *field_data);
> +int ima_ns_init(struct ima_event_data *event_data,
> +		struct ima_field_data *field_data);
>  #endif /* __LINUX_IMA_TEMPLATE_LIB_H */
> -- 
> 2.33.0

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-27 16:45 ` [RFC 3/3] ima: make the integrity inode cache per namespace James Bottomley
@ 2021-11-29  4:58   ` Serge E. Hallyn
  2021-11-29 12:50     ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-29  4:58 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Serge E . Hallyn, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
> Currently we get one entry in the IMA log per unique file event.  So,
> if you have a measurement policy and it measures a particular binary
> it will not get measured again if it is subsequently executed. For
> Namespaced IMA, the correct behaviour seems to be to log once per
> inode per namespace (so every unique execution in a namespace gets a
> separate log entry).  Since logging once per inode per namespace is

I suspect I'll need to do a more in depth reading of the existing
code, but I'll ask the lazy question anyway (since you say "the
correct behavior seems to be") - is it actually important that
files which were appraised under a parent namespace's policy already
should be logged again?  Since you used the word "log" I'm assuming
this isn't building a running hash like a tpm pcr where skipping one
would invalidate rmeote attestation?

> different from current behaviour, it is only activated if the
> namespace appears in the log template (so there's no behaviour change
> for any of the original templates).
> 
> Expand the iint cache to have a list of namespaces, per iint entry,
> the inode has been seen in by moving the action flags and the
> measured_pcrs into a per-namespace structure.  The lifetime of these
> additional list entries is tied to the lifetime of the iint entry and
> the namespace, so if either is deleted, the new entry is.
> 
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> ---
>  include/linux/ima.h                       |  13 ++-
>  include/linux/user_namespace.h            |   5 +
>  kernel/user.c                             |   1 +
>  kernel/user_namespace.c                   |   6 ++
>  security/integrity/iint.c                 |   4 +-
>  security/integrity/ima/Makefile           |   2 +-
>  security/integrity/ima/ima.h              |  21 +++-
>  security/integrity/ima/ima_api.c          |   7 +-
>  security/integrity/ima/ima_main.c         |  21 ++--
>  security/integrity/ima/ima_ns.c           | 115 ++++++++++++++++++++++
>  security/integrity/ima/ima_policy.c       |   2 +-
>  security/integrity/ima/ima_template_lib.c |   2 +
>  security/integrity/integrity.h            |  11 ++-
>  13 files changed, 192 insertions(+), 18 deletions(-)
>  create mode 100644 security/integrity/ima/ima_ns.c
> 
> diff --git a/include/linux/ima.h b/include/linux/ima.h
> index 09b14b73889e..dbbc0257d065 100644
> --- a/include/linux/ima.h
> +++ b/include/linux/ima.h
> @@ -40,7 +40,8 @@ extern int ima_measure_critical_data(const char *event_label,
>  				     const char *event_name,
>  				     const void *buf, size_t buf_len,
>  				     bool hash, u8 *digest, size_t digest_len);
> -
> +extern void ima_init_user_ns(struct user_namespace *ns);
> +extern void ima_free_user_ns(struct user_namespace *ns);
>  #ifdef CONFIG_IMA_APPRAISE_BOOTPARAM
>  extern void ima_appraise_parse_cmdline(void);
>  #else
> @@ -154,6 +155,16 @@ static inline int ima_measure_critical_data(const char *event_label,
>  	return -ENOENT;
>  }
>  
> +static inline void ima_init_user_ns(struct user_namespace *ns)
> +{
> +	return;
> +}
> +
> +static inline void ima_free_user_ns(struct user_namespace *ns)
> +{
> +	return;
> +}
> +
>  #endif /* CONFIG_IMA */
>  
>  #ifndef CONFIG_IMA_KEXEC
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index d155788abdc1..52968764b195 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -8,6 +8,7 @@
>  #include <linux/sched.h>
>  #include <linux/workqueue.h>
>  #include <linux/rwsem.h>
> +#include <linux/rwlock.h>
>  #include <linux/sysctl.h>
>  #include <linux/err.h>
>  #include <linux/uuid.h>
> @@ -101,6 +102,10 @@ struct user_namespace {
>  	struct ucounts		*ucounts;
>  	long ucount_max[UCOUNT_COUNTS];
>  	uuid_t			uuid;
> +#ifdef CONFIG_IMA
> +	struct list_head	ima_inode_list;
> +	rwlock_t		ima_inode_list_lock;
> +#endif
>  } __randomize_layout;
>  
>  struct ucounts {
> diff --git a/kernel/user.c b/kernel/user.c
> index bf9ae1d0b670..670d3c41c817 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -68,6 +68,7 @@ struct user_namespace init_user_ns = {
>  	.keyring_sem = __RWSEM_INITIALIZER(init_user_ns.keyring_sem),
>  #endif
>  	/* .uuid is initialized in user_namespaces_init() */
> +	/* all IMA fields are initialized by ima_init_user_ns() */
>  };
>  EXPORT_SYMBOL_GPL(init_user_ns);
>  
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 8ce57c16ddd3..8afa5dc0b992 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  
>  #include <linux/export.h>
> +#include <linux/ima.h>
>  #include <linux/nsproxy.h>
>  #include <linux/slab.h>
>  #include <linux/sched/signal.h>
> @@ -144,6 +145,9 @@ int create_user_ns(struct cred *new)
>  	uuid_gen(&ns->uuid);
>  
>  	set_cred_user_ns(new, ns);
> +
> +	ima_init_user_ns(ns);
> +
>  	return 0;
>  fail_keyring:
>  #ifdef CONFIG_PERSISTENT_KEYRINGS
> @@ -200,6 +204,7 @@ static void free_user_ns(struct work_struct *work)
>  		}
>  		retire_userns_sysctls(ns);
>  		key_free_user_ns(ns);
> +		ima_free_user_ns(ns);
>  		ns_free_inum(&ns->ns);
>  		kmem_cache_free(user_ns_cachep, ns);
>  		dec_user_namespaces(ucounts);
> @@ -1389,6 +1394,7 @@ static __init int user_namespaces_init(void)
>  {
>  	user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC | SLAB_ACCOUNT);
>  	uuid_gen(&init_user_ns.uuid);
> +	ima_init_user_ns(&init_user_ns);
>  	return 0;
>  }
>  subsys_initcall(user_namespaces_init);
> diff --git a/security/integrity/iint.c b/security/integrity/iint.c
> index 8638976f7990..f714532feb7d 100644
> --- a/security/integrity/iint.c
> +++ b/security/integrity/iint.c
> @@ -71,6 +71,7 @@ struct integrity_iint_cache *integrity_iint_find(struct inode *inode)
>  static void iint_free(struct integrity_iint_cache *iint)
>  {
>  	kfree(iint->ima_hash);
> +	ima_ns_iint_list_free(iint);
>  	iint->ima_hash = NULL;
>  	iint->version = 0;
>  	iint->flags = 0UL;
> @@ -81,7 +82,7 @@ static void iint_free(struct integrity_iint_cache *iint)
>  	iint->ima_read_status = INTEGRITY_UNKNOWN;
>  	iint->ima_creds_status = INTEGRITY_UNKNOWN;
>  	iint->evm_status = INTEGRITY_UNKNOWN;
> -	iint->measured_pcrs = 0;
> +	INIT_LIST_HEAD(&iint->ns_list);
>  	kmem_cache_free(iint_cache, iint);
>  }
>  
> @@ -170,6 +171,7 @@ static void init_once(void *foo)
>  	iint->ima_creds_status = INTEGRITY_UNKNOWN;
>  	iint->evm_status = INTEGRITY_UNKNOWN;
>  	mutex_init(&iint->mutex);
> +	INIT_LIST_HEAD(&iint->ns_list);
>  }
>  
>  static int __init integrity_iintcache_init(void)
> diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
> index 2499f2485c04..1741aa5d97bc 100644
> --- a/security/integrity/ima/Makefile
> +++ b/security/integrity/ima/Makefile
> @@ -7,7 +7,7 @@
>  obj-$(CONFIG_IMA) += ima.o
>  
>  ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
> -	 ima_policy.o ima_template.o ima_template_lib.o
> +	 ima_policy.o ima_template.o ima_template_lib.o ima_ns.o
>  ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
>  ima-$(CONFIG_IMA_APPRAISE_MODSIG) += ima_modsig.o
>  ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index be965a8715e4..047256be7195 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -119,6 +119,16 @@ struct ima_kexec_hdr {
>  	u64 count;
>  };
>  
> +/* IMA cache of per-user-namespace flags */
> +struct ima_ns_cache {
> +	struct user_namespace *ns;
> +	struct integrity_iint_cache *iint;
> +	struct list_head ns_list;
> +	struct list_head iint_list;
> +	unsigned long measured_pcrs;
> +	unsigned long flags;
> +};
> +
>  extern const int read_idmap[];
>  
>  #ifdef CONFIG_HAVE_IMA_KEXEC
> @@ -263,7 +273,8 @@ int ima_must_measure(struct inode *inode, int mask, enum ima_hooks func);
>  int ima_collect_measurement(struct integrity_iint_cache *iint,
>  			    struct file *file, void *buf, loff_t size,
>  			    enum hash_algo algo, struct modsig *modsig);
> -void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
> +void ima_store_measurement(struct integrity_iint_cache *iint,
> +			   struct ima_ns_cache *nsc, struct file *file,
>  			   const unsigned char *filename,
>  			   struct evm_ima_xattr_data *xattr_value,
>  			   int xattr_len, const struct modsig *modsig, int pcr,
> @@ -301,6 +312,14 @@ void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos);
>  void ima_policy_stop(struct seq_file *m, void *v);
>  int ima_policy_show(struct seq_file *m, void *v);
>  
> +/* IMA Namespace related functions */
> +extern bool ima_ns_in_template;
> +struct ima_ns_cache *ima_ns_cache_get(struct integrity_iint_cache *iint,
> +				      struct user_namespace *ns);
> +void ima_ns_cache_clear(struct integrity_iint_cache *iint);
> +void ima_init_ns(void);
> +
> +
>  /* Appraise integrity measurements */
>  #define IMA_APPRAISE_ENFORCE	0x01
>  #define IMA_APPRAISE_FIX	0x02
> diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
> index a64fb0130b01..d613ea1ee378 100644
> --- a/security/integrity/ima/ima_api.c
> +++ b/security/integrity/ima/ima_api.c
> @@ -298,6 +298,7 @@ int ima_collect_measurement(struct integrity_iint_cache *iint,
>   * Must be called with iint->mutex held.
>   */
>  void ima_store_measurement(struct integrity_iint_cache *iint,
> +			   struct ima_ns_cache *nsc,
>  			   struct file *file, const unsigned char *filename,
>  			   struct evm_ima_xattr_data *xattr_value,
>  			   int xattr_len, const struct modsig *modsig, int pcr,
> @@ -322,7 +323,7 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
>  	 * appraisal, but a file measurement from earlier might already exist in
>  	 * the measurement list.
>  	 */
> -	if (iint->measured_pcrs & (0x1 << pcr) && !modsig)
> +	if (nsc->measured_pcrs & (0x1 << pcr) && !modsig)
>  		return;
>  
>  	result = ima_alloc_init_template(&event_data, &entry, template_desc);
> @@ -334,8 +335,8 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
>  
>  	result = ima_store_template(entry, violation, inode, filename, pcr);
>  	if ((!result || result == -EEXIST) && !(file->f_flags & O_DIRECT)) {
> -		iint->flags |= IMA_MEASURED;
> -		iint->measured_pcrs |= (0x1 << pcr);
> +		nsc->flags |= IMA_MEASURED;
> +		nsc->measured_pcrs |= (0x1 << pcr);
>  	}
>  	if (result < 0)
>  		ima_free_template_entry(entry);
> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index 465865412100..049710203fac 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -168,8 +168,8 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
>  		if (!IS_I_VERSION(inode) ||
>  		    !inode_eq_iversion(inode, iint->version) ||
>  		    (iint->flags & IMA_NEW_FILE)) {
> -			iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
> -			iint->measured_pcrs = 0;
> +			iint->flags &= ~IMA_NEW_FILE;
> +			ima_ns_cache_clear(iint);
>  			if (update)
>  				ima_update_xattr(iint, file);
>  		}
> @@ -204,6 +204,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
>  {
>  	struct inode *inode = file_inode(file);
>  	struct integrity_iint_cache *iint = NULL;
> +	struct ima_ns_cache *nsc = NULL;
>  	struct ima_template_desc *template_desc = NULL;
>  	char *pathbuf = NULL;
>  	char filename[NAME_MAX];
> @@ -274,20 +275,20 @@ static int process_measurement(struct file *file, const struct cred *cred,
>  	    ((inode->i_sb->s_iflags & SB_I_IMA_UNVERIFIABLE_SIGNATURE) &&
>  	     !(inode->i_sb->s_iflags & SB_I_UNTRUSTED_MOUNTER) &&
>  	     !(action & IMA_FAIL_UNVERIFIABLE_SIGS))) {
> -		iint->flags &= ~IMA_DONE_MASK;
> -		iint->measured_pcrs = 0;
> +		ima_ns_cache_clear(iint);
>  	}
> +	nsc = ima_ns_cache_get(iint, current_user_ns());
>  
>  	/* Determine if already appraised/measured based on bitmask
>  	 * (IMA_MEASURE, IMA_MEASURED, IMA_XXXX_APPRAISE, IMA_XXXX_APPRAISED,
>  	 *  IMA_AUDIT, IMA_AUDITED)
>  	 */
> -	iint->flags |= action;
> +	nsc->flags |= action;
>  	action &= IMA_DO_MASK;
> -	action &= ~((iint->flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
> +	action &= ~((nsc->flags & (IMA_DONE_MASK ^ IMA_MEASURED)) >> 1);
>  
>  	/* If target pcr is already measured, unset IMA_MEASURE action */
> -	if ((action & IMA_MEASURE) && (iint->measured_pcrs & (0x1 << pcr)))
> +	if ((action & IMA_MEASURE) && (nsc->measured_pcrs & (0x1 << pcr)))
>  		action ^= IMA_MEASURE;
>  
>  	/* HASH sets the digital signature and update flags, nothing else */
> @@ -297,7 +298,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
>  		if ((xattr_value && xattr_len > 2) &&
>  		    (xattr_value->type == EVM_IMA_XATTR_DIGSIG))
>  			set_bit(IMA_DIGSIG, &iint->atomic_flags);
> -		iint->flags |= IMA_HASHED;
> +		nsc->flags |= IMA_HASHED;
>  		action ^= IMA_HASH;
>  		set_bit(IMA_UPDATE_XATTR, &iint->atomic_flags);
>  	}
> @@ -342,7 +343,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
>  		pathname = ima_d_path(&file->f_path, &pathbuf, filename);
>  
>  	if (action & IMA_MEASURE)
> -		ima_store_measurement(iint, file, pathname,
> +		ima_store_measurement(iint, nsc, file, pathname,
>  				      xattr_value, xattr_len, modsig, pcr,
>  				      template_desc);
>  	if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
> @@ -1047,6 +1048,8 @@ static int __init init_ima(void)
>  	if (error)
>  		return error;
>  
> +	ima_init_ns();
> +
>  	error = register_blocking_lsm_notifier(&ima_lsm_policy_notifier);
>  	if (error)
>  		pr_warn("Couldn't register LSM notifier, error %d\n", error);
> diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
> new file mode 100644
> index 000000000000..7134ba7f3544
> --- /dev/null
> +++ b/security/integrity/ima/ima_ns.c
> @@ -0,0 +1,115 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/*
> + * IMA Namespace Support routines
> + */
> +
> +#include <linux/ima.h>
> +#include <linux/user_namespace.h>
> +
> +#include "ima.h"
> +
> +static struct kmem_cache *ns_cache __read_mostly;
> +bool ima_ns_in_template __read_mostly;
> +
> +void __init ima_init_ns(void)
> +{
> +	ns_cache = KMEM_CACHE(ima_ns_cache, SLAB_PANIC);
> +}
> +
> +void ima_init_user_ns(struct user_namespace *ns)
> +{
> +	INIT_LIST_HEAD(&ns->ima_inode_list);
> +}
> +
> +void ima_free_user_ns(struct user_namespace *ns)
> +{
> +	struct ima_ns_cache *entry;
> +
> +	/* no refs to ns left, so no need to lock */
> +	while (!list_empty(&ns->ima_inode_list)) {
> +		entry = list_entry(ns->ima_inode_list.next, struct ima_ns_cache,
> +				   ns_list);
> +
> +		/* iint cache entry is still active to lock to delete */
> +		write_lock(&entry->iint->ns_list_lock);
> +		list_del(&entry->iint_list);
> +		write_unlock(&entry->iint->ns_list_lock);
> +
> +		list_del(&entry->ns_list);
> +		kmem_cache_free(ns_cache, entry);
> +	}
> +}
> +
> +struct ima_ns_cache *ima_ns_cache_get(struct integrity_iint_cache *iint,
> +				      struct user_namespace *ns)
> +{
> +	struct ima_ns_cache *entry = NULL;
> +
> +	if (!ima_ns_in_template)
> +		/*
> +		 * if we're not logging the namespace, don't separate the
> +		 * iint cache per namespace.  This preserves original
> +		 * behaviour for the non-ns case.
> +		 */
> +		ns = &init_user_ns;
> +
> +	read_lock(&iint->ns_list_lock);
> +	list_for_each_entry(entry, &iint->ns_list, ns_list)
> +		if (entry->ns == ns)
> +			break;
> +	read_unlock(&iint->ns_list_lock);
> +
> +	if (entry && entry->ns == ns)
> +		return entry;
> +
> +	entry = kmem_cache_zalloc(ns_cache, GFP_NOFS);
> +	if (!entry)
> +		return NULL;
> +
> +	/* no refs taken: entry is freed on either ns delete or iint delete */
> +	entry->ns = ns;
> +	entry->iint = iint;
> +
> +	write_lock(&iint->ns_list_lock);
> +	list_add_tail(&entry->iint_list, &iint->ns_list);
> +	write_unlock(&iint->ns_list_lock);
> +
> +	write_lock(&ns->ima_inode_list_lock);
> +	list_add_tail(&entry->ns_list, &ns->ima_inode_list);
> +	write_unlock(&ns->ima_inode_list_lock);
> +
> +	return entry;
> +}
> +
> +/* clear the flags and measured PCR for every entry in the iint  */
> +void ima_ns_cache_clear(struct integrity_iint_cache *iint)
> +{
> +	struct ima_ns_cache *entry;
> +
> +	read_lock(&iint->ns_list_lock);
> +	list_for_each_entry(entry, &iint->ns_list, ns_list) {
> +		entry->flags = 0;
> +		entry->measured_pcrs = 0;
> +	}
> +	read_unlock(&iint->ns_list_lock);
> +}
> +
> +void ima_ns_iint_list_free(struct integrity_iint_cache *iint)
> +{
> +	struct ima_ns_cache *entry;
> +
> +	/* iint locking unnecessary; no-one should have acces to the list */
> +	while (!list_empty(&iint->ns_list)) {
> +		entry = list_entry(iint->ns_list.next, struct ima_ns_cache,
> +				   iint_list);
> +
> +		/* namespace is still active to lock to delete */
> +		write_lock(&entry->ns->ima_inode_list_lock);
> +		list_del(&entry->ns_list);
> +		write_unlock(&entry->ns->ima_inode_list_lock);
> +
> +		list_del(&entry->iint_list);
> +		kmem_cache_free(ns_cache, entry);
> +	}
> +}
> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> index 320ca80aacab..9434a1064da6 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -50,7 +50,7 @@
>  #define DONT_HASH	0x0200
>  
>  #define INVALID_PCR(a) (((a) < 0) || \
> -	(a) >= (sizeof_field(struct integrity_iint_cache, measured_pcrs) * 8))
> +	(a) >= (sizeof_field(struct ima_ns_cache, measured_pcrs) * 8))
>  
>  int ima_policy_flag;
>  static int temp_ima_appraise;
> diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
> index ebd54c1b5206..d1a381add127 100644
> --- a/security/integrity/ima/ima_template_lib.c
> +++ b/security/integrity/ima/ima_template_lib.c
> @@ -703,6 +703,8 @@ int ima_eventinodexattrvalues_init(struct ima_event_data *event_data,
>  int ima_ns_init(struct ima_event_data *event_data,
>  		struct ima_field_data *field_data)
>  {
> +	if (unlikely(!ima_ns_in_template))
> +		ima_ns_in_template = true;
>  	return ima_write_template_field_data(&current_user_ns()->uuid,
>  					     UUID_SIZE, DATA_FMT_UUID,
>  					     field_data);
> diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
> index 547425c20e11..8755635da3ff 100644
> --- a/security/integrity/integrity.h
> +++ b/security/integrity/integrity.h
> @@ -129,7 +129,6 @@ struct integrity_iint_cache {
>  	struct inode *inode;	/* back pointer to inode in question */
>  	u64 version;		/* track inode changes */
>  	unsigned long flags;
> -	unsigned long measured_pcrs;
>  	unsigned long atomic_flags;
>  	enum integrity_status ima_file_status:4;
>  	enum integrity_status ima_mmap_status:4;
> @@ -138,6 +137,8 @@ struct integrity_iint_cache {
>  	enum integrity_status ima_creds_status:4;
>  	enum integrity_status evm_status:4;
>  	struct ima_digest_data *ima_hash;
> +	struct list_head ns_list; /* list of namespaces inode seen in */
> +	rwlock_t ns_list_lock;
>  };
>  
>  /* rbtree tree calls to lookup, insert, delete
> @@ -225,6 +226,14 @@ static inline void ima_load_x509(void)
>  }
>  #endif
>  
> +#ifdef CONFIG_IMA
> +void ima_ns_iint_list_free(struct integrity_iint_cache *iint);
> +#else
> +void ima_ns_iint_list_free(struct integrity_iint_cache *iint)
> +{
> +}
> +#endif
> +
>  #ifdef CONFIG_EVM_LOAD_X509
>  void __init evm_load_x509(void);
>  #else
> -- 
> 2.33.0

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29  4:58   ` Serge E. Hallyn
@ 2021-11-29 12:50     ` James Bottomley
  2021-11-29 13:53       ` Stefan Berger
  0 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-29 12:50 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
> > Currently we get one entry in the IMA log per unique file
> > event.  So, if you have a measurement policy and it measures a
> > particular binary it will not get measured again if it is
> > subsequently executed. For Namespaced IMA, the correct behaviour
> > seems to be to log once per inode per namespace (so every unique
> > execution in a namespace gets a separate log entry).  Since logging
> > once per inode per namespace is
> 
> I suspect I'll need to do a more in depth reading of the existing
> code, but I'll ask the lazy question anyway (since you say "the
> correct behavior seems to be") - is it actually important that
> files which were appraised under a parent namespace's policy already
> should be logged again?

I think so.  For a couple of reasons, assuming the namespace eventually
gets its own log entries, which the next incremental patch proposed to
do by virtualizing the securityfs entries.  If you don't do this:

   1. The namespace log will be incomplete in a random way depending on
      what execution preceded it.  This violates the principle of least
      surprise, because you at least expect the IMA log to be consistent
      per set of executions.
   2. A malicious namespace owner can use the missing information in their
      log to infer the execution of others, which is an information leak
      IMA tries to prevent by keeping the log readable only by root.

>   Since you used the word "log" I'm assuming this isn't building a
> running hash like a tpm pcr where skipping one would invalidate
> rmeote attestation?

Their is only one IMA log, so each entry must be extended through the
PCR to keep the quote correct.  That means each per namespace entry
does extend the PCR.  However, since the entire entry is logged and the
namespace uuid is part of the entry, the entry hash is different for
each, so the IMA rule about duplicate log entries doesn't apply, if
that was the worry?

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-28 21:49               ` Serge E. Hallyn
  2021-11-28 22:56                 ` James Bottomley
@ 2021-11-29 13:12                 ` Christian Brauner
  2021-11-29 13:46                   ` James Bottomley
  1 sibling, 1 reply; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 13:12 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: James Bottomley, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Stefan Berger, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Sun, Nov 28, 2021 at 03:49:06PM -0600, Serge Hallyn wrote:
> On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> > > > > On Sun, Nov 28, 2021 at 08:29:21AM -0500, James Bottomley wrote:
> > > > > > On Sat, 2021-11-27 at 22:45 -0600, Serge E. Hallyn wrote:
> > > > > > > On Sat, Nov 27, 2021 at 04:45:47PM +0000, James Bottomley
> > > > > > > wrote:
> > > > > > > > As a precursor to namespacing IMA a way of uniquely
> > > > > > > > identifying the namespace to appear in the IMA log is
> > > > > > > > needed.  This log may be transported away from the running
> > > > > > > > system and may be analyzed even after the system has been
> > > > > > > > rebooted.  Thus we need a way of identifying namespaces in
> > > > > > > > the log which is unique.  UUID, being designed
> > > > > > > > probabilistically never to repeat, fits this bill
> > > > > > > > so add it to the user_namespace which we'll also use for
> > > > > > > > namespacing IMA.
> > > > > > > 
> > > > > > > If the logs run across 5 boots, is it important to you that
> > > > > > > the uuid be unique across all 5 boots?  Would it suffice to
> > > > > > > have a per-boot unique count and report that plus some
> > > > > > > indicator of the current boot (like boot time in jiffies)?
> > > > > > 
> > > > > > For the purposes of IMA it's only really important to have the
> > > > > > uuid be unique within the particular log ... i.e. unique per
> > > > > > boot.  However, given the prevalence of uuids elsewhere and the
> > > > > > fact we have no current per-boot unique label for the namespace
> > > > > > (the inode number could repeat), it seemed reasonable to employ
> > > > > > uuids for this rather than invent a different identifier.  Plus
> > > > > > IMA isn't going to complain if we have a globally unique
> > > > > > identifier
> > > > > > ...
> > > > > 
> > > > > Ok - Note I'm not saying I heavily object, but I'm mildly
> > > > > concerned about users who happen to spin off a lot of user
> > > > > namespaces for quick jobs being penalized.
> > > > 
> > > > Well, that's why I use the uuid_gen coupled to prandom ... there
> > > > shouldn't be a measurable overhead generating it.
> > > 
> > > Does prandom have *no*, or just little effect on the entopy pool?
> > > Tried briefly looking at prandom_u32, not quite getting how it's
> > > using net_rand_state - it reads it and uses it but doesn't make
> > > any changes to it?
> > 
> > It has a first use effect to get the seed but once that happens it has
> > no further effect on the entropy pool.
> 
> Gotcha - thanks.
> 
> > > > >   I suspect Eric will also worry about the namespacing
> > > > > implications - i.e. people *will* want to start restoring user
> > > > > namespaces with a previously used uuid.
> > > > 
> > > > So this is a problem I tried to address in the last paragraph.  If
> > > > I put any marker on a namespace, people are potentially going to
> > > > want to save and restore it. The bottom line is that ima logs are
> > > > add only.  You can't save and restore them so we're already dealing
> > > > with something that can't be CRIU transported.  I had hoped that it
> > > > would be obvious that a randomly generated uuid, whose uniqueness
> > > > depends on random generation likewise can't be saved and restored
> > > > because we'd have no way to prevent a clash.
> > > 
> > > Yes but you're making this a general user_namespace struct member.
> > > So once that's there people will want to export it, use it for
> > > things other than ima.
> > 
> > Yes, that's why I did it.  However, the property of uniqueness for all
> > uuid type things depends on randomness, so ipso facto, they can never
> > be settable.
> > 
> > > > > So given that 'unique per boot' is sufficient, what would be the
> > > > > problem with simply adding a simple ever-increasing unique atomix
> > > > > count to the struct user_namespace?
> > > > 
> > > > I don't think there is any ... but I equally don't see why people
> > > > would want to save and restore the uuid but not the new monotonic
> > > > identifier ... because it's still just a marker on a namespace.
> > > 
> > > But you've called it "the namespace uuid".  I'm not even really
> > > thinking of checkpoint/restart, just stopping and restarting a
> > > container.  I'm convinced people will want to start using it because,
> > > well, it is a nice feature.
> > 
> > Right, but the uniqueness property depends on you not being able to set
> > it.  If you just want a namespace label, you can have that, but
> > anything a user can set is either a pain to guarantee uniqueness (have
> > to check all the other objects) or is simply a non-unique label.
> > 
> > If you want to label a container, which could have many namespaces and
> > be stopped and restarted many times, it does sound like you want a non-
> > unique settable label.  However, IMA definitely needs a guaranteed per
> > namespace unique label.
> > 
> > Is the objection simply you think a UUID sound like it should be
> 
> Objection is too strong.  Concern.
> 
> But yes, to me a uuid (a) feels like it should be generally useful
> including being settable and (b) not super duper 100% absolutely
> guaranteed to always be unique per boot, as an incremented counter
> would be.

I don't have strong feelings about uuid or counter. In something like an
IMA log a uuid might be better but idk. I don't know enough about IMA to
judge that.
I see the problem you're getting at though. Making this a member of
struct user_namespace will give the impression that is a generic
identifier. I'd rather make it clear that this is an IMA-only thing.
Ideally by not having it be a member of struct user_namespace at all or
at least by naming it ima_uuid or similar.

Christian

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-29 13:12                 ` Christian Brauner
@ 2021-11-29 13:46                   ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 13:46 UTC (permalink / raw)
  To: Christian Brauner, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Stefan Berger, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 14:12 +0100, Christian Brauner wrote:
> On Sun, Nov 28, 2021 at 03:49:06PM -0600, Serge Hallyn wrote:
> > On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> > > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley
> > > > wrote:
> > > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
[...]
> > > > > > So given that 'unique per boot' is sufficient, what would
> > > > > > be the problem with simply adding a simple ever-increasing
> > > > > > unique atomix count to the struct user_namespace?
> > > > > 
> > > > > I don't think there is any ... but I equally don't see why
> > > > > people would want to save and restore the uuid but not the
> > > > > new monotonic identifier ... because it's still just a marker
> > > > > on a namespace.
> > > > 
> > > > But you've called it "the namespace uuid".  I'm not even really
> > > > thinking of checkpoint/restart, just stopping and restarting a
> > > > container.  I'm convinced people will want to start using it
> > > > because, well, it is a nice feature.
> > > 
> > > Right, but the uniqueness property depends on you not being able
> > > to set it.  If you just want a namespace label, you can have
> > > that, but anything a user can set is either a pain to guarantee
> > > uniqueness (have to check all the other objects) or is simply a
> > > non-unique label.
> > > 
> > > If you want to label a container, which could have many
> > > namespaces and be stopped and restarted many times, it does sound
> > > like you want a non-unique settable label.  However, IMA
> > > definitely needs a guaranteed per namespace unique label.
> > > 
> > > Is the objection simply you think a UUID sound like it should be
> > 
> > Objection is too strong.  Concern.
> > 
> > But yes, to me a uuid (a) feels like it should be generally useful
> > including being settable and (b) not super duper 100% absolutely
> > guaranteed to always be unique per boot, as an incremented counter
> > would be.
> 
> I don't have strong feelings about uuid or counter. In something like
> an IMA log a uuid might be better but idk. I don't know enough about
> IMA to judge that. I see the problem you're getting at though. Making
> this a member of struct user_namespace will give the impression that
> is a generic identifier. I'd rather make it clear that this is an
> IMA-only thing.

I don't think that's possible; let's unwind and look at the
requirements:  We need a per log (i.e. per boot) unique label, that the
user of the system can't tamper with, for the namespace to record in
the IMA log.  The chances are orchestration systems are likely going to
want a way to query the label so they can tie their own internal name
for a container that is using namespaced IMA to the entry in the log. 
Given these two requirements, it can't be an IMA only thing, it has to
be some sort of marker on the namespace itself.

> Ideally by not having it be a member of struct user_namespace at all
> or at least by naming it ima_uuid or similar.

I can call it ima_uuid, but Serge is right, given people are going to
want to get it, we're not going to be able to prevent it being used for
non-ima things.  The main thing I was trying to make clear is the
unique per boot and tamper proof property means there can't be an API
to set it.

Given the assumption people will use it for other things, it is
necessary to think about the wider issues before settling on a scheme.

From the security perspective I like UUIDs because they would make each
IMA log unique which is a useful property to prevent replay or log
faking, but I think a unique per boot counter would also work.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-29  1:59                   ` Serge E. Hallyn
@ 2021-11-29 13:49                     ` Stefan Berger
  2021-11-29 13:56                       ` Christian Brauner
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 13:49 UTC (permalink / raw)
  To: Serge E. Hallyn, James Bottomley
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner


On 11/28/21 20:59, Serge E. Hallyn wrote:
> On Sun, Nov 28, 2021 at 05:56:28PM -0500, James Bottomley wrote:
>> On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
>>> On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
>>>> On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
>>>>> On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
>>>>>> On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
>> [...]
>>>>>>> So given that 'unique per boot' is sufficient, what would be
>>>>>>> the problem with simply adding a simple ever-increasing
>>>>>>> unique atomix count to the struct user_namespace?
>>>>>> I don't think there is any ... but I equally don't see why
>>>>>> people would want to save and restore the uuid but not the new
>>>>>> monotonic identifier ... because it's still just a marker on a
>>>>>> namespace.
>>>>> But you've called it "the namespace uuid".  I'm not even really
>>>>> thinking of checkpoint/restart, just stopping and restarting a
>>>>> container.  I'm convinced people will want to start using it
>>>>> because, well, it is a nice feature.
>>>> Right, but the uniqueness property depends on you not being able to
>>>> set it.  If you just want a namespace label, you can have that, but
>>>> anything a user can set is either a pain to guarantee uniqueness
>>>> (have to check all the other objects) or is simply a non-unique
>>>> label.
>>>>
>>>> If you want to label a container, which could have many namespaces
>>>> and be stopped and restarted many times, it does sound like you
>>>> want a non-unique settable label.  However, IMA definitely needs a
>>>> guaranteed per namespace unique label.
>>>>
>>>> Is the objection simply you think a UUID sound like it should be
>>> Objection is too strong.  Concern.
>>>
>>> But yes, to me a uuid (a) feels like it should be generally useful
>>> including being settable and (b) not super duper 100% absolutely
>>> guaranteed to always be unique per boot, as an incremented counter
>>> would be.
>> OK, but a bunch of cats I found on the Internet agree with me, a UUID
>> shouldn't be settable:
>>
>> https://en.wikipedia.org/wiki/Universally_unique_identifier
>>
>> The key point being, if you can set the id, it can't be unique ... it
> Ok, so can you just put a comment above there saying "this must not
> be settable from userspace" ?

So I have been working on an IMA namespacing series again as well and 
would like to use some sort of unique identifier for audit messages 
emitted from an IMA/user namespace other than the init_ima_ns. This UUID 
may just work for this, but how would one associate the UUID with a 
container if it ever comes to that when evaluating audit logs? Shouldn't 
it be settable from user space for some sort of 'coordination' between 
container runtime and kernel?

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 12:50     ` James Bottomley
@ 2021-11-29 13:53       ` Stefan Berger
  2021-11-29 14:10         ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 13:53 UTC (permalink / raw)
  To: James Bottomley, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner


On 11/29/21 07:50, James Bottomley wrote:
> On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
>> On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
>>> Currently we get one entry in the IMA log per unique file
>>> event.  So, if you have a measurement policy and it measures a
>>> particular binary it will not get measured again if it is
>>> subsequently executed. For Namespaced IMA, the correct behaviour
>>> seems to be to log once per inode per namespace (so every unique
>>> execution in a namespace gets a separate log entry).  Since logging
>>> once per inode per namespace is
>> I suspect I'll need to do a more in depth reading of the existing
>> code, but I'll ask the lazy question anyway (since you say "the
>> correct behavior seems to be") - is it actually important that
>> files which were appraised under a parent namespace's policy already
>> should be logged again?
> I think so.  For a couple of reasons, assuming the namespace eventually
> gets its own log entries, which the next incremental patch proposed to
> do by virtualizing the securityfs entries.  If you don't do this:

To avoid duplicate efforts, an implementation of a virtualized 
securityfs is in this series here:

https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3

It starts with 'securityfs: Prefix global variables with secruityfs_'

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-29 13:49                     ` Stefan Berger
@ 2021-11-29 13:56                       ` Christian Brauner
  2021-11-29 14:19                         ` Stefan Berger
  2021-11-30 13:09                         ` James Bottomley
  0 siblings, 2 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 13:56 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Serge E. Hallyn, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 08:49:40AM -0500, Stefan Berger wrote:
> 
> On 11/28/21 20:59, Serge E. Hallyn wrote:
> > On Sun, Nov 28, 2021 at 05:56:28PM -0500, James Bottomley wrote:
> > > On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
> > > > On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
> > > > > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > > > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
> > > > > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
> > > [...]
> > > > > > > > So given that 'unique per boot' is sufficient, what would be
> > > > > > > > the problem with simply adding a simple ever-increasing
> > > > > > > > unique atomix count to the struct user_namespace?
> > > > > > > I don't think there is any ... but I equally don't see why
> > > > > > > people would want to save and restore the uuid but not the new
> > > > > > > monotonic identifier ... because it's still just a marker on a
> > > > > > > namespace.
> > > > > > But you've called it "the namespace uuid".  I'm not even really
> > > > > > thinking of checkpoint/restart, just stopping and restarting a
> > > > > > container.  I'm convinced people will want to start using it
> > > > > > because, well, it is a nice feature.
> > > > > Right, but the uniqueness property depends on you not being able to
> > > > > set it.  If you just want a namespace label, you can have that, but
> > > > > anything a user can set is either a pain to guarantee uniqueness
> > > > > (have to check all the other objects) or is simply a non-unique
> > > > > label.
> > > > > 
> > > > > If you want to label a container, which could have many namespaces
> > > > > and be stopped and restarted many times, it does sound like you
> > > > > want a non-unique settable label.  However, IMA definitely needs a
> > > > > guaranteed per namespace unique label.
> > > > > 
> > > > > Is the objection simply you think a UUID sound like it should be
> > > > Objection is too strong.  Concern.
> > > > 
> > > > But yes, to me a uuid (a) feels like it should be generally useful
> > > > including being settable and (b) not super duper 100% absolutely
> > > > guaranteed to always be unique per boot, as an incremented counter
> > > > would be.
> > > OK, but a bunch of cats I found on the Internet agree with me, a UUID
> > > shouldn't be settable:
> > > 
> > > https://en.wikipedia.org/wiki/Universally_unique_identifier
> > > 
> > > The key point being, if you can set the id, it can't be unique ... it
> > Ok, so can you just put a comment above there saying "this must not
> > be settable from userspace" ?
> 
> So I have been working on an IMA namespacing series again as well and would
> like to use some sort of unique identifier for audit messages emitted from
> an IMA/user namespace other than the init_ima_ns. This UUID may just work
> for this, but how would one associate the UUID with a container if it ever
> comes to that when evaluating audit logs? Shouldn't it be settable from user
> space for some sort of 'coordination' between container runtime and kernel?

Wouldn't this be solved by the audit container id patchset? In fact,
can't we use this for IMA as well?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 13:53       ` Stefan Berger
@ 2021-11-29 14:10         ` James Bottomley
  2021-11-29 14:22           ` Christian Brauner
  2021-11-29 14:30           ` Stefan Berger
  0 siblings, 2 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 14:10 UTC (permalink / raw)
  To: Stefan Berger, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> On 11/29/21 07:50, James Bottomley wrote:
> > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
> > > > Currently we get one entry in the IMA log per unique file
> > > > event.  So, if you have a measurement policy and it measures a
> > > > particular binary it will not get measured again if it is
> > > > subsequently executed. For Namespaced IMA, the correct
> > > > behaviour
> > > > seems to be to log once per inode per namespace (so every
> > > > unique
> > > > execution in a namespace gets a separate log entry).  Since
> > > > logging
> > > > once per inode per namespace is
> > > I suspect I'll need to do a more in depth reading of the existing
> > > code, but I'll ask the lazy question anyway (since you say "the
> > > correct behavior seems to be") - is it actually important that
> > > files which were appraised under a parent namespace's policy
> > > already
> > > should be logged again?
> > I think so.  For a couple of reasons, assuming the namespace
> > eventually
> > gets its own log entries, which the next incremental patch proposed
> > to
> > do by virtualizing the securityfs entries.  If you don't do this:
> 
> To avoid duplicate efforts, an implementation of a virtualized 
> securityfs is in this series here:
> 
> https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> 
> It starts with 'securityfs: Prefix global variables with secruityfs_'

That's quite a big patch series.  I already actually implemented this
as part of the RFC for getting the per namespace measurement log.  The
attached is basically what I did.

Most of the time we don't require namespacing the actual virtualfs
file, because it's world readable.  IMA has a special requirement in
this regard because the IMA files should be readable (and writeable
when we get around to policy updates) by the admin of the namespace but
their protection is 0640 or 0440.  I thought the simplest solution
would be to make an additional flag that coped with the permissions and
a per-inode flag way of making the file as "accessible by userns
admin".  Doing something simple like this gives a much smaller
diffstat:

 fs/stat.c                |    8 ++++++++
 include/linux/fs.h       |    1 +
 include/linux/security.h |    2 +-
 kernel/capability.c      |    6 ++++--
 security/inode.c         |    6 ++++--
 5 files changed, 18 insertions(+), 5 deletions(-)

The full diff is below so you can see how I did it.  There's a sort of
serendipity bit where it turns out the next i_flag entry is at bit 17
meaning it can be overloaded on the mode which is a u16, but other than
that I think it's pretty clean.

James

---

diff --git a/fs/stat.c b/fs/stat.c
index 28d2020ba1f4..2025dfea6720 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -49,6 +49,14 @@ void generic_fillattr(struct user_namespace *mnt_userns, struct inode *inode,
 	stat->nlink = inode->i_nlink;
 	stat->uid = i_uid_into_mnt(mnt_userns, inode);
 	stat->gid = i_gid_into_mnt(mnt_userns, inode);
+	if (inode->i_flags & S_NSUSER) {
+		struct user_namespace *ns = current_user_ns();
+
+		if (uid_eq(stat->uid, GLOBAL_ROOT_UID))
+			stat->uid = ns->owner;
+		if (gid_eq(stat->gid, GLOBAL_ROOT_GID))
+			stat->gid = ns->group;
+	}
 	stat->rdev = inode->i_rdev;
 	stat->size = i_size_read(inode);
 	stat->atime = inode->i_atime;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1cb616fc1105..8bebac8463ac 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2249,6 +2249,7 @@ struct super_operations {
 #define S_ENCRYPTED	(1 << 14) /* Encrypted file (using fs/crypto/) */
 #define S_CASEFOLD	(1 << 15) /* Casefolded file */
 #define S_VERITY	(1 << 16) /* Verity file (using fs/verity/) */
+#define S_NSUSER	(1 << 17) /* uid/gid changes with user_ns */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system
diff --git a/include/linux/security.h b/include/linux/security.h
index bbf44a466832..1cfe8832f5e0 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1919,7 +1919,7 @@ static inline void security_audit_rule_free(void *lsmrule)
 
 #ifdef CONFIG_SECURITYFS
 
-extern struct dentry *securityfs_create_file(const char *name, umode_t mode,
+extern struct dentry *securityfs_create_file(const char *name, unsigned int mode,
 					     struct dentry *parent, void *data,
 					     const struct file_operations *fops);
 extern struct dentry *securityfs_create_dir(const char *name, struct dentry *parent);
diff --git a/kernel/capability.c b/kernel/capability.c
index 46a361dde042..54907ffa4947 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -506,8 +506,10 @@ bool capable_wrt_inode_uidgid(struct user_namespace *mnt_userns,
 {
 	struct user_namespace *ns = current_user_ns();
 
-	return ns_capable(ns, cap) &&
-	       privileged_wrt_inode_uidgid(ns, mnt_userns, inode);
+	if (ns_capable(ns, cap) &&
+	    privileged_wrt_inode_uidgid(ns, mnt_userns, inode))
+		return true;
+	return (inode->i_flags & S_NSUSER) && ns_capable(ns, cap);
 }
 EXPORT_SYMBOL(capable_wrt_inode_uidgid);
 
diff --git a/security/inode.c b/security/inode.c
index 6c326939750d..917514ddfce4 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -104,7 +104,7 @@ static struct file_system_type fs_type = {
  * If securityfs is not enabled in the kernel, the value %-ENODEV is
  * returned.
  */
-static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
+static struct dentry *securityfs_create_dentry(const char *name, unsigned int mode,
 					struct dentry *parent, void *data,
 					const struct file_operations *fops,
 					const struct inode_operations *iops)
@@ -112,6 +112,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	struct dentry *dentry;
 	struct inode *dir, *inode;
 	int error;
+	unsigned int flags = mode & 0xffff0000;
 
 	if (!(mode & S_IFMT))
 		mode = (mode & S_IALLUGO) | S_IFREG;
@@ -147,6 +148,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
 	inode->i_mode = mode;
 	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
 	inode->i_private = data;
+	inode->i_flags = flags;
 	if (S_ISDIR(mode)) {
 		inode->i_op = &simple_dir_inode_operations;
 		inode->i_fop = &simple_dir_operations;
@@ -197,7 +199,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
  * If securityfs is not enabled in the kernel, the value %-ENODEV is
  * returned.
  */
-struct dentry *securityfs_create_file(const char *name, umode_t mode,
+struct dentry *securityfs_create_file(const char *name, unsigned int mode,
 				      struct dentry *parent, void *data,
 				      const struct file_operations *fops)
 {


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-29 13:56                       ` Christian Brauner
@ 2021-11-29 14:19                         ` Stefan Berger
  2021-11-30 13:09                         ` James Bottomley
  1 sibling, 0 replies; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 14:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Serge E. Hallyn, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner


On 11/29/21 08:56, Christian Brauner wrote:
> On Mon, Nov 29, 2021 at 08:49:40AM -0500, Stefan Berger wrote:
>> On 11/28/21 20:59, Serge E. Hallyn wrote:
>>> On Sun, Nov 28, 2021 at 05:56:28PM -0500, James Bottomley wrote:
>>>> On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
>>>>> On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley wrote:
>>>>>> On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
>>>>>>> On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley wrote:
>>>>>>>> On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn wrote:
>>>> [...]
>>>>>>>>> So given that 'unique per boot' is sufficient, what would be
>>>>>>>>> the problem with simply adding a simple ever-increasing
>>>>>>>>> unique atomix count to the struct user_namespace?
>>>>>>>> I don't think there is any ... but I equally don't see why
>>>>>>>> people would want to save and restore the uuid but not the new
>>>>>>>> monotonic identifier ... because it's still just a marker on a
>>>>>>>> namespace.
>>>>>>> But you've called it "the namespace uuid".  I'm not even really
>>>>>>> thinking of checkpoint/restart, just stopping and restarting a
>>>>>>> container.  I'm convinced people will want to start using it
>>>>>>> because, well, it is a nice feature.
>>>>>> Right, but the uniqueness property depends on you not being able to
>>>>>> set it.  If you just want a namespace label, you can have that, but
>>>>>> anything a user can set is either a pain to guarantee uniqueness
>>>>>> (have to check all the other objects) or is simply a non-unique
>>>>>> label.
>>>>>>
>>>>>> If you want to label a container, which could have many namespaces
>>>>>> and be stopped and restarted many times, it does sound like you
>>>>>> want a non-unique settable label.  However, IMA definitely needs a
>>>>>> guaranteed per namespace unique label.
>>>>>>
>>>>>> Is the objection simply you think a UUID sound like it should be
>>>>> Objection is too strong.  Concern.
>>>>>
>>>>> But yes, to me a uuid (a) feels like it should be generally useful
>>>>> including being settable and (b) not super duper 100% absolutely
>>>>> guaranteed to always be unique per boot, as an incremented counter
>>>>> would be.
>>>> OK, but a bunch of cats I found on the Internet agree with me, a UUID
>>>> shouldn't be settable:
>>>>
>>>> https://en.wikipedia.org/wiki/Universally_unique_identifier
>>>>
>>>> The key point being, if you can set the id, it can't be unique ... it
>>> Ok, so can you just put a comment above there saying "this must not
>>> be settable from userspace" ?
>> So I have been working on an IMA namespacing series again as well and would
>> like to use some sort of unique identifier for audit messages emitted from
>> an IMA/user namespace other than the init_ima_ns. This UUID may just work
>> for this, but how would one associate the UUID with a container if it ever
>> comes to that when evaluating audit logs? Shouldn't it be settable from user
>> space for some sort of 'coordination' between container runtime and kernel?
> Wouldn't this be solved by the audit container id patchset? In fact,
> can't we use this for IMA as well?

I suppose it's this yet-to-be-merged series you are referring to: 
https://lkml.org/lkml/2021/1/12/818

It would work for as long as it's *required* to set the identifier or 
the identifier is readily available when the first audit message needs 
to be emitted. If this is not the case then that unique identifier 
should maybe originate in the kernel and be queryable by user space but 
then again the same container may come up with different identifiers 
every time. I am not sure whether that is desirable. Some sort of 'base 
UUID' could be passed with the clone3() call and either all namespaces 
could derive from the base UUID or just get the same value. If that's 
not set the kernel gets to choose the base UUID... At least by updating 
container runtimes the coordination issue could be solved and with the 
clone3() call the early availability of the UUID could be guranteed?



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:10         ` James Bottomley
@ 2021-11-29 14:22           ` Christian Brauner
  2021-11-29 14:46             ` James Bottomley
  2021-11-29 14:30           ` Stefan Berger
  1 sibling, 1 reply; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 14:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Berger, Serge E. Hallyn, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > On 11/29/21 07:50, James Bottomley wrote:
> > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
> > > > > Currently we get one entry in the IMA log per unique file
> > > > > event.  So, if you have a measurement policy and it measures a
> > > > > particular binary it will not get measured again if it is
> > > > > subsequently executed. For Namespaced IMA, the correct
> > > > > behaviour
> > > > > seems to be to log once per inode per namespace (so every
> > > > > unique
> > > > > execution in a namespace gets a separate log entry).  Since
> > > > > logging
> > > > > once per inode per namespace is
> > > > I suspect I'll need to do a more in depth reading of the existing
> > > > code, but I'll ask the lazy question anyway (since you say "the
> > > > correct behavior seems to be") - is it actually important that
> > > > files which were appraised under a parent namespace's policy
> > > > already
> > > > should be logged again?
> > > I think so.  For a couple of reasons, assuming the namespace
> > > eventually
> > > gets its own log entries, which the next incremental patch proposed
> > > to
> > > do by virtualizing the securityfs entries.  If you don't do this:
> > 
> > To avoid duplicate efforts, an implementation of a virtualized 
> > securityfs is in this series here:
> > 
> > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > 
> > It starts with 'securityfs: Prefix global variables with secruityfs_'
> 
> That's quite a big patch series.  I already actually implemented this
> as part of the RFC for getting the per namespace measurement log.  The
> attached is basically what I did.
> 
> Most of the time we don't require namespacing the actual virtualfs
> file, because it's world readable.  IMA has a special requirement in
> this regard because the IMA files should be readable (and writeable
> when we get around to policy updates) by the admin of the namespace but
> their protection is 0640 or 0440.  I thought the simplest solution
> would be to make an additional flag that coped with the permissions and
> a per-inode flag way of making the file as "accessible by userns
> admin".  Doing something simple like this gives a much smaller
> diffstat:

That's a NAK from me. Stefan's series might be bigger but it does things
correctly. I appreciate the keep it simple attitude but no. I won't
speciale-case securityfs or similar stuff in core vfs helpers.

Christian

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:10         ` James Bottomley
  2021-11-29 14:22           ` Christian Brauner
@ 2021-11-29 14:30           ` Stefan Berger
  2021-11-29 15:08             ` James Bottomley
  2021-11-29 16:20             ` Christian Brauner
  1 sibling, 2 replies; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 14:30 UTC (permalink / raw)
  To: James Bottomley, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner


On 11/29/21 09:10, James Bottomley wrote:
> On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
>> On 11/29/21 07:50, James Bottomley wrote:
>>> On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
>>>> On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
>>>>> Currently we get one entry in the IMA log per unique file
>>>>> event.  So, if you have a measurement policy and it measures a
>>>>> particular binary it will not get measured again if it is
>>>>> subsequently executed. For Namespaced IMA, the correct
>>>>> behaviour
>>>>> seems to be to log once per inode per namespace (so every
>>>>> unique
>>>>> execution in a namespace gets a separate log entry).  Since
>>>>> logging
>>>>> once per inode per namespace is
>>>> I suspect I'll need to do a more in depth reading of the existing
>>>> code, but I'll ask the lazy question anyway (since you say "the
>>>> correct behavior seems to be") - is it actually important that
>>>> files which were appraised under a parent namespace's policy
>>>> already
>>>> should be logged again?
>>> I think so.  For a couple of reasons, assuming the namespace
>>> eventually
>>> gets its own log entries, which the next incremental patch proposed
>>> to
>>> do by virtualizing the securityfs entries.  If you don't do this:
>> To avoid duplicate efforts, an implementation of a virtualized
>> securityfs is in this series here:
>>
>> https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
>>
>> It starts with 'securityfs: Prefix global variables with secruityfs_'
> That's quite a big patch series.  I already actually implemented this
> as part of the RFC for getting the per namespace measurement log.  The
> attached is basically what I did.

I know it's big. I tried to avoid having to bind-mount the system-wide 
single securityfs into the container and inherit all the other security 
subsystems' files and directories (evm, TPM, safesetid, apparmor, tomoyo 
[ https://elixir.bootlin.com/linux/latest/C/ident/securityfs_create_dir 
]) and instead have a  'view' that is a bit more restricted to those 
subsystems that are namespaced. The securityfs_ns I created can be 
mounted into each user namespace individually and only shows what you're 
supposed to see without other filesystem tricks to hide files or so. It 
should be future-extensible for other subsystem to register themselves 
there if they have something to show to the user.

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:22           ` Christian Brauner
@ 2021-11-29 14:46             ` James Bottomley
  2021-11-29 15:27               ` Stefan Berger
  2021-11-29 15:35               ` Serge E. Hallyn
  0 siblings, 2 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 14:46 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Stefan Berger, Serge E. Hallyn, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > On 11/29/21 07:50, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > wrote:
> > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > event.  So, if you have a measurement policy and it
> > > > > > measures a particular binary it will not get measured again
> > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > correct behaviour seems to be to log once per inode per
> > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > separate log entry).  Since logging once per inode per
> > > > > > namespace is
> > > > > I suspect I'll need to do a more in depth reading of the
> > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > you say "the correct behavior seems to be") - is it actually
> > > > > important that files which were appraised under a parent
> > > > > namespace's policy already should be logged again?
> > > > I think so.  For a couple of reasons, assuming the namespace
> > > > eventually gets its own log entries, which the next incremental
> > > > patch proposed to do by virtualizing the securityfs
> > > > entries.  If you don't do this:
> > > 
> > > To avoid duplicate efforts, an implementation of a virtualized 
> > > securityfs is in this series here:
> > > 
> > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > 
> > > It starts with 'securityfs: Prefix global variables with
> > > secruityfs_'
> > 
> > That's quite a big patch series.  I already actually implemented
> > this as part of the RFC for getting the per namespace measurement
> > log.  The attached is basically what I did.
> > 
> > Most of the time we don't require namespacing the actual virtualfs
> > file, because it's world readable.  IMA has a special requirement
> > in this regard because the IMA files should be readable (and
> > writeable when we get around to policy updates) by the admin of the
> > namespace but their protection is 0640 or 0440.  I thought the
> > simplest solution would be to make an additional flag that coped
> > with the permissions and a per-inode flag way of making the file as
> > "accessible by userns admin".  Doing something simple like this
> > gives a much smaller diffstat:
> 
> That's a NAK from me. Stefan's series might be bigger but it does
> things correctly. I appreciate the keep it simple attitude but no. I
> won't speciale-case securityfs or similar stuff in core vfs helpers.

Well, there's a reason it's an unpublished patch.  However, the more
important point is that namespacing IMA requires discussion of certain
points that we never seem to drive to a conclusion.  Using the akpm
method, I propose simple patches that drive the discussion.  I think
the points are:

   1. Should IMA be its own namespace or tied to the user namespace?  The
      previous patches all took the separate Namespace approach, but I
      think that should be reconsidered now keyrings are in the user
      namespace.
   2. How do we get a unique id for the IMA namespace to use in the log?
   3. how should we virtualize securityfs for IMA given the need of the
      namespace admin to read and write the IMA files?

And, of course, the fun ones we're coming to.

   1. Given that the current keyring namespacing doesn't give access to
      the system keyrings, how do we get per-namespace access for
      .ima/_ima system keyrings given that the namespace admin is going to
      want to set their own key for appraisal?
   2. What mechanism should we use for .ima/_ima key setting?  The current
      mechanism is must be signed by a key in the system keyrings sounds
      appropriate, but is problematic given most system owners don't
      actually have any private keys for keys in the system keyrings. 
      Hopefully the MoK keyring patches will help us have an easier
      approach to this.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:30           ` Stefan Berger
@ 2021-11-29 15:08             ` James Bottomley
  2021-11-29 16:20             ` Christian Brauner
  1 sibling, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 15:08 UTC (permalink / raw)
  To: Stefan Berger, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, 2021-11-29 at 09:30 -0500, Stefan Berger wrote:
> On 11/29/21 09:10, James Bottomley wrote:
> > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > On 11/29/21 07:50, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > wrote:
> > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > event.  So, if you have a measurement policy and it
> > > > > > measures a particular binary it will not get measured again
> > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > correct behaviour seems to be to log once per inode per
> > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > separate log entry).  Since logging once per inode per
> > > > > > namespace is
> > > > > I suspect I'll need to do a more in depth reading of the
> > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > you say "the correct behavior seems to be") - is it actually
> > > > > important that files which were appraised under a parent
> > > > > namespace's policy already should be logged again?
> > > > I think so.  For a couple of reasons, assuming the namespace
> > > > eventually gets its own log entries, which the next incremental
> > > > patch proposed to do by virtualizing the securityfs
> > > > entries.  If you don't do this:
> > > To avoid duplicate efforts, an implementation of a virtualized
> > > securityfs is in this series here:
> > > 
> > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > 
> > > It starts with 'securityfs: Prefix global variables with
> > > secruityfs_' 

> > That's quite a big patch series.  I already actually implemented
> > this as part of the RFC for getting the per namespace measurement
> > log.  The attached is basically what I did.
> 
> I know it's big. I tried to avoid having to bind-mount the system-
> wide single securityfs into the container and inherit all the other
> security subsystems' files and directories (evm, TPM, safesetid,
> apparmor, tomoyo  [ 
> https://elixir.bootlin.com/linux/latest/C/ident/securityfs_create_dir
>  
> ]) and instead have a  'view' that is a bit more restricted to those 
> subsystems that are namespaced. The securityfs_ns I created can be 
> mounted into each user namespace individually and only shows what
> you're supposed to see without other filesystem tricks to hide files
> or so. It should be future-extensible for other subsystem to register
> themselves there if they have something to show to the user.

Using F_USER_NS for this is certainly what it was designed for.  I
don't think size is a problem as long as it's right sized to perform
the required function.  I usually find it easier to oversimplify and
work up, but that's certainly not the only approach.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:46             ` James Bottomley
@ 2021-11-29 15:27               ` Stefan Berger
  2021-11-29 16:23                 ` James Bottomley
  2021-11-29 15:35               ` Serge E. Hallyn
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 15:27 UTC (permalink / raw)
  To: James Bottomley, Christian Brauner
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner


On 11/29/21 09:46, James Bottomley wrote:
> On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
>> On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
>>> On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
>>>> On 11/29/21 07:50, James Bottomley wrote:
>>>>> On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
>>>>>> On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
>>>>>> wrote:
>>>>>>> Currently we get one entry in the IMA log per unique file
>>>>>>> event.  So, if you have a measurement policy and it
>>>>>>> measures a particular binary it will not get measured again
>>>>>>> if it is subsequently executed. For Namespaced IMA, the
>>>>>>> correct behaviour seems to be to log once per inode per
>>>>>>> namespace (so every unique execution in a namespace gets a
>>>>>>> separate log entry).  Since logging once per inode per
>>>>>>> namespace is
>>>>>> I suspect I'll need to do a more in depth reading of the
>>>>>> existing code, but I'll ask the lazy question anyway (since
>>>>>> you say "the correct behavior seems to be") - is it actually
>>>>>> important that files which were appraised under a parent
>>>>>> namespace's policy already should be logged again?
>>>>> I think so.  For a couple of reasons, assuming the namespace
>>>>> eventually gets its own log entries, which the next incremental
>>>>> patch proposed to do by virtualizing the securityfs
>>>>> entries.  If you don't do this:
>>>> To avoid duplicate efforts, an implementation of a virtualized
>>>> securityfs is in this series here:
>>>>
>>>> https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
>>>>
>>>> It starts with 'securityfs: Prefix global variables with
>>>> secruityfs_'
>>> That's quite a big patch series.  I already actually implemented
>>> this as part of the RFC for getting the per namespace measurement
>>> log.  The attached is basically what I did.
>>>
>>> Most of the time we don't require namespacing the actual virtualfs
>>> file, because it's world readable.  IMA has a special requirement
>>> in this regard because the IMA files should be readable (and
>>> writeable when we get around to policy updates) by the admin of the
>>> namespace but their protection is 0640 or 0440.  I thought the
>>> simplest solution would be to make an additional flag that coped
>>> with the permissions and a per-inode flag way of making the file as
>>> "accessible by userns admin".  Doing something simple like this
>>> gives a much smaller diffstat:
>> That's a NAK from me. Stefan's series might be bigger but it does
>> things correctly. I appreciate the keep it simple attitude but no. I
>> won't speciale-case securityfs or similar stuff in core vfs helpers.
> Well, there's a reason it's an unpublished patch.  However, the more
> important point is that namespacing IMA requires discussion of certain
> points that we never seem to drive to a conclusion.  Using the akpm
> method, I propose simple patches that drive the discussion.  I think
> the points are:
[...]
>
>
> And, of course, the fun ones we're coming to.
>
>     1. Given that the current keyring namespacing doesn't give access to
>        the system keyrings, how do we get per-namespace access for
>        .ima/_ima system keyrings given that the namespace admin is going to
>        want to set their own key for appraisal?
>     2. What mechanism should we use for .ima/_ima key setting?  The current
>        mechanism is must be signed by a key in the system keyrings sounds
>        appropriate, but is problematic given most system owners don't
>        actually have any private keys for keys in the system keyrings.
>        Hopefully the MoK keyring patches will help us have an easier
>        approach to this.

The approach we took in the previous implementation was to support BYOK 
(bring your own key) for every container. The (trusted) container 
runtime has to do what dracut would typically do, namely create the 
keyrings and load the keys onto it.

The container runtime would

1a) expect keys to be found inside a container's filesystem at the usual 
location (/etc/keys/ima) but then also allow for a CA key that is used 
to verify the signature of those keys; that CA key is typically baked 
into the Linux kernel when .ima is to be used, but for containers and 
BYOK it's an additional file in the container's filesystem

1b) passing in keys via command line should be possible as well but 
that's an implementation detail

2) container runtime sets up either a restricted keyring [ 
https://elixir.bootlin.com/linux/latest/source/Documentation/crypto/asymmetric-keys.rst#L359 
] if that CA key is found in the filesystem or a 'normal' keyring. The 
container runtime then loads the keys onto that keyring; call that 
keyring '.ima' or '_ima' for as long as the kernel knows what keyring to 
search for. We created that keyring under a session keyring. With the 
user namespace isolation and keyrings support in the user namespace the 
isolation of the IMA related keyrings between different user namespaces 
should be possible.


The same would be done for the IMA policy where the container runtime 
also needs to do some work that usually dracut would do:

- expect the IMA policy for the container at the usual location 
(/etc/ima/ima-policy) and load it into the container's 'securityfs' 
policy file

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:46             ` James Bottomley
  2021-11-29 15:27               ` Stefan Berger
@ 2021-11-29 15:35               ` Serge E. Hallyn
  2021-11-29 16:07                 ` Stefan Berger
                                   ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-29 15:35 UTC (permalink / raw)
  To: James Bottomley
  Cc: Christian Brauner, Stefan Berger, Serge E. Hallyn,
	linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > > On 11/29/21 07:50, James Bottomley wrote:
> > > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > > wrote:
> > > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > > event.  So, if you have a measurement policy and it
> > > > > > > measures a particular binary it will not get measured again
> > > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > > correct behaviour seems to be to log once per inode per
> > > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > > separate log entry).  Since logging once per inode per
> > > > > > > namespace is
> > > > > > I suspect I'll need to do a more in depth reading of the
> > > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > > you say "the correct behavior seems to be") - is it actually
> > > > > > important that files which were appraised under a parent
> > > > > > namespace's policy already should be logged again?
> > > > > I think so.  For a couple of reasons, assuming the namespace
> > > > > eventually gets its own log entries, which the next incremental
> > > > > patch proposed to do by virtualizing the securityfs
> > > > > entries.  If you don't do this:
> > > > 
> > > > To avoid duplicate efforts, an implementation of a virtualized 
> > > > securityfs is in this series here:
> > > > 
> > > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > > 
> > > > It starts with 'securityfs: Prefix global variables with
> > > > secruityfs_'
> > > 
> > > That's quite a big patch series.  I already actually implemented
> > > this as part of the RFC for getting the per namespace measurement
> > > log.  The attached is basically what I did.
> > > 
> > > Most of the time we don't require namespacing the actual virtualfs
> > > file, because it's world readable.  IMA has a special requirement
> > > in this regard because the IMA files should be readable (and
> > > writeable when we get around to policy updates) by the admin of the
> > > namespace but their protection is 0640 or 0440.  I thought the
> > > simplest solution would be to make an additional flag that coped
> > > with the permissions and a per-inode flag way of making the file as
> > > "accessible by userns admin".  Doing something simple like this
> > > gives a much smaller diffstat:
> > 
> > That's a NAK from me. Stefan's series might be bigger but it does
> > things correctly. I appreciate the keep it simple attitude but no. I
> > won't speciale-case securityfs or similar stuff in core vfs helpers.
> 
> Well, there's a reason it's an unpublished patch.  However, the more
> important point is that namespacing IMA requires discussion of certain
> points that we never seem to drive to a conclusion.  Using the akpm
> method, I propose simple patches that drive the discussion.  I think
> the points are:
> 
>    1. Should IMA be its own namespace or tied to the user namespace?  The
>       previous patches all took the separate Namespace approach, but I
>       think that should be reconsidered now keyrings are in the user
>       namespace.

Well that purely depends on the needed scope.

The audit container identifier is a neat thing.  But it absolutely must
be settable, so seems to conflict with your needs.

Your patch puts an identifier on the user_namespace.  I'm not quite sure,
does that satisfy Stefan's needs?  A new ima ns if and only if there is a
new user ns?

I think you two need to get together and discuss the requirements, and come
back with a brief but very precise document explaining what you need.

Are you both looking at the same use case?  Who is consuming the audit
log, and to what end?  Container administrators?  Any time they log in?
How do they assure themselves that the securityfs file they're reading
hasn't been overmounted?

I need to find a document to read about IMA's usage of PCRs.  For
namespacing, are you expecting each container to be hooked up to a
swtmp instance so they have their own PCR they can use?

>    2. How do we get a unique id for the IMA namespace to use in the log?
>    3. how should we virtualize securityfs for IMA given the need of the
>       namespace admin to read and write the IMA files?
> 
> And, of course, the fun ones we're coming to.
> 
>    1. Given that the current keyring namespacing doesn't give access to
>       the system keyrings, how do we get per-namespace access for
>       .ima/_ima system keyrings given that the namespace admin is going to
>       want to set their own key for appraisal?
>    2. What mechanism should we use for .ima/_ima key setting?  The current
>       mechanism is must be signed by a key in the system keyrings sounds
>       appropriate, but is problematic given most system owners don't
>       actually have any private keys for keys in the system keyrings. 
>       Hopefully the MoK keyring patches will help us have an easier
>       approach to this.
> 
> James
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 15:35               ` Serge E. Hallyn
@ 2021-11-29 16:07                 ` Stefan Berger
  2021-11-30  4:42                   ` Serge E. Hallyn
  2021-11-29 16:16                 ` Christian Brauner
  2021-11-29 16:44                 ` James Bottomley
  2 siblings, 1 reply; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 16:07 UTC (permalink / raw)
  To: Serge E. Hallyn, James Bottomley
  Cc: Christian Brauner, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner


On 11/29/21 10:35, Serge E. Hallyn wrote:
> On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
>> On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
>>> On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
>>>> On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
>>>>> On 11/29/21 07:50, James Bottomley wrote:
>>>>>> On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
>>>>>>> On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
>>>>>>> wrote:
>>>>>>>> Currently we get one entry in the IMA log per unique file
>>>>>>>> event.  So, if you have a measurement policy and it
>>>>>>>> measures a particular binary it will not get measured again
>>>>>>>> if it is subsequently executed. For Namespaced IMA, the
>>>>>>>> correct behaviour seems to be to log once per inode per
>>>>>>>> namespace (so every unique execution in a namespace gets a
>>>>>>>> separate log entry).  Since logging once per inode per
>>>>>>>> namespace is
>>>>>>> I suspect I'll need to do a more in depth reading of the
>>>>>>> existing code, but I'll ask the lazy question anyway (since
>>>>>>> you say "the correct behavior seems to be") - is it actually
>>>>>>> important that files which were appraised under a parent
>>>>>>> namespace's policy already should be logged again?
>>>>>> I think so.  For a couple of reasons, assuming the namespace
>>>>>> eventually gets its own log entries, which the next incremental
>>>>>> patch proposed to do by virtualizing the securityfs
>>>>>> entries.  If you don't do this:
>>>>> To avoid duplicate efforts, an implementation of a virtualized
>>>>> securityfs is in this series here:
>>>>>
>>>>> https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
>>>>>
>>>>> It starts with 'securityfs: Prefix global variables with
>>>>> secruityfs_'
>>>> That's quite a big patch series.  I already actually implemented
>>>> this as part of the RFC for getting the per namespace measurement
>>>> log.  The attached is basically what I did.
>>>>
>>>> Most of the time we don't require namespacing the actual virtualfs
>>>> file, because it's world readable.  IMA has a special requirement
>>>> in this regard because the IMA files should be readable (and
>>>> writeable when we get around to policy updates) by the admin of the
>>>> namespace but their protection is 0640 or 0440.  I thought the
>>>> simplest solution would be to make an additional flag that coped
>>>> with the permissions and a per-inode flag way of making the file as
>>>> "accessible by userns admin".  Doing something simple like this
>>>> gives a much smaller diffstat:
>>> That's a NAK from me. Stefan's series might be bigger but it does
>>> things correctly. I appreciate the keep it simple attitude but no. I
>>> won't speciale-case securityfs or similar stuff in core vfs helpers.
>> Well, there's a reason it's an unpublished patch.  However, the more
>> important point is that namespacing IMA requires discussion of certain
>> points that we never seem to drive to a conclusion.  Using the akpm
>> method, I propose simple patches that drive the discussion.  I think
>> the points are:
>>
>>     1. Should IMA be its own namespace or tied to the user namespace?  The
>>        previous patches all took the separate Namespace approach, but I
>>        think that should be reconsidered now keyrings are in the user
>>        namespace.
> Well that purely depends on the needed scope.
>
> The audit container identifier is a neat thing.  But it absolutely must
> be settable, so seems to conflict with your needs.
>
> Your patch puts an identifier on the user_namespace.  I'm not quite sure,
> does that satisfy Stefan's needs?  A new ima ns if and only if there is a
> new user ns?
>
> I think you two need to get together and discuss the requirements, and come
> back with a brief but very precise document explaining what you need.

What would those want who look at audit messages? [Idk] Would they want 
a constant identifier for IMA audit messages in the audit log across all 
restarts of a container? Presumably that would make quick queries across 
restarts much easier. Or could they live with an audit message emitted 
from the container runtime indicating that this time the (IMA) audit 
messages from this container will have this UUID here?

I guess both would 'work.'

>
> Are you both looking at the same use case?  Who is consuming the audit
> log, and to what end?  Container administrators?  Any time they log in?
> How do they assure themselves that the securityfs file they're reading
> hasn't been overmounted?

The question is also should there only be one identifier or can there be 
two different one (one from audit patch series and uuid of user namespace).


>
> I need to find a document to read about IMA's usage of PCRs.  For
> namespacing, are you expecting each container to be hooked up to a
> swtmp instance so they have their own PCR they can use?

It's complicated and there's a bit more to this... I would try to 
architect it in a way that the IMA system policy can cover what's going 
on inside IMA namespaces, i.e., audit and measure and appraise file 
accesses occurring in those namespace. We call it hierarchical 
processing ( 
https://github.com/stefanberger/linux-ima-namespaces/commit/e88dc84ec97753fd65d302ee1bf03951001ab48f 
) where file access are evaluated against the current namespace's policy 
and then also evaluated against those of parent namespaces back to the 
init_ima_ns. The goal is to avoid evasion of measurements etc. by the 
user just by spawning new IMA namespaces. I think logging into the IMA 
system log will not scale well if there are hundreds of containers on 
the system using IMA and logging into the system log and hammering the 
TPM. So, the answer then is write your policy in such a way that it 
doesn't cover the IMA/user namespaces (containers) and have each 
container have its own IMA policy and IMA log and and an optional vTPM. 
So my answer would be 'optional swtpm.'

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 15:35               ` Serge E. Hallyn
  2021-11-29 16:07                 ` Stefan Berger
@ 2021-11-29 16:16                 ` Christian Brauner
  2021-11-29 16:23                   ` Christian Brauner
  2021-11-29 17:04                   ` Stefan Berger
  2021-11-29 16:44                 ` James Bottomley
  2 siblings, 2 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 16:16 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: James Bottomley, Stefan Berger, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > > > On 11/29/21 07:50, James Bottomley wrote:
> > > > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > > > wrote:
> > > > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > > > event.  So, if you have a measurement policy and it
> > > > > > > > measures a particular binary it will not get measured again
> > > > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > > > correct behaviour seems to be to log once per inode per
> > > > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > > > separate log entry).  Since logging once per inode per
> > > > > > > > namespace is
> > > > > > > I suspect I'll need to do a more in depth reading of the
> > > > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > > > you say "the correct behavior seems to be") - is it actually
> > > > > > > important that files which were appraised under a parent
> > > > > > > namespace's policy already should be logged again?
> > > > > > I think so.  For a couple of reasons, assuming the namespace
> > > > > > eventually gets its own log entries, which the next incremental
> > > > > > patch proposed to do by virtualizing the securityfs
> > > > > > entries.  If you don't do this:
> > > > > 
> > > > > To avoid duplicate efforts, an implementation of a virtualized 
> > > > > securityfs is in this series here:
> > > > > 
> > > > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > > > 
> > > > > It starts with 'securityfs: Prefix global variables with
> > > > > secruityfs_'
> > > > 
> > > > That's quite a big patch series.  I already actually implemented
> > > > this as part of the RFC for getting the per namespace measurement
> > > > log.  The attached is basically what I did.
> > > > 
> > > > Most of the time we don't require namespacing the actual virtualfs
> > > > file, because it's world readable.  IMA has a special requirement
> > > > in this regard because the IMA files should be readable (and
> > > > writeable when we get around to policy updates) by the admin of the
> > > > namespace but their protection is 0640 or 0440.  I thought the
> > > > simplest solution would be to make an additional flag that coped
> > > > with the permissions and a per-inode flag way of making the file as
> > > > "accessible by userns admin".  Doing something simple like this
> > > > gives a much smaller diffstat:
> > > 
> > > That's a NAK from me. Stefan's series might be bigger but it does
> > > things correctly. I appreciate the keep it simple attitude but no. I
> > > won't speciale-case securityfs or similar stuff in core vfs helpers.
> > 
> > Well, there's a reason it's an unpublished patch.  However, the more
> > important point is that namespacing IMA requires discussion of certain
> > points that we never seem to drive to a conclusion.  Using the akpm
> > method, I propose simple patches that drive the discussion.  I think
> > the points are:
> > 
> >    1. Should IMA be its own namespace or tied to the user namespace?  The
> >       previous patches all took the separate Namespace approach, but I
> >       think that should be reconsidered now keyrings are in the user
> >       namespace.
> 
> Well that purely depends on the needed scope.
> 
> The audit container identifier is a neat thing.  But it absolutely must
> be settable, so seems to conflict with your needs.
> 
> Your patch puts an identifier on the user_namespace.  I'm not quite sure,
> does that satisfy Stefan's needs?  A new ima ns if and only if there is a
> new user ns?

I kept thinking about this question while I was out running and while I
admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
to me that this is the right approach after all. Making it part of
userns at least in this form isn't clean.

I think attaching a uuid to a userns alone for the sake of IMA is wrong.
Additionally, I think a uuid only for the userns is too limited. This is
similar to the problem of the audit container id. If we have some sort
of uuid for ima it will automatically evolve into something like a
container id (I'm not even arguing that this is necessarily wrong.).
We also have the issue that we then have the container audit id thing -
if this ever lands and the ima userns uuid. All that makes it quite
messy.

I think CLONE_NEWIMA is ultimately nicer and allows the implementation
to be decoupled from the userns and self-contained as possible. This
also means that ima ns works for privileged containers which sure is a
valid use-case.
It will also make securityfs namespacing easier as it can be a keyed
super where the key is the ima ns (similar to how we deal with e.g.
mqueue).

> 
> I think you two need to get together and discuss the requirements, and come
> back with a brief but very precise document explaining what you need.

Good idea.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 14:30           ` Stefan Berger
  2021-11-29 15:08             ` James Bottomley
@ 2021-11-29 16:20             ` Christian Brauner
  1 sibling, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 16:20 UTC (permalink / raw)
  To: Stefan Berger
  Cc: James Bottomley, Serge E. Hallyn, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 09:30:00AM -0500, Stefan Berger wrote:
> 
> On 11/29/21 09:10, James Bottomley wrote:
> > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > On 11/29/21 07:50, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley wrote:
> > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > event.  So, if you have a measurement policy and it measures a
> > > > > > particular binary it will not get measured again if it is
> > > > > > subsequently executed. For Namespaced IMA, the correct
> > > > > > behaviour
> > > > > > seems to be to log once per inode per namespace (so every
> > > > > > unique
> > > > > > execution in a namespace gets a separate log entry).  Since
> > > > > > logging
> > > > > > once per inode per namespace is
> > > > > I suspect I'll need to do a more in depth reading of the existing
> > > > > code, but I'll ask the lazy question anyway (since you say "the
> > > > > correct behavior seems to be") - is it actually important that
> > > > > files which were appraised under a parent namespace's policy
> > > > > already
> > > > > should be logged again?
> > > > I think so.  For a couple of reasons, assuming the namespace
> > > > eventually
> > > > gets its own log entries, which the next incremental patch proposed
> > > > to
> > > > do by virtualizing the securityfs entries.  If you don't do this:
> > > To avoid duplicate efforts, an implementation of a virtualized
> > > securityfs is in this series here:
> > > 
> > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > 
> > > It starts with 'securityfs: Prefix global variables with secruityfs_'
> > That's quite a big patch series.  I already actually implemented this
> > as part of the RFC for getting the per namespace measurement log.  The
> > attached is basically what I did.
> 
> I know it's big. I tried to avoid having to bind-mount the system-wide
> single securityfs into the container and inherit all the other security
> subsystems' files and directories (evm, TPM, safesetid, apparmor, tomoyo [
> https://elixir.bootlin.com/linux/latest/C/ident/securityfs_create_dir ]) and
> instead have a  'view' that is a bit more restricted to those subsystems
> that are namespaced. The securityfs_ns I created can be mounted into each
> user namespace individually and only shows what you're supposed to see
> without other filesystem tricks to hide files or so. It should be
> future-extensible for other subsystem to register themselves there if they
> have something to show to the user.

From a quick glance, your implementation is doing the right thing.
You're creating a keyed super via get_tree_keyed().

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 16:16                 ` Christian Brauner
@ 2021-11-29 16:23                   ` Christian Brauner
  2021-11-29 17:04                   ` Stefan Berger
  1 sibling, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-29 16:23 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: James Bottomley, Stefan Berger, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 05:16:56PM +0100, Christian Brauner wrote:
> On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > > > > On 11/29/21 07:50, James Bottomley wrote:
> > > > > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > > > > wrote:
> > > > > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > > > > event.  So, if you have a measurement policy and it
> > > > > > > > > measures a particular binary it will not get measured again
> > > > > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > > > > correct behaviour seems to be to log once per inode per
> > > > > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > > > > separate log entry).  Since logging once per inode per
> > > > > > > > > namespace is
> > > > > > > > I suspect I'll need to do a more in depth reading of the
> > > > > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > > > > you say "the correct behavior seems to be") - is it actually
> > > > > > > > important that files which were appraised under a parent
> > > > > > > > namespace's policy already should be logged again?
> > > > > > > I think so.  For a couple of reasons, assuming the namespace
> > > > > > > eventually gets its own log entries, which the next incremental
> > > > > > > patch proposed to do by virtualizing the securityfs
> > > > > > > entries.  If you don't do this:
> > > > > > 
> > > > > > To avoid duplicate efforts, an implementation of a virtualized 
> > > > > > securityfs is in this series here:
> > > > > > 
> > > > > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > > > > 
> > > > > > It starts with 'securityfs: Prefix global variables with
> > > > > > secruityfs_'
> > > > > 
> > > > > That's quite a big patch series.  I already actually implemented
> > > > > this as part of the RFC for getting the per namespace measurement
> > > > > log.  The attached is basically what I did.
> > > > > 
> > > > > Most of the time we don't require namespacing the actual virtualfs
> > > > > file, because it's world readable.  IMA has a special requirement
> > > > > in this regard because the IMA files should be readable (and
> > > > > writeable when we get around to policy updates) by the admin of the
> > > > > namespace but their protection is 0640 or 0440.  I thought the
> > > > > simplest solution would be to make an additional flag that coped
> > > > > with the permissions and a per-inode flag way of making the file as
> > > > > "accessible by userns admin".  Doing something simple like this
> > > > > gives a much smaller diffstat:
> > > > 
> > > > That's a NAK from me. Stefan's series might be bigger but it does
> > > > things correctly. I appreciate the keep it simple attitude but no. I
> > > > won't speciale-case securityfs or similar stuff in core vfs helpers.
> > > 
> > > Well, there's a reason it's an unpublished patch.  However, the more
> > > important point is that namespacing IMA requires discussion of certain
> > > points that we never seem to drive to a conclusion.  Using the akpm
> > > method, I propose simple patches that drive the discussion.  I think
> > > the points are:
> > > 
> > >    1. Should IMA be its own namespace or tied to the user namespace?  The
> > >       previous patches all took the separate Namespace approach, but I
> > >       think that should be reconsidered now keyrings are in the user
> > >       namespace.
> > 
> > Well that purely depends on the needed scope.
> > 
> > The audit container identifier is a neat thing.  But it absolutely must
> > be settable, so seems to conflict with your needs.
> > 
> > Your patch puts an identifier on the user_namespace.  I'm not quite sure,
> > does that satisfy Stefan's needs?  A new ima ns if and only if there is a
> > new user ns?
> 
> I kept thinking about this question while I was out running and while I
> admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> to me that this is the right approach after all. Making it part of
> userns at least in this form isn't clean.
> 
> I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> Additionally, I think a uuid only for the userns is too limited. This is
> similar to the problem of the audit container id. If we have some sort
> of uuid for ima it will automatically evolve into something like a
> container id (I'm not even arguing that this is necessarily wrong.).
> We also have the issue that we then have the container audit id thing -
> if this ever lands and the ima userns uuid. All that makes it quite
> messy.
> 
> I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> to be decoupled from the userns and self-contained as possible. This
> also means that ima ns works for privileged containers which sure is a
> valid use-case.
> It will also make securityfs namespacing easier as it can be a keyed
> super where the key is the ima ns (similar to how we deal with e.g.
> mqueue).

s/ima ns/userns/g

which is what Stefan already did in the link he shared.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 15:27               ` Stefan Berger
@ 2021-11-29 16:23                 ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 16:23 UTC (permalink / raw)
  To: Stefan Berger, Christian Brauner
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 10:27 -0500, Stefan Berger wrote:
> On 11/29/21 09:46, James Bottomley wrote:
[...]
> > And, of course, the fun ones we're coming to.
> > 
> >     1. Given that the current keyring namespacing doesn't give
> > access to the system keyrings, how do we get per-namespace access
> > for .ima/_ima system keyrings given that the namespace admin is
> > going to want to set their own key for appraisal?
> >     2. What mechanism should we use for .ima/_ima key setting?  The
> > current mechanism is must be signed by a key in the system keyrings
> > sounds appropriate, but is problematic given most system owners
> > don't actually have any private keys for keys in the system
> > keyrings. Hopefully the MoK keyring patches will help us have an
> > easier approach to this.
> 
> The approach we took in the previous implementation was to support
> BYOK (bring your own key) for every container. The (trusted)
> container runtime has to do what dracut would typically do, namely
> create the keyrings and load the keys onto it.

Right, that's why I think the staged approach is the right way to do
this.  Let's first of all try to get an IMA namespace that can be used
to measure things and then worry about trying to add appraisal to the
namespace later, since it's a separable use case.  I think they how of
appraisal is one of the more complex issues, but it shouldn't hold up
actually starting to get an IMA namespace upstream and encouraging use
cases.

> The container runtime would
> 
> 1a) expect keys to be found inside a container's filesystem at the
> usual location (/etc/keys/ima) but then also allow for a CA key that
> is used to verify the signature of those keys; that CA key is
> typically baked  into the Linux kernel when .ima is to be used, but
> for containers and BYOK it's an additional file in the container's
> filesystem
> 
> 1b) passing in keys via command line should be possible as well but 
> that's an implementation detail

I do think the namespace should use a mechanism that's as close as
possible to the one on the physical system, so it shouldn't depend on
the orchestration system doing anything (there might not even be one).

The current way you install a new ima key is by adding it as root to
_ima or by adding one signed by a key on the system keyring to .ima; I
think this mechanism should be usable inside the namespace as well ...
that's not to say there can't be other mechanisms, but mirroring the
existing mechanism should be part of this.

> 2) container runtime sets up either a restricted keyring [ 
> https://elixir.bootlin.com/linux/latest/source/Documentation/crypto/asymmetric-keys.rst#L359 
> ] if that CA key is found in the filesystem or a 'normal' keyring.
> The container runtime then loads the keys onto that keyring; call
> that keyring '.ima' or '_ima' for as long as the kernel knows what
> keyring to search for.

and this keyring should be owned by the admin of the IMA namespace.

>  We created that keyring under a session keyring. With
> the user namespace isolation and keyrings support in the user
> namespace the  isolation of the IMA related keyrings between
> different user namespaces should be possible.

Yes, I think so.  Really system keyrings are keyrings root can't create
but which are owned by root, so it can see them.  This mechanism should
carry over to the admin of the IMA namespace, so I see no reason why we
can't set up mirrors of .ima/_ima in the keyring (user) namespace.

Part of the reason why I think using the user namespace for the IMA
namespace is because this will have to be done on first create of a new
IMA namespace, so it's much easier if the keyring and IMA namespaces
are the same namespace.

> The same would be done for the IMA policy where the container
> runtime also needs to do some work that usually dracut would do:
> 
> - expect the IMA policy for the container at the usual location 
> (/etc/ima/ima-policy) and load it into the container's 'securityfs' 
> policy file

Right, since non system containers don't boot, we discount the kernel
command line option.  Once we have the securityfs namespaced for the
IMA files, doing the policy write should be trivial.  The next question
would be how to process it so that we don't cause a security breach
(like turn of measurement on a filesystem where the original policy
required it).

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 15:35               ` Serge E. Hallyn
  2021-11-29 16:07                 ` Stefan Berger
  2021-11-29 16:16                 ` Christian Brauner
@ 2021-11-29 16:44                 ` James Bottomley
  2021-11-30  4:59                   ` Serge E. Hallyn
  2 siblings, 1 reply; 46+ messages in thread
From: James Bottomley @ 2021-11-29 16:44 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Christian Brauner, Stefan Berger, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 09:35 -0600, Serge E. Hallyn wrote:
> On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
[...]
> > Well, there's a reason it's an unpublished patch.  However, the
> > more important point is that namespacing IMA requires discussion of
> > certain points that we never seem to drive to a conclusion.  Using
> > the akpm method, I propose simple patches that drive the
> > discussion.  I think the points are:
> > 
> >    1. Should IMA be its own namespace or tied to the user
> > namespace?  The previous patches all took the separate Namespace
> > approach, but I think that should be reconsidered now keyrings are
> > in the user namespace.
> 
> Well that purely depends on the needed scope.
> 
> The audit container identifier is a neat thing.  But it absolutely
> must be settable, so seems to conflict with your needs.

I think not allowing duplicate entries for the lifetime of the log is
required, which causes a problem since namespaces can die before this
lifetime ends.  I think there is a nice security benefit in making it
not user settable, but I don't think that's necessarily a requirement.

> Your patch puts an identifier on the user_namespace.  I'm not quite
> sure, does that satisfy Stefan's needs?  A new ima ns if and only if
> there is a new user ns?

Part of the problem is that IMA needs an admin user ... to be able to
read the log and set the policy and, when we get to appraisal, set and
read the keyrings.  IMA NS iff user ns satisfies this, but the
minimalist in me then asks why not make them the same thing?

> I think you two need to get together and discuss the requirements,
> and come back with a brief but very precise document explaining what
> you need.

> Are you both looking at the same use case?  Who is consuming the
> audit log, and to what end?  Container administrators?  Any time they
> log in? How do they assure themselves that the securityfs file
> they're reading hasn't been overmounted?

There are several different use cases.  I'm asking how I would use the
IMA namespace for the unprivileged containers I usually set up by
hand.  Stefan is looking at how docker/kubernetes would do it.  There's
also the Huawei use case which is a sort of attestation for function as
a service and there's the Red Hat use case for OpenShift.

However, the common denominator in all of these is they require a way
to uniquely distinguish the containers, which is why the patch series I
sent as an RFC starts that way.  If we can start with the common
elements, we can build towards something that satisfies all the use
cases ... and allow consensus to emerge as further patches are
discussed.

Part of my problem is I don't really know what I need, I just want IMA
namespaces to work easily for the unprivileged use case and I'll figure
it out as I play with it ... but to do that I need something to start
playing with.

> I need to find a document to read about IMA's usage of PCRs.  For
> namespacing, are you expecting each container to be hooked up to a
> swtmp instance so they have their own PCR they can use?

That's one of the most complicated things of all: trying to attach a
vTPM to store the local extensions and quote the IMA log in the
namespace.  The Huawei patch doesn't have it, because they don't really
need it (they just want global attestation to work), the IBM patches
do.  I think there are many ways of attesting to the subset of the log
in the namespace, so I don't think a vTPM is required.  I do, however,
think it should be supported for the use cases that need it.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 16:16                 ` Christian Brauner
  2021-11-29 16:23                   ` Christian Brauner
@ 2021-11-29 17:04                   ` Stefan Berger
  2021-11-29 17:29                     ` James Bottomley
                                       ` (2 more replies)
  1 sibling, 3 replies; 46+ messages in thread
From: Stefan Berger @ 2021-11-29 17:04 UTC (permalink / raw)
  To: Christian Brauner, Serge E. Hallyn
  Cc: James Bottomley, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner


On 11/29/21 11:16, Christian Brauner wrote:
> On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
>> On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
>>> On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
>>>> On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
>>>>
> I kept thinking about this question while I was out running and while I
> admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> to me that this is the right approach after all. Making it part of
> userns at least in this form isn't clean.
>
> I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> Additionally, I think a uuid only for the userns is too limited. This is
> similar to the problem of the audit container id. If we have some sort
> of uuid for ima it will automatically evolve into something like a
> container id (I'm not even arguing that this is necessarily wrong.).
> We also have the issue that we then have the container audit id thing -
> if this ever lands and the ima userns uuid. All that makes it quite
> messy.
>
> I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> to be decoupled from the userns and self-contained as possible. This
> also means that ima ns works for privileged containers which sure is a
> valid use-case.

The thing is that piggy backing on the user namespace at least allows us 
to 'always see' where IMA's keyring is (across setns()). If we were 
using an independent IMA namespace how would we guarantee that the user 
sees the keyring for IMA appraisal? We would at least have to take a 
reference (as in get_user_ns()) to the user namespace when the IMA 
namespace is created so that it at least the association of IMA 
namespace to user namespace remains a constant and the keyring IMA is 
using (and are held by the user namespace) is also constant across 
setns()'s. Otherwise it is possible that the user could do setns() to a 
different user namespace and be confused about the keys IMA is using. So 
at least by piggy backing we have them together. The aspect here may be 
'usability'.

I am somewhat sold on the USER namespace piggy backing thing... as 
suggested by James.


> It will also make securityfs namespacing easier as it can be a keyed
> super where the key is the ima ns (similar to how we deal with e.g.
> mqueue).

Yes, mqueue is where I got the (API usage) idea from how to switch out 
the reference to the user namespace needed for the 'keyed' approach.

I will massage my 20 patch-series a bit more and then post it (for 
reference....). It does have 'somewhat dramatic' things in there that 
stem from virtualizing securityfs for example and IMA being a dependent 
of the user namespace and taking one more reference to the user 
namespace (it is a dependent of) and then the user namespace not being 
able to easily delete:

It's this here to help with an early tear-down to decrease the reference 
counter.

- 
https://github.com/stefanberger/linux-ima-namespaces/commit/1a5d7f2598764ca6f1a8c5a391672543fef83f2c

- 
https://github.com/stefanberger/linux-ima-namespaces/commit/d246f501f977e64333ecbd8bb79994e23b552b9b

- 
https://github.com/stefanberger/linux-ima-namespaces/commit/3b82058936862d7623b3a06bc1749d5efc018ab1#diff-99458ca9139231ac3811dbb0c0fced442c46c7cfdb94e86e4553fc0329d3a79bR647-R651

The teardown variable makes this a controlled process but ... is it 
acceptable?

    Stefan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 17:04                   ` Stefan Berger
@ 2021-11-29 17:29                     ` James Bottomley
  2021-11-30  5:03                     ` Serge E. Hallyn
  2021-11-30 13:38                     ` Christian Brauner
  2 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-29 17:29 UTC (permalink / raw)
  To: Stefan Berger, Christian Brauner, Serge E. Hallyn
  Cc: linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, 2021-11-29 at 12:04 -0500, Stefan Berger wrote:
> On 11/29/21 11:16, Christian Brauner wrote:
[...]
> > I kept thinking about this question while I was out running and
> > while I admittedly have reacted poorly to CLONE_NEWIMA patches
> > before it feels to me that this is the right approach after all.
> > Making it part of userns at least in this form isn't clean.
> > 
> > I think attaching a uuid to a userns alone for the sake of IMA is
> > wrong. Additionally, I think a uuid only for the userns is too
> > limited. This is similar to the problem of the audit container id.
> > If we have some sort of uuid for ima it will automatically evolve
> > into something like a container id (I'm not even arguing that this
> > is necessarily wrong.). We also have the issue that we then have
> > the container audit id thing - if this ever lands and the ima
> > userns uuid. All that makes it quite messy.
> > 
> > I think CLONE_NEWIMA is ultimately nicer and allows the
> > implementation to be decoupled from the userns and self-contained
> > as possible.  This also means that ima ns works for privileged
> > containers which sure is a valid use-case.
> 
> The thing is that piggy backing on the user namespace at least allows
> us to 'always see' where IMA's keyring is (across setns()). If we
> were  using an independent IMA namespace how would we guarantee that
> the user sees the keyring for IMA appraisal? We would at least have
> to take a  reference (as in get_user_ns()) to the user namespace when
> the IMA namespace is created so that it at least the association of
> IMA namespace to user namespace remains a constant and the keyring
> IMA is using (and are held by the user namespace) is also constant
> across setns()'s. Otherwise it is possible that the user could do
> setns() to a different user namespace and be confused about the keys
> IMA is using. So at least by piggy backing we have them together. The
> aspect here may be 'usability'.
> 
> I am somewhat sold on the USER namespace piggy backing thing... as 
> suggested by James.

And to be clear, I don't think a separate IMA namespace is necessarily
wrong a priori.  However, all the original patch sets had the separate
namespace approach and I worry that the keyring namespace in user
namespace has forced our hand somewhat, so I wanted at least to post a
semi-usable patch set showing what would happen if the IMA namespace
were the user namespace for illustration so it would serve as a basis
for discussion.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 16:07                 ` Stefan Berger
@ 2021-11-30  4:42                   ` Serge E. Hallyn
  0 siblings, 0 replies; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-30  4:42 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Serge E. Hallyn, James Bottomley, Christian Brauner,
	linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, Nov 29, 2021 at 11:07:18AM -0500, Stefan Berger wrote:
> 
> On 11/29/21 10:35, Serge E. Hallyn wrote:
> > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > On Mon, 2021-11-29 at 08:53 -0500, Stefan Berger wrote:
> > > > > > On 11/29/21 07:50, James Bottomley wrote:
> > > > > > > On Sun, 2021-11-28 at 22:58 -0600, Serge E. Hallyn wrote:
> > > > > > > > On Sat, Nov 27, 2021 at 04:45:49PM +0000, James Bottomley
> > > > > > > > wrote:
> > > > > > > > > Currently we get one entry in the IMA log per unique file
> > > > > > > > > event.  So, if you have a measurement policy and it
> > > > > > > > > measures a particular binary it will not get measured again
> > > > > > > > > if it is subsequently executed. For Namespaced IMA, the
> > > > > > > > > correct behaviour seems to be to log once per inode per
> > > > > > > > > namespace (so every unique execution in a namespace gets a
> > > > > > > > > separate log entry).  Since logging once per inode per
> > > > > > > > > namespace is
> > > > > > > > I suspect I'll need to do a more in depth reading of the
> > > > > > > > existing code, but I'll ask the lazy question anyway (since
> > > > > > > > you say "the correct behavior seems to be") - is it actually
> > > > > > > > important that files which were appraised under a parent
> > > > > > > > namespace's policy already should be logged again?
> > > > > > > I think so.  For a couple of reasons, assuming the namespace
> > > > > > > eventually gets its own log entries, which the next incremental
> > > > > > > patch proposed to do by virtualizing the securityfs
> > > > > > > entries.  If you don't do this:
> > > > > > To avoid duplicate efforts, an implementation of a virtualized
> > > > > > securityfs is in this series here:
> > > > > > 
> > > > > > https://github.com/stefanberger/linux-ima-namespaces/commits/v5.15%2Bimans.20211119.v3
> > > > > > 
> > > > > > It starts with 'securityfs: Prefix global variables with
> > > > > > secruityfs_'
> > > > > That's quite a big patch series.  I already actually implemented
> > > > > this as part of the RFC for getting the per namespace measurement
> > > > > log.  The attached is basically what I did.
> > > > > 
> > > > > Most of the time we don't require namespacing the actual virtualfs
> > > > > file, because it's world readable.  IMA has a special requirement
> > > > > in this regard because the IMA files should be readable (and
> > > > > writeable when we get around to policy updates) by the admin of the
> > > > > namespace but their protection is 0640 or 0440.  I thought the
> > > > > simplest solution would be to make an additional flag that coped
> > > > > with the permissions and a per-inode flag way of making the file as
> > > > > "accessible by userns admin".  Doing something simple like this
> > > > > gives a much smaller diffstat:
> > > > That's a NAK from me. Stefan's series might be bigger but it does
> > > > things correctly. I appreciate the keep it simple attitude but no. I
> > > > won't speciale-case securityfs or similar stuff in core vfs helpers.
> > > Well, there's a reason it's an unpublished patch.  However, the more
> > > important point is that namespacing IMA requires discussion of certain
> > > points that we never seem to drive to a conclusion.  Using the akpm
> > > method, I propose simple patches that drive the discussion.  I think
> > > the points are:
> > > 
> > >     1. Should IMA be its own namespace or tied to the user namespace?  The
> > >        previous patches all took the separate Namespace approach, but I
> > >        think that should be reconsidered now keyrings are in the user
> > >        namespace.
> > Well that purely depends on the needed scope.
> > 
> > The audit container identifier is a neat thing.  But it absolutely must
> > be settable, so seems to conflict with your needs.
> > 
> > Your patch puts an identifier on the user_namespace.  I'm not quite sure,
> > does that satisfy Stefan's needs?  A new ima ns if and only if there is a
> > new user ns?
> > 
> > I think you two need to get together and discuss the requirements, and come
> > back with a brief but very precise document explaining what you need.
> 
> What would those want who look at audit messages? [Idk] Would they want a

Oh, I think you may have misunderstood me.  I meant you and James.  I
didn't mean that you shoule have a common implementation with the
container-audit patchset.

> constant identifier for IMA audit messages in the audit log across all
> restarts of a container? Presumably that would make quick queries across
> restarts much easier. Or could they live with an audit message emitted from
> the container runtime indicating that this time the (IMA) audit messages
> from this container will have this UUID here?
> 
> I guess both would 'work.'
> 
> > 
> > Are you both looking at the same use case?  Who is consuming the audit
> > log, and to what end?  Container administrators?  Any time they log in?
> > How do they assure themselves that the securityfs file they're reading
> > hasn't been overmounted?
> 
> The question is also should there only be one identifier or can there be two
> different one (one from audit patch series and uuid of user namespace).
> 
> 
> > 
> > I need to find a document to read about IMA's usage of PCRs.  For
> > namespacing, are you expecting each container to be hooked up to a
> > swtmp instance so they have their own PCR they can use?
> 
> It's complicated and there's a bit more to this... I would try to architect
> it in a way that the IMA system policy can cover what's going on inside IMA
> namespaces, i.e., audit and measure and appraise file accesses occurring in
> those namespace. We call it hierarchical processing ( https://github.com/stefanberger/linux-ima-namespaces/commit/e88dc84ec97753fd65d302ee1bf03951001ab48f
> ) where file access are evaluated against the current namespace's policy and
> then also evaluated against those of parent namespaces back to the
> init_ima_ns. The goal is to avoid evasion of measurements etc. by the user
> just by spawning new IMA namespaces. I think logging into the IMA system log
> will not scale well if there are hundreds of containers on the system using
> IMA and logging into the system log and hammering the TPM. So, the answer
> then is write your policy in such a way that it doesn't cover the IMA/user
> namespaces (containers) and have each container have its own IMA policy and
> IMA log and and an optional vTPM. So my answer would be 'optional swtpm.'
> 
>    Stefan
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 16:44                 ` James Bottomley
@ 2021-11-30  4:59                   ` Serge E. Hallyn
  2021-11-30 13:00                     ` James Bottomley
  0 siblings, 1 reply; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-30  4:59 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, Christian Brauner, Stefan Berger,
	linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, Nov 29, 2021 at 11:44:35AM -0500, James Bottomley wrote:
> On Mon, 2021-11-29 at 09:35 -0600, Serge E. Hallyn wrote:
> > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> [...]
> > > Well, there's a reason it's an unpublished patch.  However, the
> > > more important point is that namespacing IMA requires discussion of
> > > certain points that we never seem to drive to a conclusion.  Using
> > > the akpm method, I propose simple patches that drive the
> > > discussion.  I think the points are:
> > > 
> > >    1. Should IMA be its own namespace or tied to the user
> > > namespace?  The previous patches all took the separate Namespace
> > > approach, but I think that should be reconsidered now keyrings are
> > > in the user namespace.
> > 
> > Well that purely depends on the needed scope.
> > 
> > The audit container identifier is a neat thing.  But it absolutely
> > must be settable, so seems to conflict with your needs.
> 
> I think not allowing duplicate entries for the lifetime of the log is
> required, which causes a problem since namespaces can die before this
> lifetime ends.  I think there is a nice security benefit in making it
> not user settable, but I don't think that's necessarily a requirement.
> 
> > Your patch puts an identifier on the user_namespace.  I'm not quite
> > sure, does that satisfy Stefan's needs?  A new ima ns if and only if
> > there is a new user ns?
> 
> Part of the problem is that IMA needs an admin user ... to be able to
> read the log and set the policy and, when we get to appraisal, set and
> read the keyrings.  IMA NS iff user ns satisfies this, but the
> minimalist in me then asks why not make them the same thing?
> 
> > I think you two need to get together and discuss the requirements,
> > and come back with a brief but very precise document explaining what
> > you need.
> 
> > Are you both looking at the same use case?  Who is consuming the
> > audit log, and to what end?  Container administrators?  Any time they
> > log in? How do they assure themselves that the securityfs file
> > they're reading hasn't been overmounted?
> 
> There are several different use cases.  I'm asking how I would use the
> IMA namespace for the unprivileged containers I usually set up by
> hand.  Stefan is looking at how docker/kubernetes would do it.  There's
> also the Huawei use case which is a sort of attestation for function as
> a service and there's the Red Hat use case for OpenShift.
> 
> However, the common denominator in all of these is they require a way
> to uniquely distinguish the containers, which is why the patch series I
> sent as an RFC starts that way.  If we can start with the common
> elements, we can build towards something that satisfies all the use
> cases ... and allow consensus to emerge as further patches are
> discussed.

The reason I asked this question in response to this patch is because
here I'm not picking at the userns->uuid, but rather it's the new linked
lists for the inode that feel wrong.  So if you can get what you need
some other way - maybe just "we opened all these files and got no
integrity failure messages", or a hash table keyed on (userns *, inode *)
instead of the linked lists to look up whether an inode has been measured,
or some userspace daemon to resubmit already logged approvals, which I
gather won't work for unpriv containers - that would be nice.

> Part of my problem is I don't really know what I need, I just want IMA
> namespaces to work easily for the unprivileged use case and I'll figure
> it out as I play with it ... but to do that I need something to start
> playing with.

But for that kind of research you use an out of tree patchset, not
speculative infrastructure in the kernel.  If that's what this
patchset is, then I'll (feel a little silly and) look over it with a
different set of eyes :)

> > I need to find a document to read about IMA's usage of PCRs.  For
> > namespacing, are you expecting each container to be hooked up to a
> > swtmp instance so they have their own PCR they can use?
> 
> That's one of the most complicated things of all: trying to attach a
> vTPM to store the local extensions and quote the IMA log in the
> namespace.  The Huawei patch doesn't have it, because they don't really
> need it (they just want global attestation to work), the IBM patches
> do.  I think there are many ways of attesting to the subset of the log
> in the namespace, so I don't think a vTPM is required.  I do, however,
> think it should be supported for the use cases that need it.
> 
> James
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 17:04                   ` Stefan Berger
  2021-11-29 17:29                     ` James Bottomley
@ 2021-11-30  5:03                     ` Serge E. Hallyn
  2021-11-30 11:55                       ` Stefan Berger
  2021-11-30 13:44                       ` Christian Brauner
  2021-11-30 13:38                     ` Christian Brauner
  2 siblings, 2 replies; 46+ messages in thread
From: Serge E. Hallyn @ 2021-11-30  5:03 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Christian Brauner, Serge E. Hallyn, James Bottomley,
	linux-integrity, containers, Mimi Zohar, Dmitry Kasatkin,
	Eric W . Biederman, krzysztof.struczynski, Roberto Sassu,
	Michael Peters, Luke Hinds, Lily Sturmann, Patrick Uiterwijk,
	Christian Brauner

On Mon, Nov 29, 2021 at 12:04:29PM -0500, Stefan Berger wrote:
> 
> On 11/29/21 11:16, Christian Brauner wrote:
> > On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> > > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > 
> > I kept thinking about this question while I was out running and while I
> > admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> > to me that this is the right approach after all. Making it part of
> > userns at least in this form isn't clean.
> > 
> > I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> > Additionally, I think a uuid only for the userns is too limited. This is
> > similar to the problem of the audit container id. If we have some sort
> > of uuid for ima it will automatically evolve into something like a
> > container id (I'm not even arguing that this is necessarily wrong.).
> > We also have the issue that we then have the container audit id thing -
> > if this ever lands and the ima userns uuid. All that makes it quite
> > messy.
> > 
> > I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> > to be decoupled from the userns and self-contained as possible. This
> > also means that ima ns works for privileged containers which sure is a
> > valid use-case.
> 
> The thing is that piggy backing on the user namespace at least allows us to
> 'always see' where IMA's keyring is (across setns()). If we were using an
> independent IMA namespace how would we guarantee that the user sees the
> keyring for IMA appraisal? We would at least have to take a reference (as in
> get_user_ns()) to the user namespace when the IMA namespace is created so
> that it at least the association of IMA namespace to user namespace remains

Maybe we pull they keyring info into a new struct which is referred
to and pinned by both user_ns and ima_ns?  (But I actually am ignorant
of how ima is using the keyrings, so again I need to go do some reading.)

More moving parts isn't my first choice.  But if you need namespaced IMA
for containers that aren't doing CLONE_NEWUSER, then a separate ima_ns is
your only option.  Is that a requirement for you?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-30  5:03                     ` Serge E. Hallyn
@ 2021-11-30 11:55                       ` Stefan Berger
  2021-11-30 13:33                         ` Christian Brauner
  2021-11-30 13:44                       ` Christian Brauner
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Berger @ 2021-11-30 11:55 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Christian Brauner, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner


On 11/30/21 00:03, Serge E. Hallyn wrote:
> On Mon, Nov 29, 2021 at 12:04:29PM -0500, Stefan Berger wrote:
>> On 11/29/21 11:16, Christian Brauner wrote:
>>> On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
>>>> On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
>>>>> On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
>>>>>> On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
>>>>>>
>>> I kept thinking about this question while I was out running and while I
>>> admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
>>> to me that this is the right approach after all. Making it part of
>>> userns at least in this form isn't clean.
>>>
>>> I think attaching a uuid to a userns alone for the sake of IMA is wrong.
>>> Additionally, I think a uuid only for the userns is too limited. This is
>>> similar to the problem of the audit container id. If we have some sort
>>> of uuid for ima it will automatically evolve into something like a
>>> container id (I'm not even arguing that this is necessarily wrong.).
>>> We also have the issue that we then have the container audit id thing -
>>> if this ever lands and the ima userns uuid. All that makes it quite
>>> messy.
>>>
>>> I think CLONE_NEWIMA is ultimately nicer and allows the implementation
>>> to be decoupled from the userns and self-contained as possible. This
>>> also means that ima ns works for privileged containers which sure is a
>>> valid use-case.
>> The thing is that piggy backing on the user namespace at least allows us to
>> 'always see' where IMA's keyring is (across setns()). If we were using an
>> independent IMA namespace how would we guarantee that the user sees the
>> keyring for IMA appraisal? We would at least have to take a reference (as in
>> get_user_ns()) to the user namespace when the IMA namespace is created so
>> that it at least the association of IMA namespace to user namespace remains
> Maybe we pull they keyring info into a new struct which is referred
> to and pinned by both user_ns and ima_ns?  (But I actually am ignorant
> of how ima is using the keyrings, so again I need to go do some reading.)
>
> More moving parts isn't my first choice.  But if you need namespaced IMA
> for containers that aren't doing CLONE_NEWUSER, then a separate ima_ns is
> your only option.  Is that a requirement for you?

I cannot think of a scenario where a user wouldn't want/refuse to open a 
user namespace to get an IMA namespace...



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-30  4:59                   ` Serge E. Hallyn
@ 2021-11-30 13:00                     ` James Bottomley
  0 siblings, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-30 13:00 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Christian Brauner, Stefan Berger, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 22:59 -0600, Serge E. Hallyn wrote:
> On Mon, Nov 29, 2021 at 11:44:35AM -0500, James Bottomley wrote:
> > On Mon, 2021-11-29 at 09:35 -0600, Serge E. Hallyn wrote:
> > > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > [...]
> > > > Well, there's a reason it's an unpublished patch.  However, the
> > > > more important point is that namespacing IMA requires
> > > > discussion of certain points that we never seem to drive to a
> > > > conclusion.  Using the akpm method, I propose simple patches
> > > > that drive the discussion.  I think the points are:
> > > > 
> > > >    1. Should IMA be its own namespace or tied to the user
> > > > namespace?  The previous patches all took the separate
> > > > Namespace approach, but I think that should be reconsidered now
> > > > keyrings are in the user namespace.
> > > 
> > > Well that purely depends on the needed scope.
> > > 
> > > The audit container identifier is a neat thing.  But it
> > > absolutely must be settable, so seems to conflict with your
> > > needs.
> > 
> > I think not allowing duplicate entries for the lifetime of the log
> > is required, which causes a problem since namespaces can die before
> > this lifetime ends.  I think there is a nice security benefit in
> > making it not user settable, but I don't think that's necessarily a
> > requirement.
> > 
> > > Your patch puts an identifier on the user_namespace.  I'm not
> > > quite sure, does that satisfy Stefan's needs?  A new ima ns if
> > > and only if there is a new user ns?
> > 
> > Part of the problem is that IMA needs an admin user ... to be able
> > to read the log and set the policy and, when we get to appraisal,
> > set and read the keyrings.  IMA NS iff user ns satisfies this, but
> > the minimalist in me then asks why not make them the same thing?
> > 
> > > I think you two need to get together and discuss the
> > > requirements, and come back with a brief but very precise
> > > document explaining what you need. Are you both looking at the
> > > same use case?  Who is consuming the audit log, and to what
> > > end?  Container administrators?  Any time they log in? How do
> > > they assure themselves that the securityfs file they're reading
> > > hasn't been overmounted?
> > 
> > There are several different use cases.  I'm asking how I would use
> > the IMA namespace for the unprivileged containers I usually set up
> > by hand.  Stefan is looking at how docker/kubernetes would do
> > it.  There's also the Huawei use case which is a sort of
> > attestation for function as a service and there's the Red Hat use
> > case for OpenShift.
> > 
> > However, the common denominator in all of these is they require a
> > way to uniquely distinguish the containers, which is why the patch
> > series I sent as an RFC starts that way.  If we can start with the
> > common elements, we can build towards something that satisfies all
> > the use cases ... and allow consensus to emerge as further patches
> > are discussed.
> 
> The reason I asked this question in response to this patch is because
> here I'm not picking at the userns->uuid, but rather it's the new
> linked lists for the inode that feel wrong.

Well that one's fully separable.  The first two patches could be
complete for this round.  However, if you believe there has to be one
entry per inode per namespace then the iint cache needs to accommodate
that.  The measurement is a function of the inode alone: if the inode
doesn't change that value is the same regardless of namespace.  it's
only whether it's been logged that's really per namespace, hence the
namespace list hanging off the iint entry ... if there were a huge
number of namespaces, perhaps it should be a btree instead of a list,
but it needs to be some type of two level thing, I think?

The design of patch 3 is mostly to get people to think about what
should be in the per namespace log.

>   So if you can get what you need some other way - maybe just "we
> opened all these files and got no integrity failure messages", or a
> hash table keyed on (userns *, inode *) instead of the linked lists
> to look up whether an inode has been measured, or some userspace
> daemon to resubmit already logged approvals, which I gather won't
> work for unpriv containers - that would be nice.

I could do a separate (userns *, inode *) btree for the measured part,
but we'd still have to have the per inode store of the measurement
because that's namespace blind.  Given this, it seems inefficient not
to make use of the tie between the two.

> > Part of my problem is I don't really know what I need, I just want
> > IMA namespaces to work easily for the unprivileged use case and
> > I'll figure it out as I play with it ... but to do that I need
> > something to start playing with.
> 
> But for that kind of research you use an out of tree patchset, not
> speculative infrastructure in the kernel.  If that's what this
> patchset is, then I'll (feel a little silly and) look over it with a
> different set of eyes :)

Well, no, you're looking with the right set of eyes.  The design of
this patch set is to begin incrementally with the pieces everyone can
agree on, so start as small as it can be: the namespace and a label as
a trial balloon.  If everyone had agreed it looked OK, there would be
no reason not to put it upstream and start on the next step.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 1/3] userns: add uuid field
  2021-11-29 13:56                       ` Christian Brauner
  2021-11-29 14:19                         ` Stefan Berger
@ 2021-11-30 13:09                         ` James Bottomley
  1 sibling, 0 replies; 46+ messages in thread
From: James Bottomley @ 2021-11-30 13:09 UTC (permalink / raw)
  To: Christian Brauner, Stefan Berger
  Cc: Serge E. Hallyn, linux-integrity, containers, Mimi Zohar,
	Dmitry Kasatkin, Eric W . Biederman, krzysztof.struczynski,
	Roberto Sassu, Michael Peters, Luke Hinds, Lily Sturmann,
	Patrick Uiterwijk, Christian Brauner

On Mon, 2021-11-29 at 14:56 +0100, Christian Brauner wrote:
> On Mon, Nov 29, 2021 at 08:49:40AM -0500, Stefan Berger wrote:
> > On 11/28/21 20:59, Serge E. Hallyn wrote:
> > > On Sun, Nov 28, 2021 at 05:56:28PM -0500, James Bottomley wrote:
> > > > On Sun, 2021-11-28 at 15:49 -0600, Serge E. Hallyn wrote:
> > > > > On Sun, Nov 28, 2021 at 04:21:29PM -0500, James Bottomley
> > > > > wrote:
> > > > > > On Sun, 2021-11-28 at 14:47 -0600, Serge E. Hallyn wrote:
> > > > > > > On Sun, Nov 28, 2021 at 01:00:28PM -0500, James Bottomley
> > > > > > > wrote:
> > > > > > > > On Sun, 2021-11-28 at 09:18 -0600, Serge E. Hallyn
> > > > > > > > wrote:
> > > > [...]
> > > > > > > > > So given that 'unique per boot' is sufficient, what
> > > > > > > > > would be the problem with simply adding a simple
> > > > > > > > > ever-increasing unique atomix count to the struct
> > > > > > > > > user_namespace?
> > > > > > > > I don't think there is any ... but I equally don't see
> > > > > > > > why people would want to save and restore the uuid but
> > > > > > > > not the new monotonic identifier ... because it's still
> > > > > > > > just a marker on a namespace.
> > > > > > > But you've called it "the namespace uuid".  I'm not even
> > > > > > > really thinking of checkpoint/restart, just stopping and
> > > > > > > restarting a container.  I'm convinced people will want
> > > > > > > to start using it because, well, it is a nice feature.
> > > > > > Right, but the uniqueness property depends on you not being
> > > > > > able to set it.  If you just want a namespace label, you
> > > > > > can have that, but anything a user can set is either a pain
> > > > > > to guarantee uniqueness (have to check all the other
> > > > > > objects) or is simply a non-unique label.
> > > > > > 
> > > > > > If you want to label a container, which could have many
> > > > > > namespaces and be stopped and restarted many times, it does
> > > > > > sound like you want a non-unique settable label.  However,
> > > > > > IMA definitely needs a guaranteed per namespace unique
> > > > > > label.
> > > > > > 
> > > > > > Is the objection simply you think a UUID sound like it
> > > > > > should be
> > > > > Objection is too strong.  Concern.
> > > > > 
> > > > > But yes, to me a uuid (a) feels like it should be generally
> > > > > useful including being settable and (b) not super duper 100%
> > > > > absolutely guaranteed to always be unique per boot, as an
> > > > > incremented counter would be.
> > > > OK, but a bunch of cats I found on the Internet agree with me,
> > > > a UUID shouldn't be settable:
> > > > 
> > > > https://en.wikipedia.org/wiki/Universally_unique_identifier
> > > > 
> > > > The key point being, if you can set the id, it can't be unique
> > > > ... it
> > > Ok, so can you just put a comment above there saying "this must
> > > not be settable from userspace" ?
> > 
> > So I have been working on an IMA namespacing series again as well
> > and would like to use some sort of unique identifier for audit
> > messages emitted from an IMA/user namespace other than the
> > init_ima_ns. This UUID may just work for this, but how would one
> > associate the UUID with a container if it ever comes to that when
> > evaluating audit logs? Shouldn't it be settable from user
> > space for some sort of 'coordination' between container runtime and
> > kernel?
> 
> Wouldn't this be solved by the audit container id patchset? In fact,
> can't we use this for IMA as well?

Stefan asked, but it really doesn't have the properties we need, plus
they don't seem to want the audit id used as the container id.

How about this:  Since the label has to be unique for the lifetime of
the system, if we allow it to be settable, we'll have to carry it
outside the namespace anyway because memory of the label has to live on
after the namespace dies to avoid duplication, so I'll move it into a
parallel namespace structure ima will carry.  it will be settable once,
but if you don't set it before it's used, then we'll set it to a
randombly generated uuid.  If you do set it, it will be checked for
uniqueness against all previous labels.

James



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-30 11:55                       ` Stefan Berger
@ 2021-11-30 13:33                         ` Christian Brauner
  0 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-30 13:33 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Serge E. Hallyn, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Tue, Nov 30, 2021 at 06:55:51AM -0500, Stefan Berger wrote:
> 
> On 11/30/21 00:03, Serge E. Hallyn wrote:
> > On Mon, Nov 29, 2021 at 12:04:29PM -0500, Stefan Berger wrote:
> > > On 11/29/21 11:16, Christian Brauner wrote:
> > > > On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> > > > > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > > > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > > > 
> > > > I kept thinking about this question while I was out running and while I
> > > > admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> > > > to me that this is the right approach after all. Making it part of
> > > > userns at least in this form isn't clean.
> > > > 
> > > > I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> > > > Additionally, I think a uuid only for the userns is too limited. This is
> > > > similar to the problem of the audit container id. If we have some sort
> > > > of uuid for ima it will automatically evolve into something like a
> > > > container id (I'm not even arguing that this is necessarily wrong.).
> > > > We also have the issue that we then have the container audit id thing -
> > > > if this ever lands and the ima userns uuid. All that makes it quite
> > > > messy.
> > > > 
> > > > I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> > > > to be decoupled from the userns and self-contained as possible. This
> > > > also means that ima ns works for privileged containers which sure is a
> > > > valid use-case.
> > > The thing is that piggy backing on the user namespace at least allows us to
> > > 'always see' where IMA's keyring is (across setns()). If we were using an
> > > independent IMA namespace how would we guarantee that the user sees the
> > > keyring for IMA appraisal? We would at least have to take a reference (as in
> > > get_user_ns()) to the user namespace when the IMA namespace is created so
> > > that it at least the association of IMA namespace to user namespace remains
> > Maybe we pull they keyring info into a new struct which is referred
> > to and pinned by both user_ns and ima_ns?  (But I actually am ignorant
> > of how ima is using the keyrings, so again I need to go do some reading.)
> > 
> > More moving parts isn't my first choice.  But if you need namespaced IMA
> > for containers that aren't doing CLONE_NEWUSER, then a separate ima_ns is
> > your only option.  Is that a requirement for you?
> 
> I cannot think of a scenario where a user wouldn't want/refuse to open a
> user namespace to get an IMA namespace...

The point is that unprivileged containers are still rare. And for quitea
few workloads they won't go away. Especially for a lot of Kubernetes
workloads - even if it finally has userns support - won't be able to
migrate to unprivileged. And I would think that they would want ima
support as well. But utlimately that's something the integrity folks
have to decide. If you decide that not having this feature for
non-CLONE_NEWUSER workloads then so be it.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-29 17:04                   ` Stefan Berger
  2021-11-29 17:29                     ` James Bottomley
  2021-11-30  5:03                     ` Serge E. Hallyn
@ 2021-11-30 13:38                     ` Christian Brauner
  2 siblings, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-30 13:38 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Serge E. Hallyn, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 12:04:29PM -0500, Stefan Berger wrote:
> 
> On 11/29/21 11:16, Christian Brauner wrote:
> > On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> > > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > 
> > I kept thinking about this question while I was out running and while I
> > admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> > to me that this is the right approach after all. Making it part of
> > userns at least in this form isn't clean.
> > 
> > I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> > Additionally, I think a uuid only for the userns is too limited. This is
> > similar to the problem of the audit container id. If we have some sort
> > of uuid for ima it will automatically evolve into something like a
> > container id (I'm not even arguing that this is necessarily wrong.).
> > We also have the issue that we then have the container audit id thing -
> > if this ever lands and the ima userns uuid. All that makes it quite
> > messy.
> > 
> > I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> > to be decoupled from the userns and self-contained as possible. This
> > also means that ima ns works for privileged containers which sure is a
> > valid use-case.
> 
> The thing is that piggy backing on the user namespace at least allows us to
> 'always see' where IMA's keyring is (across setns()). If we were using an
> independent IMA namespace how would we guarantee that the user sees the
> keyring for IMA appraisal? We would at least have to take a reference (as in
> get_user_ns()) to the user namespace when the IMA namespace is created so
> that it at least the association of IMA namespace to user namespace remains
> a constant and the keyring IMA is using (and are held by the user namespace)

I don't think that this needs to be any different from other namespaces
that have an owning userns.

> is also constant across setns()'s. Otherwise it is possible that the user
> could do setns() to a different user namespace and be confused about the
> keys IMA is using. So at least by piggy backing we have them together. The
> aspect here may be 'usability'.

I mean, we do already have a dependence between pid namespaces and user
namespace etc., i.e. before you can join a pidns as an unpriv user you
need to join the userns. I think we can easily introduce a dependency
there. (Note also that a while back I extended setns() to take a pidfd
as an argument and you can now specify setns(pidfd, CLONE_NEWUSER |
CLONE_NEWPID | CLONE_NEWIMA).)

It's even worse in a sense since we can joing CLONE_NEWUSER in different
order depending whether we're privileging or devprivileging ourselves
(see the messy logic in nsenter and new*idmap for example). So there's
precedence for requiring dependencies between namespaces during setns().

> 
> I am somewhat sold on the USER namespace piggy backing thing... as suggested
> by James.
> 
> 
> > It will also make securityfs namespacing easier as it can be a keyed
> > super where the key is the ima ns (similar to how we deal with e.g.
> > mqueue).
> 
> Yes, mqueue is where I got the (API usage) idea from how to switch out the
> reference to the user namespace needed for the 'keyed' approach.
> 
> I will massage my 20 patch-series a bit more and then post it (for
> reference....). It does have 'somewhat dramatic' things in there that stem
> from virtualizing securityfs for example and IMA being a dependent of the
> user namespace and taking one more reference to the user namespace (it is a
> dependent of) and then the user namespace not being able to easily delete:
> 
> It's this here to help with an early tear-down to decrease the reference
> counter.
> 
> - https://github.com/stefanberger/linux-ima-namespaces/commit/1a5d7f2598764ca6f1a8c5a391672543fef83f2c
> 
> - https://github.com/stefanberger/linux-ima-namespaces/commit/d246f501f977e64333ecbd8bb79994e23b552b9b
> 
> - https://github.com/stefanberger/linux-ima-namespaces/commit/3b82058936862d7623b3a06bc1749d5efc018ab1#diff-99458ca9139231ac3811dbb0c0fced442c46c7cfdb94e86e4553fc0329d3a79bR647-R651
> 
> The teardown variable makes this a controlled process but ... is it
> acceptable?

I'll try to take a look later this week if that's ok.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 3/3] ima: make the integrity inode cache per namespace
  2021-11-30  5:03                     ` Serge E. Hallyn
  2021-11-30 11:55                       ` Stefan Berger
@ 2021-11-30 13:44                       ` Christian Brauner
  1 sibling, 0 replies; 46+ messages in thread
From: Christian Brauner @ 2021-11-30 13:44 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Stefan Berger, James Bottomley, linux-integrity, containers,
	Mimi Zohar, Dmitry Kasatkin, Eric W . Biederman,
	krzysztof.struczynski, Roberto Sassu, Michael Peters, Luke Hinds,
	Lily Sturmann, Patrick Uiterwijk, Christian Brauner

On Mon, Nov 29, 2021 at 11:03:17PM -0600, Serge Hallyn wrote:
> On Mon, Nov 29, 2021 at 12:04:29PM -0500, Stefan Berger wrote:
> > 
> > On 11/29/21 11:16, Christian Brauner wrote:
> > > On Mon, Nov 29, 2021 at 09:35:39AM -0600, Serge Hallyn wrote:
> > > > On Mon, Nov 29, 2021 at 09:46:55AM -0500, James Bottomley wrote:
> > > > > On Mon, 2021-11-29 at 15:22 +0100, Christian Brauner wrote:
> > > > > > On Mon, Nov 29, 2021 at 09:10:29AM -0500, James Bottomley wrote:
> > > > > > 
> > > I kept thinking about this question while I was out running and while I
> > > admittedly have reacted poorly to CLONE_NEWIMA patches before it feels
> > > to me that this is the right approach after all. Making it part of
> > > userns at least in this form isn't clean.
> > > 
> > > I think attaching a uuid to a userns alone for the sake of IMA is wrong.
> > > Additionally, I think a uuid only for the userns is too limited. This is
> > > similar to the problem of the audit container id. If we have some sort
> > > of uuid for ima it will automatically evolve into something like a
> > > container id (I'm not even arguing that this is necessarily wrong.).
> > > We also have the issue that we then have the container audit id thing -
> > > if this ever lands and the ima userns uuid. All that makes it quite
> > > messy.
> > > 
> > > I think CLONE_NEWIMA is ultimately nicer and allows the implementation
> > > to be decoupled from the userns and self-contained as possible. This
> > > also means that ima ns works for privileged containers which sure is a
> > > valid use-case.
> > 
> > The thing is that piggy backing on the user namespace at least allows us to
> > 'always see' where IMA's keyring is (across setns()). If we were using an
> > independent IMA namespace how would we guarantee that the user sees the
> > keyring for IMA appraisal? We would at least have to take a reference (as in
> > get_user_ns()) to the user namespace when the IMA namespace is created so
> > that it at least the association of IMA namespace to user namespace remains
> 
> Maybe we pull they keyring info into a new struct which is referred
> to and pinned by both user_ns and ima_ns?  (But I actually am ignorant
> of how ima is using the keyrings, so again I need to go do some reading.)

That's one way of doing it and we'd be able to shrink struct
user_namespace because of it if we have to go down that road anyway.

> 
> More moving parts isn't my first choice.  But if you need namespaced IMA
> for containers that aren't doing CLONE_NEWUSER, then a separate ima_ns is
> your only option.  Is that a requirement for you?

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2021-11-30 13:44 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-27 16:45 [RFC 0/3] Namespace IMA James Bottomley
2021-11-27 16:45 ` [RFC 1/3] userns: add uuid field James Bottomley
2021-11-28  4:45   ` Serge E. Hallyn
2021-11-28 13:29     ` James Bottomley
2021-11-28 15:18       ` Serge E. Hallyn
2021-11-28 18:00         ` James Bottomley
2021-11-28 20:47           ` Serge E. Hallyn
2021-11-28 21:21             ` James Bottomley
2021-11-28 21:49               ` Serge E. Hallyn
2021-11-28 22:56                 ` James Bottomley
2021-11-29  1:59                   ` Serge E. Hallyn
2021-11-29 13:49                     ` Stefan Berger
2021-11-29 13:56                       ` Christian Brauner
2021-11-29 14:19                         ` Stefan Berger
2021-11-30 13:09                         ` James Bottomley
2021-11-29 13:12                 ` Christian Brauner
2021-11-29 13:46                   ` James Bottomley
2021-11-27 16:45 ` [RFC 2/3] ima: Namespace IMA James Bottomley
2021-11-29  2:52   ` Serge E. Hallyn
2021-11-27 16:45 ` [RFC 3/3] ima: make the integrity inode cache per namespace James Bottomley
2021-11-29  4:58   ` Serge E. Hallyn
2021-11-29 12:50     ` James Bottomley
2021-11-29 13:53       ` Stefan Berger
2021-11-29 14:10         ` James Bottomley
2021-11-29 14:22           ` Christian Brauner
2021-11-29 14:46             ` James Bottomley
2021-11-29 15:27               ` Stefan Berger
2021-11-29 16:23                 ` James Bottomley
2021-11-29 15:35               ` Serge E. Hallyn
2021-11-29 16:07                 ` Stefan Berger
2021-11-30  4:42                   ` Serge E. Hallyn
2021-11-29 16:16                 ` Christian Brauner
2021-11-29 16:23                   ` Christian Brauner
2021-11-29 17:04                   ` Stefan Berger
2021-11-29 17:29                     ` James Bottomley
2021-11-30  5:03                     ` Serge E. Hallyn
2021-11-30 11:55                       ` Stefan Berger
2021-11-30 13:33                         ` Christian Brauner
2021-11-30 13:44                       ` Christian Brauner
2021-11-30 13:38                     ` Christian Brauner
2021-11-29 16:44                 ` James Bottomley
2021-11-30  4:59                   ` Serge E. Hallyn
2021-11-30 13:00                     ` James Bottomley
2021-11-29 14:30           ` Stefan Berger
2021-11-29 15:08             ` James Bottomley
2021-11-29 16:20             ` Christian Brauner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.