All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 18:59 ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, stefanb, vgoyal, amir73il,
	linux-security-module

This series of patches primary goal is to enable file capabilities
in user namespaces without affecting the file capabilities that are
effective on the host. This is to prevent that any unprivileged user
on the host maps his own uid to root in a private namespace, writes
the xattr, and executes the file with privilege on the host.

We achieve this goal by writing extended attributes with a different
name when a user namespace is used. If for example the root user
in a user namespace writes the security.capability xattr, the name
of the xattr that is actually written is encoded as
security.capability@uid=1000 for root mapped to uid 1000 on the host.
When listing the xattrs on the host, the existing security.capability
as well as the security.capability@uid=1000 will be shown. Inside the
namespace only 'security.capability', with the value of
security.capability@uid=1000, is visible.

To maintain compatibility with existing behavior, the value of
security.capability of the host is shown inside the user namespace
once the security.capability of the user namespace has been removed
(which really removes security.capability@uid=1000). Writing to
an extended attribute inside a user namespace effectively hides the
extended attribute of the host.

The general framework that is established with these patches can
be applied to other extended attributes as well, such as security.ima
or the 'trusted.' prefix . Another extended attribute that needed to
be enabled here is 'security.selinux,' since otherwise this extended
attribute would not be shown anymore inside a user namespace.

Regards,
   Stefan & Serge


Stefan Berger (3):
  xattr: Enable security.capability in user namespaces
  Enable capabilities of files from shared filesystem
  Enable security.selinux in user namespaces

 fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 +++-
 security/selinux/hooks.c |   9 +-
 3 files changed, 501 insertions(+), 16 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 18:59 ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: linux-security-module

This series of patches primary goal is to enable file capabilities
in user namespaces without affecting the file capabilities that are
effective on the host. This is to prevent that any unprivileged user
on the host maps his own uid to root in a private namespace, writes
the xattr, and executes the file with privilege on the host.

We achieve this goal by writing extended attributes with a different
name when a user namespace is used. If for example the root user
in a user namespace writes the security.capability xattr, the name
of the xattr that is actually written is encoded as
security.capability at uid=1000 for root mapped to uid 1000 on the host.
When listing the xattrs on the host, the existing security.capability
as well as the security.capability at uid=1000 will be shown. Inside the
namespace only 'security.capability', with the value of
security.capability at uid=1000, is visible.

To maintain compatibility with existing behavior, the value of
security.capability of the host is shown inside the user namespace
once the security.capability of the user namespace has been removed
(which really removes security.capability at uid=1000). Writing to
an extended attribute inside a user namespace effectively hides the
extended attribute of the host.

The general framework that is established with these patches can
be applied to other extended attributes as well, such as security.ima
or the 'trusted.' prefix . Another extended attribute that needed to
be enabled here is 'security.selinux,' since otherwise this extended
attribute would not be shown anymore inside a user namespace.

Regards,
   Stefan & Serge


Stefan Berger (3):
  xattr: Enable security.capability in user namespaces
  Enable capabilities of files from shared filesystem
  Enable security.selinux in user namespaces

 fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 +++-
 security/selinux/hooks.c |   9 +-
 3 files changed, 501 insertions(+), 16 deletions(-)

-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 18:59 ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2069 bytes --]

This series of patches primary goal is to enable file capabilities
in user namespaces without affecting the file capabilities that are
effective on the host. This is to prevent that any unprivileged user
on the host maps his own uid to root in a private namespace, writes
the xattr, and executes the file with privilege on the host.

We achieve this goal by writing extended attributes with a different
name when a user namespace is used. If for example the root user
in a user namespace writes the security.capability xattr, the name
of the xattr that is actually written is encoded as
security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
When listing the xattrs on the host, the existing security.capability
as well as the security.capability(a)uid=1000 will be shown. Inside the
namespace only 'security.capability', with the value of
security.capability(a)uid=1000, is visible.

To maintain compatibility with existing behavior, the value of
security.capability of the host is shown inside the user namespace
once the security.capability of the user namespace has been removed
(which really removes security.capability(a)uid=1000). Writing to
an extended attribute inside a user namespace effectively hides the
extended attribute of the host.

The general framework that is established with these patches can
be applied to other extended attributes as well, such as security.ima
or the 'trusted.' prefix . Another extended attribute that needed to
be enabled here is 'security.selinux,' since otherwise this extended
attribute would not be shown anymore inside a user namespace.

Regards,
   Stefan & Serge


Stefan Berger (3):
  xattr: Enable security.capability in user namespaces
  Enable capabilities of files from shared filesystem
  Enable security.selinux in user namespaces

 fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 +++-
 security/selinux/hooks.c |   9 +-
 3 files changed, 501 insertions(+), 16 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 1/3] xattr: Enable security.capability in user namespaces
       [not found] ` <1498157989-11814-1-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2017-06-22 18:59   ` Stefan Berger
  2017-06-22 18:59     ` Stefan Berger
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

This patch enables security.capability in user namespaces but also
takes a more general approach to enabling extended attributes in user
namespaces.

The following rules describe the approach using security.foo as a
'user namespace enabled' extended attribute:

Reading of extended attributes:

1) Reading security.foo from a user namespace will read
   security.foo@uid=<uid> of the parent user namespace instead with uid
   being the mapping of root in that parent user namespace. An
   exception is if root is mapped to uid 0 on the host, and in this case
   we will read security.foo directly.
   --> reading security.foo will read security.foo@uid=1000 for uid
       mapping of root to 1000.

2) All security.foo@uid=<uid> with valid uid mapping in the user namespace
   can be read. The uid within the user namespace will be mapped to the
   corresponding uid on the host and that uid will be used in the name of
   the extended attribute.
   -> reading security.foo@uid=1 will read security.foo@uid=1001 for uid
      mapping of root to 1000, size of at least 2.

   All security.foo@uid=<uid> can be read (by root) on the host with values
   of <uid> also being subject to checking for valid mappings.

3) No other security.foo* can be read.

The same rules for reading apply to writing and removing of user
namespace enabled extended attributes.

When listing extended attributes of a file, only those are presented
to the user namespace that have a valid mapping. Besides that, names
of the extended attributes are adjusted to represent the mapping.
This means that if root is mapped to uid 1000 on the host, the
security.foo@uid=1000 will be listed as security.foo in the user
namespace, security.foo@uid=1001 becomes security.foo@uid=1 and so on.

Signed-off-by: Stefan Berger <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Serge Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Serge Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
---
 fs/xattr.c               | 433 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 ++--
 security/selinux/hooks.c |   9 +-
 3 files changed, 462 insertions(+), 16 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 464c94b..64c4b40 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -133,11 +133,405 @@ xattr_permission(struct inode *inode, const char *name, int mask)
 	return inode_permission(inode, mask);
 }
 
+/*
+ * A list of extended attributes that are supported in user namespaces
+ */
+static const char *const userns_xattrs[] = {
+	XATTR_NAME_CAPS,
+	NULL
+};
+
+/*
+ * xattrs_is_userns_supported - Check whether an xattr is supported in userns
+ *
+ * @name:   full name of the extended attribute
+ * @prefix: do a prefix match (true) or a full match (false)
+ *
+ * This function returns < 0 if not supported, an index into userns_xattrs[]
+ * otherwise.
+ */
+static int
+xattr_is_userns_supported(const char *name, int prefix)
+{
+	int i;
+
+	if (!name)
+		return -1;
+
+	for (i = 0; userns_xattrs[i]; i++) {
+		if (prefix) {
+			if (!strncmp(userns_xattrs[i], name,
+				     strlen(userns_xattrs[i])))
+				return i;
+		} else {
+			if (!strcmp(userns_xattrs[i], name))
+				return i;
+		}
+	}
+	return -1;
+}
+
+/*
+ * xattr_write_uid - print a string in the format of "%s@uid=%u", which
+ *                   includes a prefix strig
+ *
+ * @uid:     the uid
+ * @prefix:  prefix string; may be NULL
+ *
+ * This function returns a buffer with the string, or a NULL pointer in
+ * case of out-of-memory error.
+ */
+static char *
+xattr_write_uid(uid_t uid, const char *prefix)
+{
+	size_t buflen;
+	char *buffer;
+
+	buflen = sizeof("@uid=") - 1 + sizeof("4294967295") - 1 + 1;
+	if (prefix)
+		buflen += strlen(prefix);
+
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer)
+		return NULL;
+
+	if (uid == 0)
+		*buffer = 0;
+	else
+		sprintf(buffer, "%s@uid=%u",
+			(prefix) ? prefix : "",
+			uid);
+
+	return buffer;
+}
+
+/*
+ * xattr_parse_uid_from_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @n_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_from_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t muid, p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	/* do we have a mapping of the uid? */
+	tuid = KUIDT_INIT(p_uid);
+	muid = from_kuid(userns, tuid);
+	if (muid == -1)
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(muid, NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_parse_uid_make_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @N_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_make_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	tuid = make_kuid(userns, p_uid);
+	if (!uid_valid(tuid))
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(__kuid_val(tuid), NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_rewrite_userns_xattr - Rewrite and filter an extended attribute
+ *                              considering user namespace uid mappings and
+ *                              user namespace support extended attributes
+ *
+ * @name: full name of the extended attribute
+ *
+ * This function returns NULL if the name is to be filtered. Otherwise it can
+ * return the input buffer or a new buffer that the caller needs to free. The
+ * new buffer contains a rewritten extended attribute whose string length may
+ * exceed that of the given name.
+ */
+static char *
+xattr_rewrite_userns_xattr(char *name)
+{
+	int idx, error;
+	size_t len = 0, buflen;
+	char *buffer, *n_uidstr;
+
+	/* prefix-match name against supported attributes */
+	idx = xattr_is_userns_supported(name, true);
+	if (idx < 0)
+		return NULL;
+
+	/* exact match ? */
+	len = strlen(userns_xattrs[idx]);
+	if (name[len] == 0)
+		return NULL;
+
+	/*
+	 * We must have a name[len] == '@'.
+	 */
+	error = xattr_parse_uid_from_kuid(&name[len], current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return NULL;
+
+	buflen = len + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	name[len] = 0;
+
+	snprintf(buffer, buflen, "%s%s", name, n_uidstr);
+
+	name[len] = '@';
+
+	kfree(n_uidstr);
+
+	return buffer;
+}
+
+/*
+ * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
+ *                             or determine needed size for attribute list
+ *                             in case size == 0
+ *
+ * In a user namespace we do not present all extended attributes to the
+ * user. We filter out those that are in the list of userns supported xattr.
+ * Besides that we filter out those with @uid=<uid> when there is no mapping
+ * for that uid in the current user namespace.
+ *
+ * @list:        list of 0-byte separated xattr names
+ * @size:        the size of the list; may be 0 to determine needed list size
+ * @list_maxlen: allocated buffer size of list
+ */
+static ssize_t
+xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
+{
+	char *nlist = NULL;
+	size_t s_off, len, nlen;
+	ssize_t d_off;
+	char *name, *newname;
+
+	if (!list || size < 0 || current_user_ns() == &init_user_ns)
+		return size;
+
+	if (size) {
+		nlist = kmalloc(list_maxlen, GFP_KERNEL);
+		if (!nlist)
+			return -ENOMEM;
+	}
+
+	s_off = d_off = 0;
+	while (s_off < size || size == 0) {
+		name = &list[s_off];
+
+		len = strlen(name);
+		if (!len)
+			break;
+
+		newname = xattr_rewrite_userns_xattr(name);
+		if (IS_ERR(newname)) {
+			d_off = PTR_ERR(newname);
+			goto out_free;
+		}
+		if (newname) {
+			nlen = strlen(newname);
+
+			if (nlist) {
+				if (nlen + 1 > list_maxlen)
+					break;
+				strcpy(&nlist[d_off], newname);
+			}
+
+			d_off += nlen + 1;
+			if (newname != name)
+				kfree(newname);
+		}
+		s_off += len + 1;
+	}
+	if (nlist)
+		memcpy(list, nlist, d_off);
+out_free:
+	kfree(nlist);
+
+	return d_off;
+}
+
+/*
+ * xattr_userns_name - modify the name of a user namespace supported
+ *                     extended attribute
+ *
+ * In a user namespace we prevent read/write accesses to the host's
+ * security.foo to protect these extended attributes.
+ *
+ * Reading:
+ * 1) Reading security.foo from a user namespace will read
+ *    security.foo@uid=<uid> of the parent user namespace instead with uid
+ *    being the mapping of root in that parent user namespace. An
+ *    exception is if root is mapped to uid 0 on the host, and in this case
+ *    we will read security.foo directly.
+ *    -> reading security.foo will read security.foo@uid=1000 for a uid
+ *       mapping of root to 1000.
+ *
+ * 2) All security.foo@uid=<uid> with valid uid mappings in the user namespace
+ *    an be read. The uid within the user namespace will be mapped to the
+ *    corresponding uid on the host and that uid will be used in the name of
+ *    the extended attribute.
+ *    -> reading security.foo@uid=1 will read security.foo@uid=1001 for a uid
+ *       mapping of root to 1000, size of at least 2.
+ *
+ *    All security.foo@uid=<uid> can be read (by root) on the host with values
+ *    of <uid> also being subject to checking for valid mappings.
+ *
+ * 3) No other security.foo* can be read.
+ *
+ * Writing and removing:
+ * The same rules for reading apply to writing and removing.
+ *
+ * This function returns a buffer with either the original name or the
+ * user namespace adjusted name of the extended attribute.
+ *
+ * @fullname: the full name of the extended attribute, e.g. security.foo
+ * @suffix:   the suffix of the extended attribute, e.g. foo
+ * @is_write: whether this is for writing an xattr
+ */
+char *
+xattr_userns_name(const char *fullname, const char *suffix)
+{
+	size_t buflen;
+	char *buffer, *ptr, *n_uidstr;
+	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	int idx, error;
+	size_t len = 0, slen;
+
+	if (!suffix)
+		return ERR_PTR(-EINVAL);
+
+	/* only security.foo will be changed here - prefix match here */
+	idx = xattr_is_userns_supported(fullname, true);
+	if (idx == -1)
+		goto out_copy;
+
+	/* read security.foo? --> read security.foo@uid=<uid> instead */
+	len = strlen(userns_xattrs[idx]);
+	if (fullname[len] == 0) {
+		/*
+		 * init_user_ns or userns with root mapped to uid 0
+		 * may read security.foo directly
+		 */
+		if (current_user_ns() == &init_user_ns ||
+		    __kuid_val(root_uid) == 0)
+			goto out_copy;
+
+		if (!uid_valid(root_uid))
+			return ERR_PTR(-EINVAL);
+
+		buffer = xattr_write_uid(__kuid_val(root_uid), suffix);
+		if (!buffer)
+			return ERR_PTR(-ENOMEM);
+
+		return buffer;
+	}
+
+	/*
+	 * We must have fullname[len] == '@'.
+	 */
+	error = xattr_parse_uid_make_kuid(&fullname[len],
+					  current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return ERR_PTR(error);
+
+	/* suffix of fullname must have '@' */
+	ptr = strchr(suffix, '@');
+	if (!ptr) {
+		kfree(n_uidstr);
+		goto err_eperm;
+	}
+	slen = ptr - suffix;
+
+	/* suffix[slen] = '@' */
+	buflen = strlen(suffix) + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	snprintf(buffer, slen + 1, "%s", suffix);
+	snprintf(&buffer[slen], buflen - slen, "%s", n_uidstr);
+	kfree(n_uidstr);
+
+	return buffer;
+
+out_copy:
+	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+	strcpy(buffer, suffix);
+
+	return buffer;
+
+err_eperm:
+	return ERR_PTR(-EPERM);
+}
+
 int
 __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       const void *value, size_t size, int flags)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
@@ -146,7 +540,12 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	return handler->set(handler, dentry, inode, name, value, size, flags);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_setxattr);
 
@@ -302,13 +701,21 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       void *value, size_t size)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	return handler->get(handler, dentry, inode, name, value, size);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
 
@@ -328,8 +735,14 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 				XATTR_SECURITY_PREFIX_LEN)) {
+		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		int ret = xattr_getsecurity(inode, suffix, value, size);
+		char *nsuffix = xattr_userns_name(name, suffix);
+
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = xattr_getsecurity(inode, nsuffix, value, size);
+		kfree(nsuffix);
 		/*
 		 * Only overwrite the return value if a security module
 		 * is actually active.
@@ -360,6 +773,9 @@ vfs_listxattr(struct dentry *dentry, char *list, size_t size)
 		if (size && error > size)
 			error = -ERANGE;
 	}
+	if (error > 0)
+		error = xattr_list_userns_rewrite(list, error, size);
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(vfs_listxattr);
@@ -369,13 +785,22 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 {
 	struct inode *inode = d_inode(dentry);
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	return handler->set(handler, dentry, inode, name, NULL, 0, XATTR_REPLACE);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
+			   XATTR_REPLACE);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_removexattr);
 
diff --git a/security/commoncap.c b/security/commoncap.c
index 7abebd7..c842690 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -660,15 +660,23 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 int cap_inode_setxattr(struct dentry *dentry, const char *name,
 		       const void *value, size_t size, int flags)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
@@ -686,15 +694,23 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
  */
 int cap_inode_removexattr(struct dentry *dentry, const char *name)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 819fd68..702c225 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3091,8 +3091,13 @@ static int selinux_inode_setotherxattr(struct dentry *dentry, const char *name)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 		     sizeof XATTR_SECURITY_PREFIX - 1)) {
-		if (!strcmp(name, XATTR_NAME_CAPS)) {
-			if (!capable(CAP_SETFCAP))
+		if (!strncmp(name, XATTR_NAME_CAPS,
+			     sizeof(XATTR_NAME_CAPS) - 1)) {
+			struct inode *inode = d_backing_inode(dentry);
+
+			if (!inode)
+				return -EINVAL;
+			if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 				return -EPERM;
 		} else if (!capable(CAP_SYS_ADMIN)) {
 			/* A different attribute in the security namespace.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 1/3] xattr: Enable security.capability in user namespaces
  2017-06-22 18:59 ` Stefan Berger
  (?)
@ 2017-06-22 18:59   ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, stefanb, vgoyal, amir73il,
	linux-security-module

This patch enables security.capability in user namespaces but also
takes a more general approach to enabling extended attributes in user
namespaces.

The following rules describe the approach using security.foo as a
'user namespace enabled' extended attribute:

Reading of extended attributes:

1) Reading security.foo from a user namespace will read
   security.foo@uid=<uid> of the parent user namespace instead with uid
   being the mapping of root in that parent user namespace. An
   exception is if root is mapped to uid 0 on the host, and in this case
   we will read security.foo directly.
   --> reading security.foo will read security.foo@uid=1000 for uid
       mapping of root to 1000.

2) All security.foo@uid=<uid> with valid uid mapping in the user namespace
   can be read. The uid within the user namespace will be mapped to the
   corresponding uid on the host and that uid will be used in the name of
   the extended attribute.
   -> reading security.foo@uid=1 will read security.foo@uid=1001 for uid
      mapping of root to 1000, size of at least 2.

   All security.foo@uid=<uid> can be read (by root) on the host with values
   of <uid> also being subject to checking for valid mappings.

3) No other security.foo* can be read.

The same rules for reading apply to writing and removing of user
namespace enabled extended attributes.

When listing extended attributes of a file, only those are presented
to the user namespace that have a valid mapping. Besides that, names
of the extended attributes are adjusted to represent the mapping.
This means that if root is mapped to uid 1000 on the host, the
security.foo@uid=1000 will be listed as security.foo in the user
namespace, security.foo@uid=1001 becomes security.foo@uid=1 and so on.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c               | 433 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 ++--
 security/selinux/hooks.c |   9 +-
 3 files changed, 462 insertions(+), 16 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 464c94b..64c4b40 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -133,11 +133,405 @@ xattr_permission(struct inode *inode, const char *name, int mask)
 	return inode_permission(inode, mask);
 }
 
+/*
+ * A list of extended attributes that are supported in user namespaces
+ */
+static const char *const userns_xattrs[] = {
+	XATTR_NAME_CAPS,
+	NULL
+};
+
+/*
+ * xattrs_is_userns_supported - Check whether an xattr is supported in userns
+ *
+ * @name:   full name of the extended attribute
+ * @prefix: do a prefix match (true) or a full match (false)
+ *
+ * This function returns < 0 if not supported, an index into userns_xattrs[]
+ * otherwise.
+ */
+static int
+xattr_is_userns_supported(const char *name, int prefix)
+{
+	int i;
+
+	if (!name)
+		return -1;
+
+	for (i = 0; userns_xattrs[i]; i++) {
+		if (prefix) {
+			if (!strncmp(userns_xattrs[i], name,
+				     strlen(userns_xattrs[i])))
+				return i;
+		} else {
+			if (!strcmp(userns_xattrs[i], name))
+				return i;
+		}
+	}
+	return -1;
+}
+
+/*
+ * xattr_write_uid - print a string in the format of "%s@uid=%u", which
+ *                   includes a prefix strig
+ *
+ * @uid:     the uid
+ * @prefix:  prefix string; may be NULL
+ *
+ * This function returns a buffer with the string, or a NULL pointer in
+ * case of out-of-memory error.
+ */
+static char *
+xattr_write_uid(uid_t uid, const char *prefix)
+{
+	size_t buflen;
+	char *buffer;
+
+	buflen = sizeof("@uid=") - 1 + sizeof("4294967295") - 1 + 1;
+	if (prefix)
+		buflen += strlen(prefix);
+
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer)
+		return NULL;
+
+	if (uid == 0)
+		*buffer = 0;
+	else
+		sprintf(buffer, "%s@uid=%u",
+			(prefix) ? prefix : "",
+			uid);
+
+	return buffer;
+}
+
+/*
+ * xattr_parse_uid_from_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @n_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_from_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t muid, p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	/* do we have a mapping of the uid? */
+	tuid = KUIDT_INIT(p_uid);
+	muid = from_kuid(userns, tuid);
+	if (muid == -1)
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(muid, NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_parse_uid_make_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @N_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_make_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	tuid = make_kuid(userns, p_uid);
+	if (!uid_valid(tuid))
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(__kuid_val(tuid), NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_rewrite_userns_xattr - Rewrite and filter an extended attribute
+ *                              considering user namespace uid mappings and
+ *                              user namespace support extended attributes
+ *
+ * @name: full name of the extended attribute
+ *
+ * This function returns NULL if the name is to be filtered. Otherwise it can
+ * return the input buffer or a new buffer that the caller needs to free. The
+ * new buffer contains a rewritten extended attribute whose string length may
+ * exceed that of the given name.
+ */
+static char *
+xattr_rewrite_userns_xattr(char *name)
+{
+	int idx, error;
+	size_t len = 0, buflen;
+	char *buffer, *n_uidstr;
+
+	/* prefix-match name against supported attributes */
+	idx = xattr_is_userns_supported(name, true);
+	if (idx < 0)
+		return NULL;
+
+	/* exact match ? */
+	len = strlen(userns_xattrs[idx]);
+	if (name[len] == 0)
+		return NULL;
+
+	/*
+	 * We must have a name[len] == '@'.
+	 */
+	error = xattr_parse_uid_from_kuid(&name[len], current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return NULL;
+
+	buflen = len + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	name[len] = 0;
+
+	snprintf(buffer, buflen, "%s%s", name, n_uidstr);
+
+	name[len] = '@';
+
+	kfree(n_uidstr);
+
+	return buffer;
+}
+
+/*
+ * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
+ *                             or determine needed size for attribute list
+ *                             in case size == 0
+ *
+ * In a user namespace we do not present all extended attributes to the
+ * user. We filter out those that are in the list of userns supported xattr.
+ * Besides that we filter out those with @uid=<uid> when there is no mapping
+ * for that uid in the current user namespace.
+ *
+ * @list:        list of 0-byte separated xattr names
+ * @size:        the size of the list; may be 0 to determine needed list size
+ * @list_maxlen: allocated buffer size of list
+ */
+static ssize_t
+xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
+{
+	char *nlist = NULL;
+	size_t s_off, len, nlen;
+	ssize_t d_off;
+	char *name, *newname;
+
+	if (!list || size < 0 || current_user_ns() == &init_user_ns)
+		return size;
+
+	if (size) {
+		nlist = kmalloc(list_maxlen, GFP_KERNEL);
+		if (!nlist)
+			return -ENOMEM;
+	}
+
+	s_off = d_off = 0;
+	while (s_off < size || size == 0) {
+		name = &list[s_off];
+
+		len = strlen(name);
+		if (!len)
+			break;
+
+		newname = xattr_rewrite_userns_xattr(name);
+		if (IS_ERR(newname)) {
+			d_off = PTR_ERR(newname);
+			goto out_free;
+		}
+		if (newname) {
+			nlen = strlen(newname);
+
+			if (nlist) {
+				if (nlen + 1 > list_maxlen)
+					break;
+				strcpy(&nlist[d_off], newname);
+			}
+
+			d_off += nlen + 1;
+			if (newname != name)
+				kfree(newname);
+		}
+		s_off += len + 1;
+	}
+	if (nlist)
+		memcpy(list, nlist, d_off);
+out_free:
+	kfree(nlist);
+
+	return d_off;
+}
+
+/*
+ * xattr_userns_name - modify the name of a user namespace supported
+ *                     extended attribute
+ *
+ * In a user namespace we prevent read/write accesses to the host's
+ * security.foo to protect these extended attributes.
+ *
+ * Reading:
+ * 1) Reading security.foo from a user namespace will read
+ *    security.foo@uid=<uid> of the parent user namespace instead with uid
+ *    being the mapping of root in that parent user namespace. An
+ *    exception is if root is mapped to uid 0 on the host, and in this case
+ *    we will read security.foo directly.
+ *    -> reading security.foo will read security.foo@uid=1000 for a uid
+ *       mapping of root to 1000.
+ *
+ * 2) All security.foo@uid=<uid> with valid uid mappings in the user namespace
+ *    an be read. The uid within the user namespace will be mapped to the
+ *    corresponding uid on the host and that uid will be used in the name of
+ *    the extended attribute.
+ *    -> reading security.foo@uid=1 will read security.foo@uid=1001 for a uid
+ *       mapping of root to 1000, size of at least 2.
+ *
+ *    All security.foo@uid=<uid> can be read (by root) on the host with values
+ *    of <uid> also being subject to checking for valid mappings.
+ *
+ * 3) No other security.foo* can be read.
+ *
+ * Writing and removing:
+ * The same rules for reading apply to writing and removing.
+ *
+ * This function returns a buffer with either the original name or the
+ * user namespace adjusted name of the extended attribute.
+ *
+ * @fullname: the full name of the extended attribute, e.g. security.foo
+ * @suffix:   the suffix of the extended attribute, e.g. foo
+ * @is_write: whether this is for writing an xattr
+ */
+char *
+xattr_userns_name(const char *fullname, const char *suffix)
+{
+	size_t buflen;
+	char *buffer, *ptr, *n_uidstr;
+	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	int idx, error;
+	size_t len = 0, slen;
+
+	if (!suffix)
+		return ERR_PTR(-EINVAL);
+
+	/* only security.foo will be changed here - prefix match here */
+	idx = xattr_is_userns_supported(fullname, true);
+	if (idx == -1)
+		goto out_copy;
+
+	/* read security.foo? --> read security.foo@uid=<uid> instead */
+	len = strlen(userns_xattrs[idx]);
+	if (fullname[len] == 0) {
+		/*
+		 * init_user_ns or userns with root mapped to uid 0
+		 * may read security.foo directly
+		 */
+		if (current_user_ns() == &init_user_ns ||
+		    __kuid_val(root_uid) == 0)
+			goto out_copy;
+
+		if (!uid_valid(root_uid))
+			return ERR_PTR(-EINVAL);
+
+		buffer = xattr_write_uid(__kuid_val(root_uid), suffix);
+		if (!buffer)
+			return ERR_PTR(-ENOMEM);
+
+		return buffer;
+	}
+
+	/*
+	 * We must have fullname[len] == '@'.
+	 */
+	error = xattr_parse_uid_make_kuid(&fullname[len],
+					  current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return ERR_PTR(error);
+
+	/* suffix of fullname must have '@' */
+	ptr = strchr(suffix, '@');
+	if (!ptr) {
+		kfree(n_uidstr);
+		goto err_eperm;
+	}
+	slen = ptr - suffix;
+
+	/* suffix[slen] = '@' */
+	buflen = strlen(suffix) + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	snprintf(buffer, slen + 1, "%s", suffix);
+	snprintf(&buffer[slen], buflen - slen, "%s", n_uidstr);
+	kfree(n_uidstr);
+
+	return buffer;
+
+out_copy:
+	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+	strcpy(buffer, suffix);
+
+	return buffer;
+
+err_eperm:
+	return ERR_PTR(-EPERM);
+}
+
 int
 __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       const void *value, size_t size, int flags)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
@@ -146,7 +540,12 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	return handler->set(handler, dentry, inode, name, value, size, flags);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_setxattr);
 
@@ -302,13 +701,21 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       void *value, size_t size)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	return handler->get(handler, dentry, inode, name, value, size);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
 
@@ -328,8 +735,14 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 				XATTR_SECURITY_PREFIX_LEN)) {
+		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		int ret = xattr_getsecurity(inode, suffix, value, size);
+		char *nsuffix = xattr_userns_name(name, suffix);
+
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = xattr_getsecurity(inode, nsuffix, value, size);
+		kfree(nsuffix);
 		/*
 		 * Only overwrite the return value if a security module
 		 * is actually active.
@@ -360,6 +773,9 @@ vfs_listxattr(struct dentry *dentry, char *list, size_t size)
 		if (size && error > size)
 			error = -ERANGE;
 	}
+	if (error > 0)
+		error = xattr_list_userns_rewrite(list, error, size);
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(vfs_listxattr);
@@ -369,13 +785,22 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 {
 	struct inode *inode = d_inode(dentry);
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	return handler->set(handler, dentry, inode, name, NULL, 0, XATTR_REPLACE);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
+			   XATTR_REPLACE);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_removexattr);
 
diff --git a/security/commoncap.c b/security/commoncap.c
index 7abebd7..c842690 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -660,15 +660,23 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 int cap_inode_setxattr(struct dentry *dentry, const char *name,
 		       const void *value, size_t size, int flags)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
@@ -686,15 +694,23 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
  */
 int cap_inode_removexattr(struct dentry *dentry, const char *name)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 819fd68..702c225 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3091,8 +3091,13 @@ static int selinux_inode_setotherxattr(struct dentry *dentry, const char *name)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 		     sizeof XATTR_SECURITY_PREFIX - 1)) {
-		if (!strcmp(name, XATTR_NAME_CAPS)) {
-			if (!capable(CAP_SETFCAP))
+		if (!strncmp(name, XATTR_NAME_CAPS,
+			     sizeof(XATTR_NAME_CAPS) - 1)) {
+			struct inode *inode = d_backing_inode(dentry);
+
+			if (!inode)
+				return -EINVAL;
+			if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 				return -EPERM;
 		} else if (!capable(CAP_SYS_ADMIN)) {
 			/* A different attribute in the security namespace.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 1/3] xattr: Enable security.capability in user namespaces
@ 2017-06-22 18:59   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: linux-security-module

This patch enables security.capability in user namespaces but also
takes a more general approach to enabling extended attributes in user
namespaces.

The following rules describe the approach using security.foo as a
'user namespace enabled' extended attribute:

Reading of extended attributes:

1) Reading security.foo from a user namespace will read
   security.foo at uid=<uid> of the parent user namespace instead with uid
   being the mapping of root in that parent user namespace. An
   exception is if root is mapped to uid 0 on the host, and in this case
   we will read security.foo directly.
   --> reading security.foo will read security.foo at uid=1000 for uid
       mapping of root to 1000.

2) All security.foo at uid=<uid> with valid uid mapping in the user namespace
   can be read. The uid within the user namespace will be mapped to the
   corresponding uid on the host and that uid will be used in the name of
   the extended attribute.
   -> reading security.foo at uid=1 will read security.foo at uid=1001 for uid
      mapping of root to 1000, size of at least 2.

   All security.foo at uid=<uid> can be read (by root) on the host with values
   of <uid> also being subject to checking for valid mappings.

3) No other security.foo* can be read.

The same rules for reading apply to writing and removing of user
namespace enabled extended attributes.

When listing extended attributes of a file, only those are presented
to the user namespace that have a valid mapping. Besides that, names
of the extended attributes are adjusted to represent the mapping.
This means that if root is mapped to uid 1000 on the host, the
security.foo at uid=1000 will be listed as security.foo in the user
namespace, security.foo at uid=1001 becomes security.foo at uid=1 and so on.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c               | 433 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 ++--
 security/selinux/hooks.c |   9 +-
 3 files changed, 462 insertions(+), 16 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 464c94b..64c4b40 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -133,11 +133,405 @@ xattr_permission(struct inode *inode, const char *name, int mask)
 	return inode_permission(inode, mask);
 }
 
+/*
+ * A list of extended attributes that are supported in user namespaces
+ */
+static const char *const userns_xattrs[] = {
+	XATTR_NAME_CAPS,
+	NULL
+};
+
+/*
+ * xattrs_is_userns_supported - Check whether an xattr is supported in userns
+ *
+ * @name:   full name of the extended attribute
+ * @prefix: do a prefix match (true) or a full match (false)
+ *
+ * This function returns < 0 if not supported, an index into userns_xattrs[]
+ * otherwise.
+ */
+static int
+xattr_is_userns_supported(const char *name, int prefix)
+{
+	int i;
+
+	if (!name)
+		return -1;
+
+	for (i = 0; userns_xattrs[i]; i++) {
+		if (prefix) {
+			if (!strncmp(userns_xattrs[i], name,
+				     strlen(userns_xattrs[i])))
+				return i;
+		} else {
+			if (!strcmp(userns_xattrs[i], name))
+				return i;
+		}
+	}
+	return -1;
+}
+
+/*
+ * xattr_write_uid - print a string in the format of "%s at uid=%u", which
+ *                   includes a prefix strig
+ *
+ * @uid:     the uid
+ * @prefix:  prefix string; may be NULL
+ *
+ * This function returns a buffer with the string, or a NULL pointer in
+ * case of out-of-memory error.
+ */
+static char *
+xattr_write_uid(uid_t uid, const char *prefix)
+{
+	size_t buflen;
+	char *buffer;
+
+	buflen = sizeof("@uid=") - 1 + sizeof("4294967295") - 1 + 1;
+	if (prefix)
+		buflen += strlen(prefix);
+
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer)
+		return NULL;
+
+	if (uid == 0)
+		*buffer = 0;
+	else
+		sprintf(buffer, "%s@uid=%u",
+			(prefix) ? prefix : "",
+			uid);
+
+	return buffer;
+}
+
+/*
+ * xattr_parse_uid_from_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @n_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_from_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t muid, p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	/* do we have a mapping of the uid? */
+	tuid = KUIDT_INIT(p_uid);
+	muid = from_kuid(userns, tuid);
+	if (muid == -1)
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(muid, NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_parse_uid_make_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @N_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_make_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	tuid = make_kuid(userns, p_uid);
+	if (!uid_valid(tuid))
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(__kuid_val(tuid), NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_rewrite_userns_xattr - Rewrite and filter an extended attribute
+ *                              considering user namespace uid mappings and
+ *                              user namespace support extended attributes
+ *
+ * @name: full name of the extended attribute
+ *
+ * This function returns NULL if the name is to be filtered. Otherwise it can
+ * return the input buffer or a new buffer that the caller needs to free. The
+ * new buffer contains a rewritten extended attribute whose string length may
+ * exceed that of the given name.
+ */
+static char *
+xattr_rewrite_userns_xattr(char *name)
+{
+	int idx, error;
+	size_t len = 0, buflen;
+	char *buffer, *n_uidstr;
+
+	/* prefix-match name against supported attributes */
+	idx = xattr_is_userns_supported(name, true);
+	if (idx < 0)
+		return NULL;
+
+	/* exact match ? */
+	len = strlen(userns_xattrs[idx]);
+	if (name[len] == 0)
+		return NULL;
+
+	/*
+	 * We must have a name[len] == '@'.
+	 */
+	error = xattr_parse_uid_from_kuid(&name[len], current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return NULL;
+
+	buflen = len + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	name[len] = 0;
+
+	snprintf(buffer, buflen, "%s%s", name, n_uidstr);
+
+	name[len] = '@';
+
+	kfree(n_uidstr);
+
+	return buffer;
+}
+
+/*
+ * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
+ *                             or determine needed size for attribute list
+ *                             in case size == 0
+ *
+ * In a user namespace we do not present all extended attributes to the
+ * user. We filter out those that are in the list of userns supported xattr.
+ * Besides that we filter out those with @uid=<uid> when there is no mapping
+ * for that uid in the current user namespace.
+ *
+ * @list:        list of 0-byte separated xattr names
+ * @size:        the size of the list; may be 0 to determine needed list size
+ * @list_maxlen: allocated buffer size of list
+ */
+static ssize_t
+xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
+{
+	char *nlist = NULL;
+	size_t s_off, len, nlen;
+	ssize_t d_off;
+	char *name, *newname;
+
+	if (!list || size < 0 || current_user_ns() == &init_user_ns)
+		return size;
+
+	if (size) {
+		nlist = kmalloc(list_maxlen, GFP_KERNEL);
+		if (!nlist)
+			return -ENOMEM;
+	}
+
+	s_off = d_off = 0;
+	while (s_off < size || size == 0) {
+		name = &list[s_off];
+
+		len = strlen(name);
+		if (!len)
+			break;
+
+		newname = xattr_rewrite_userns_xattr(name);
+		if (IS_ERR(newname)) {
+			d_off = PTR_ERR(newname);
+			goto out_free;
+		}
+		if (newname) {
+			nlen = strlen(newname);
+
+			if (nlist) {
+				if (nlen + 1 > list_maxlen)
+					break;
+				strcpy(&nlist[d_off], newname);
+			}
+
+			d_off += nlen + 1;
+			if (newname != name)
+				kfree(newname);
+		}
+		s_off += len + 1;
+	}
+	if (nlist)
+		memcpy(list, nlist, d_off);
+out_free:
+	kfree(nlist);
+
+	return d_off;
+}
+
+/*
+ * xattr_userns_name - modify the name of a user namespace supported
+ *                     extended attribute
+ *
+ * In a user namespace we prevent read/write accesses to the host's
+ * security.foo to protect these extended attributes.
+ *
+ * Reading:
+ * 1) Reading security.foo from a user namespace will read
+ *    security.foo at uid=<uid> of the parent user namespace instead with uid
+ *    being the mapping of root in that parent user namespace. An
+ *    exception is if root is mapped to uid 0 on the host, and in this case
+ *    we will read security.foo directly.
+ *    -> reading security.foo will read security.foo at uid=1000 for a uid
+ *       mapping of root to 1000.
+ *
+ * 2) All security.foo at uid=<uid> with valid uid mappings in the user namespace
+ *    an be read. The uid within the user namespace will be mapped to the
+ *    corresponding uid on the host and that uid will be used in the name of
+ *    the extended attribute.
+ *    -> reading security.foo at uid=1 will read security.foo at uid=1001 for a uid
+ *       mapping of root to 1000, size of at least 2.
+ *
+ *    All security.foo at uid=<uid> can be read (by root) on the host with values
+ *    of <uid> also being subject to checking for valid mappings.
+ *
+ * 3) No other security.foo* can be read.
+ *
+ * Writing and removing:
+ * The same rules for reading apply to writing and removing.
+ *
+ * This function returns a buffer with either the original name or the
+ * user namespace adjusted name of the extended attribute.
+ *
+ * @fullname: the full name of the extended attribute, e.g. security.foo
+ * @suffix:   the suffix of the extended attribute, e.g. foo
+ * @is_write: whether this is for writing an xattr
+ */
+char *
+xattr_userns_name(const char *fullname, const char *suffix)
+{
+	size_t buflen;
+	char *buffer, *ptr, *n_uidstr;
+	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	int idx, error;
+	size_t len = 0, slen;
+
+	if (!suffix)
+		return ERR_PTR(-EINVAL);
+
+	/* only security.foo will be changed here - prefix match here */
+	idx = xattr_is_userns_supported(fullname, true);
+	if (idx == -1)
+		goto out_copy;
+
+	/* read security.foo? --> read security.foo at uid=<uid> instead */
+	len = strlen(userns_xattrs[idx]);
+	if (fullname[len] == 0) {
+		/*
+		 * init_user_ns or userns with root mapped to uid 0
+		 * may read security.foo directly
+		 */
+		if (current_user_ns() == &init_user_ns ||
+		    __kuid_val(root_uid) == 0)
+			goto out_copy;
+
+		if (!uid_valid(root_uid))
+			return ERR_PTR(-EINVAL);
+
+		buffer = xattr_write_uid(__kuid_val(root_uid), suffix);
+		if (!buffer)
+			return ERR_PTR(-ENOMEM);
+
+		return buffer;
+	}
+
+	/*
+	 * We must have fullname[len] == '@'.
+	 */
+	error = xattr_parse_uid_make_kuid(&fullname[len],
+					  current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return ERR_PTR(error);
+
+	/* suffix of fullname must have '@' */
+	ptr = strchr(suffix, '@');
+	if (!ptr) {
+		kfree(n_uidstr);
+		goto err_eperm;
+	}
+	slen = ptr - suffix;
+
+	/* suffix[slen] = '@' */
+	buflen = strlen(suffix) + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	snprintf(buffer, slen + 1, "%s", suffix);
+	snprintf(&buffer[slen], buflen - slen, "%s", n_uidstr);
+	kfree(n_uidstr);
+
+	return buffer;
+
+out_copy:
+	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+	strcpy(buffer, suffix);
+
+	return buffer;
+
+err_eperm:
+	return ERR_PTR(-EPERM);
+}
+
 int
 __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       const void *value, size_t size, int flags)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
@@ -146,7 +540,12 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	return handler->set(handler, dentry, inode, name, value, size, flags);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_setxattr);
 
@@ -302,13 +701,21 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       void *value, size_t size)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	return handler->get(handler, dentry, inode, name, value, size);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
 
@@ -328,8 +735,14 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 				XATTR_SECURITY_PREFIX_LEN)) {
+		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		int ret = xattr_getsecurity(inode, suffix, value, size);
+		char *nsuffix = xattr_userns_name(name, suffix);
+
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = xattr_getsecurity(inode, nsuffix, value, size);
+		kfree(nsuffix);
 		/*
 		 * Only overwrite the return value if a security module
 		 * is actually active.
@@ -360,6 +773,9 @@ vfs_listxattr(struct dentry *dentry, char *list, size_t size)
 		if (size && error > size)
 			error = -ERANGE;
 	}
+	if (error > 0)
+		error = xattr_list_userns_rewrite(list, error, size);
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(vfs_listxattr);
@@ -369,13 +785,22 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 {
 	struct inode *inode = d_inode(dentry);
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	return handler->set(handler, dentry, inode, name, NULL, 0, XATTR_REPLACE);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
+			   XATTR_REPLACE);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_removexattr);
 
diff --git a/security/commoncap.c b/security/commoncap.c
index 7abebd7..c842690 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -660,15 +660,23 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 int cap_inode_setxattr(struct dentry *dentry, const char *name,
 		       const void *value, size_t size, int flags)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
@@ -686,15 +694,23 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
  */
 int cap_inode_removexattr(struct dentry *dentry, const char *name)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 819fd68..702c225 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3091,8 +3091,13 @@ static int selinux_inode_setotherxattr(struct dentry *dentry, const char *name)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 		     sizeof XATTR_SECURITY_PREFIX - 1)) {
-		if (!strcmp(name, XATTR_NAME_CAPS)) {
-			if (!capable(CAP_SETFCAP))
+		if (!strncmp(name, XATTR_NAME_CAPS,
+			     sizeof(XATTR_NAME_CAPS) - 1)) {
+			struct inode *inode = d_backing_inode(dentry);
+
+			if (!inode)
+				return -EINVAL;
+			if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 				return -EPERM;
 		} else if (!capable(CAP_SYS_ADMIN)) {
 			/* A different attribute in the security namespace.
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 1/3] xattr: Enable security.capability in user namespaces
@ 2017-06-22 18:59   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 18688 bytes --]

This patch enables security.capability in user namespaces but also
takes a more general approach to enabling extended attributes in user
namespaces.

The following rules describe the approach using security.foo as a
'user namespace enabled' extended attribute:

Reading of extended attributes:

1) Reading security.foo from a user namespace will read
   security.foo(a)uid=<uid> of the parent user namespace instead with uid
   being the mapping of root in that parent user namespace. An
   exception is if root is mapped to uid 0 on the host, and in this case
   we will read security.foo directly.
   --> reading security.foo will read security.foo(a)uid=1000 for uid
       mapping of root to 1000.

2) All security.foo(a)uid=<uid> with valid uid mapping in the user namespace
   can be read. The uid within the user namespace will be mapped to the
   corresponding uid on the host and that uid will be used in the name of
   the extended attribute.
   -> reading security.foo(a)uid=1 will read security.foo(a)uid=1001 for uid
      mapping of root to 1000, size of at least 2.

   All security.foo(a)uid=<uid> can be read (by root) on the host with values
   of <uid> also being subject to checking for valid mappings.

3) No other security.foo* can be read.

The same rules for reading apply to writing and removing of user
namespace enabled extended attributes.

When listing extended attributes of a file, only those are presented
to the user namespace that have a valid mapping. Besides that, names
of the extended attributes are adjusted to represent the mapping.
This means that if root is mapped to uid 1000 on the host, the
security.foo(a)uid=1000 will be listed as security.foo in the user
namespace, security.foo(a)uid=1001 becomes security.foo(a)uid=1 and so on.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c               | 433 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 ++--
 security/selinux/hooks.c |   9 +-
 3 files changed, 462 insertions(+), 16 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 464c94b..64c4b40 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -133,11 +133,405 @@ xattr_permission(struct inode *inode, const char *name, int mask)
 	return inode_permission(inode, mask);
 }
 
+/*
+ * A list of extended attributes that are supported in user namespaces
+ */
+static const char *const userns_xattrs[] = {
+	XATTR_NAME_CAPS,
+	NULL
+};
+
+/*
+ * xattrs_is_userns_supported - Check whether an xattr is supported in userns
+ *
+ * @name:   full name of the extended attribute
+ * @prefix: do a prefix match (true) or a full match (false)
+ *
+ * This function returns < 0 if not supported, an index into userns_xattrs[]
+ * otherwise.
+ */
+static int
+xattr_is_userns_supported(const char *name, int prefix)
+{
+	int i;
+
+	if (!name)
+		return -1;
+
+	for (i = 0; userns_xattrs[i]; i++) {
+		if (prefix) {
+			if (!strncmp(userns_xattrs[i], name,
+				     strlen(userns_xattrs[i])))
+				return i;
+		} else {
+			if (!strcmp(userns_xattrs[i], name))
+				return i;
+		}
+	}
+	return -1;
+}
+
+/*
+ * xattr_write_uid - print a string in the format of "%s(a)uid=%u", which
+ *                   includes a prefix strig
+ *
+ * @uid:     the uid
+ * @prefix:  prefix string; may be NULL
+ *
+ * This function returns a buffer with the string, or a NULL pointer in
+ * case of out-of-memory error.
+ */
+static char *
+xattr_write_uid(uid_t uid, const char *prefix)
+{
+	size_t buflen;
+	char *buffer;
+
+	buflen = sizeof("@uid=") - 1 + sizeof("4294967295") - 1 + 1;
+	if (prefix)
+		buflen += strlen(prefix);
+
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer)
+		return NULL;
+
+	if (uid == 0)
+		*buffer = 0;
+	else
+		sprintf(buffer, "%s@uid=%u",
+			(prefix) ? prefix : "",
+			uid);
+
+	return buffer;
+}
+
+/*
+ * xattr_parse_uid_from_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @n_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_from_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t muid, p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	/* do we have a mapping of the uid? */
+	tuid = KUIDT_INIT(p_uid);
+	muid = from_kuid(userns, tuid);
+	if (muid == -1)
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(muid, NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_parse_uid_make_kuid - parse string in the format @uid=<uid>; consider
+ *                             user namespaces and check mappings
+ *
+ * @uidstr   : string in the format "@uid=<uid>"
+ * @userns   : the user namespace to consult for uid mappings
+ * @N_uidstr : returned pointer holding the rewritten @uid=<uid> string with
+ *             the uid remapped
+ *
+ * This function returns an error code or 0 in case of success. In case
+ * of success, 'n_uidstr' will hold a valid string.
+ */
+static int
+xattr_parse_uid_make_kuid(const char *uidstr, struct user_namespace *userns,
+			  char **n_uidstr)
+{
+	int n;
+	uid_t p_uid;
+	char d;
+	kuid_t tuid;
+
+	*n_uidstr = NULL;
+
+	n = sscanf(uidstr, "@uid=%u%c", &p_uid, &d);
+	if (n != 1)
+		return -EINVAL;
+
+	tuid = make_kuid(userns, p_uid);
+	if (!uid_valid(tuid))
+		return -ENOENT;
+
+	*n_uidstr = xattr_write_uid(__kuid_val(tuid), NULL);
+	if (!*n_uidstr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+/*
+ * xattr_rewrite_userns_xattr - Rewrite and filter an extended attribute
+ *                              considering user namespace uid mappings and
+ *                              user namespace support extended attributes
+ *
+ * @name: full name of the extended attribute
+ *
+ * This function returns NULL if the name is to be filtered. Otherwise it can
+ * return the input buffer or a new buffer that the caller needs to free. The
+ * new buffer contains a rewritten extended attribute whose string length may
+ * exceed that of the given name.
+ */
+static char *
+xattr_rewrite_userns_xattr(char *name)
+{
+	int idx, error;
+	size_t len = 0, buflen;
+	char *buffer, *n_uidstr;
+
+	/* prefix-match name against supported attributes */
+	idx = xattr_is_userns_supported(name, true);
+	if (idx < 0)
+		return NULL;
+
+	/* exact match ? */
+	len = strlen(userns_xattrs[idx]);
+	if (name[len] == 0)
+		return NULL;
+
+	/*
+	 * We must have a name[len] == '@'.
+	 */
+	error = xattr_parse_uid_from_kuid(&name[len], current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return NULL;
+
+	buflen = len + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	name[len] = 0;
+
+	snprintf(buffer, buflen, "%s%s", name, n_uidstr);
+
+	name[len] = '@';
+
+	kfree(n_uidstr);
+
+	return buffer;
+}
+
+/*
+ * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
+ *                             or determine needed size for attribute list
+ *                             in case size == 0
+ *
+ * In a user namespace we do not present all extended attributes to the
+ * user. We filter out those that are in the list of userns supported xattr.
+ * Besides that we filter out those with @uid=<uid> when there is no mapping
+ * for that uid in the current user namespace.
+ *
+ * @list:        list of 0-byte separated xattr names
+ * @size:        the size of the list; may be 0 to determine needed list size
+ * @list_maxlen: allocated buffer size of list
+ */
+static ssize_t
+xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
+{
+	char *nlist = NULL;
+	size_t s_off, len, nlen;
+	ssize_t d_off;
+	char *name, *newname;
+
+	if (!list || size < 0 || current_user_ns() == &init_user_ns)
+		return size;
+
+	if (size) {
+		nlist = kmalloc(list_maxlen, GFP_KERNEL);
+		if (!nlist)
+			return -ENOMEM;
+	}
+
+	s_off = d_off = 0;
+	while (s_off < size || size == 0) {
+		name = &list[s_off];
+
+		len = strlen(name);
+		if (!len)
+			break;
+
+		newname = xattr_rewrite_userns_xattr(name);
+		if (IS_ERR(newname)) {
+			d_off = PTR_ERR(newname);
+			goto out_free;
+		}
+		if (newname) {
+			nlen = strlen(newname);
+
+			if (nlist) {
+				if (nlen + 1 > list_maxlen)
+					break;
+				strcpy(&nlist[d_off], newname);
+			}
+
+			d_off += nlen + 1;
+			if (newname != name)
+				kfree(newname);
+		}
+		s_off += len + 1;
+	}
+	if (nlist)
+		memcpy(list, nlist, d_off);
+out_free:
+	kfree(nlist);
+
+	return d_off;
+}
+
+/*
+ * xattr_userns_name - modify the name of a user namespace supported
+ *                     extended attribute
+ *
+ * In a user namespace we prevent read/write accesses to the host's
+ * security.foo to protect these extended attributes.
+ *
+ * Reading:
+ * 1) Reading security.foo from a user namespace will read
+ *    security.foo(a)uid=<uid> of the parent user namespace instead with uid
+ *    being the mapping of root in that parent user namespace. An
+ *    exception is if root is mapped to uid 0 on the host, and in this case
+ *    we will read security.foo directly.
+ *    -> reading security.foo will read security.foo(a)uid=1000 for a uid
+ *       mapping of root to 1000.
+ *
+ * 2) All security.foo(a)uid=<uid> with valid uid mappings in the user namespace
+ *    an be read. The uid within the user namespace will be mapped to the
+ *    corresponding uid on the host and that uid will be used in the name of
+ *    the extended attribute.
+ *    -> reading security.foo(a)uid=1 will read security.foo(a)uid=1001 for a uid
+ *       mapping of root to 1000, size of at least 2.
+ *
+ *    All security.foo(a)uid=<uid> can be read (by root) on the host with values
+ *    of <uid> also being subject to checking for valid mappings.
+ *
+ * 3) No other security.foo* can be read.
+ *
+ * Writing and removing:
+ * The same rules for reading apply to writing and removing.
+ *
+ * This function returns a buffer with either the original name or the
+ * user namespace adjusted name of the extended attribute.
+ *
+ * @fullname: the full name of the extended attribute, e.g. security.foo
+ * @suffix:   the suffix of the extended attribute, e.g. foo
+ * @is_write: whether this is for writing an xattr
+ */
+char *
+xattr_userns_name(const char *fullname, const char *suffix)
+{
+	size_t buflen;
+	char *buffer, *ptr, *n_uidstr;
+	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	int idx, error;
+	size_t len = 0, slen;
+
+	if (!suffix)
+		return ERR_PTR(-EINVAL);
+
+	/* only security.foo will be changed here - prefix match here */
+	idx = xattr_is_userns_supported(fullname, true);
+	if (idx == -1)
+		goto out_copy;
+
+	/* read security.foo? --> read security.foo(a)uid=<uid> instead */
+	len = strlen(userns_xattrs[idx]);
+	if (fullname[len] == 0) {
+		/*
+		 * init_user_ns or userns with root mapped to uid 0
+		 * may read security.foo directly
+		 */
+		if (current_user_ns() == &init_user_ns ||
+		    __kuid_val(root_uid) == 0)
+			goto out_copy;
+
+		if (!uid_valid(root_uid))
+			return ERR_PTR(-EINVAL);
+
+		buffer = xattr_write_uid(__kuid_val(root_uid), suffix);
+		if (!buffer)
+			return ERR_PTR(-ENOMEM);
+
+		return buffer;
+	}
+
+	/*
+	 * We must have fullname[len] == '@'.
+	 */
+	error = xattr_parse_uid_make_kuid(&fullname[len],
+					  current_user_ns(),
+					  &n_uidstr);
+	if (error)
+		return ERR_PTR(error);
+
+	/* suffix of fullname must have '@' */
+	ptr = strchr(suffix, '@');
+	if (!ptr) {
+		kfree(n_uidstr);
+		goto err_eperm;
+	}
+	slen = ptr - suffix;
+
+	/* suffix[slen] = '@' */
+	buflen = strlen(suffix) + strlen(n_uidstr) + 1;
+	buffer = kmalloc(buflen, GFP_KERNEL);
+	if (!buffer) {
+		kfree(n_uidstr);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	snprintf(buffer, slen + 1, "%s", suffix);
+	snprintf(&buffer[slen], buflen - slen, "%s", n_uidstr);
+	kfree(n_uidstr);
+
+	return buffer;
+
+out_copy:
+	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	if (!buffer)
+		return ERR_PTR(-ENOMEM);
+	strcpy(buffer, suffix);
+
+	return buffer;
+
+err_eperm:
+	return ERR_PTR(-EPERM);
+}
+
 int
 __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       const void *value, size_t size, int flags)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
@@ -146,7 +540,12 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	return handler->set(handler, dentry, inode, name, value, size, flags);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_setxattr);
 
@@ -302,13 +701,21 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	       void *value, size_t size)
 {
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	return handler->get(handler, dentry, inode, name, value, size);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
 
@@ -328,8 +735,14 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 				XATTR_SECURITY_PREFIX_LEN)) {
+		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		int ret = xattr_getsecurity(inode, suffix, value, size);
+		char *nsuffix = xattr_userns_name(name, suffix);
+
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = xattr_getsecurity(inode, nsuffix, value, size);
+		kfree(nsuffix);
 		/*
 		 * Only overwrite the return value if a security module
 		 * is actually active.
@@ -360,6 +773,9 @@ vfs_listxattr(struct dentry *dentry, char *list, size_t size)
 		if (size && error > size)
 			error = -ERANGE;
 	}
+	if (error > 0)
+		error = xattr_list_userns_rewrite(list, error, size);
+
 	return error;
 }
 EXPORT_SYMBOL_GPL(vfs_listxattr);
@@ -369,13 +785,22 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 {
 	struct inode *inode = d_inode(dentry);
 	const struct xattr_handler *handler;
+	char *nsuffix;
+	const char *fullname = name;
+	int ret;
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	return handler->set(handler, dentry, inode, name, NULL, 0, XATTR_REPLACE);
+	nsuffix = xattr_userns_name(fullname, name);
+	if (IS_ERR(nsuffix))
+		return PTR_ERR(nsuffix);
+	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
+			   XATTR_REPLACE);
+	kfree(nsuffix);
+	return ret;
 }
 EXPORT_SYMBOL(__vfs_removexattr);
 
diff --git a/security/commoncap.c b/security/commoncap.c
index 7abebd7..c842690 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -660,15 +660,23 @@ int cap_bprm_secureexec(struct linux_binprm *bprm)
 int cap_inode_setxattr(struct dentry *dentry, const char *name,
 		       const void *value, size_t size, int flags)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
@@ -686,15 +694,23 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name,
  */
 int cap_inode_removexattr(struct dentry *dentry, const char *name)
 {
-	if (!strcmp(name, XATTR_NAME_CAPS)) {
-		if (!capable(CAP_SETFCAP))
+	if (strncmp(name, XATTR_SECURITY_PREFIX,
+		    sizeof(XATTR_SECURITY_PREFIX) - 1) != 0)
+		return 0;
+
+	if (strncmp(name, XATTR_NAME_CAPS,
+		    sizeof(XATTR_NAME_CAPS) - 1) == 0) {
+		struct inode *inode = d_backing_inode(dentry);
+
+		if (!inode)
+			return -EINVAL;
+		if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 			return -EPERM;
+
 		return 0;
 	}
 
-	if (!strncmp(name, XATTR_SECURITY_PREFIX,
-		     sizeof(XATTR_SECURITY_PREFIX) - 1) &&
-	    !capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 	return 0;
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 819fd68..702c225 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3091,8 +3091,13 @@ static int selinux_inode_setotherxattr(struct dentry *dentry, const char *name)
 
 	if (!strncmp(name, XATTR_SECURITY_PREFIX,
 		     sizeof XATTR_SECURITY_PREFIX - 1)) {
-		if (!strcmp(name, XATTR_NAME_CAPS)) {
-			if (!capable(CAP_SETFCAP))
+		if (!strncmp(name, XATTR_NAME_CAPS,
+			     sizeof(XATTR_NAME_CAPS) - 1)) {
+			struct inode *inode = d_backing_inode(dentry);
+
+			if (!inode)
+				return -EINVAL;
+			if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 				return -EPERM;
 		} else if (!capable(CAP_SYS_ADMIN)) {
 			/* A different attribute in the security namespace.
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 2/3] Enable capabilities of files from shared filesystem
  2017-06-22 18:59 ` Stefan Berger
  (?)
  (?)
@ 2017-06-22 18:59     ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

The previous patch changed existing behavior in so far as the
capabilities of files from a shared filesystem or bind-mounted files
were hidden from a user namespace. This patch makes these capabilties
visible to the user namespace again, unless the user namespace has set
its own capabilities, which will hide them until those set by the
user namespace are removed.

Also the listing of xattrs is adjusted. To avoid double listing of
names of extended attributes, which can happen if the container and the
host for example have security.capability, we now check that we do not
add the same extended attribute to the list twice.

Signed-off-by: Stefan Berger <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Serge Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Serge Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
---
 fs/xattr.c | 90 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 64c4b40..045be85 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -339,6 +339,26 @@ xattr_rewrite_userns_xattr(char *name)
 }
 
 /*
+ * xattr_list_contains - check whether an xattr list already contains a needle
+ *
+ * @list    : 0-byte separated strings
+ * @listlen : length of the list
+ * @needle  : the needle to search for
+ */
+static int
+xattr_list_contains(const char *list, size_t listlen, const char *needle)
+{
+	size_t o = 0;
+
+	while (o < listlen) {
+		if (!strcmp(&list[o], needle))
+			return true;
+		o += strlen(&list[o]) + 1;
+	}
+	return false;
+}
+
+/*
  * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
  *                             or determine needed size for attribute list
  *                             in case size == 0
@@ -377,12 +397,16 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
 		if (!len)
 			break;
 
-		newname = xattr_rewrite_userns_xattr(name);
-		if (IS_ERR(newname)) {
-			d_off = PTR_ERR(newname);
-			goto out_free;
+		if (xattr_is_userns_supported(name, false) >= 0)
+			newname = name;
+		else {
+			newname = xattr_rewrite_userns_xattr(name);
+			if (IS_ERR(newname)) {
+				d_off = PTR_ERR(newname);
+				goto out_free;
+			}
 		}
-		if (newname) {
+		if (newname && !xattr_list_contains(nlist, d_off, newname)) {
 			nlen = strlen(newname);
 
 			if (nlist) {
@@ -413,13 +437,19 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * security.foo to protect these extended attributes.
  *
  * Reading:
- * 1) Reading security.foo from a user namespace will read
- *    security.foo@uid=<uid> of the parent user namespace instead with uid
- *    being the mapping of root in that parent user namespace. An
- *    exception is if root is mapped to uid 0 on the host, and in this case
- *    we will read security.foo directly.
- *    -> reading security.foo will read security.foo@uid=1000 for a uid
- *       mapping of root to 1000.
+ * 1a) Reading security.foo from a user namespace will read
+ *     security.foo@uid=<uid> of the parent user namespace instead with uid
+ *     being the mapping of root in that parent user namespace. An
+ *     exception is if root is mapped to uid 0 on the host, and in this case
+ *     we will read security.foo directly.
+ *     -> reading security.foo will read security.foo@uid=1000 for a uid
+ *        mapping of root to 1000.
+ *
+ * 1b) If security.foo@uid=<uid> is not available, the security.foo of the
+ *     parent namespace is tried to be read. This procedure is repeated up to
+ *     the init user namespace. This step only applies for reading of extended
+ *     attributes and provides the same behavior as older systems where the
+ *     host's extended attributes applied to user namespaces.
  *
  * 2) All security.foo@uid=<uid> with valid uid mappings in the user namespace
  *    an be read. The uid within the user namespace will be mapped to the
@@ -434,7 +464,7 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * 3) No other security.foo* can be read.
  *
  * Writing and removing:
- * The same rules for reading apply to writing and removing.
+ * The same rules for reading apply to writing and removing, except for 1b).
  *
  * This function returns a buffer with either the original name or the
  * user namespace adjusted name of the extended attribute.
@@ -444,11 +474,12 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * @is_write: whether this is for writing an xattr
  */
 char *
-xattr_userns_name(const char *fullname, const char *suffix)
+xattr_userns_name(const char *fullname, const char *suffix,
+		  struct user_namespace *userns)
 {
 	size_t buflen;
 	char *buffer, *ptr, *n_uidstr;
-	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	kuid_t root_uid = make_kuid(userns, 0);
 	int idx, error;
 	size_t len = 0, slen;
 
@@ -467,7 +498,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 		 * init_user_ns or userns with root mapped to uid 0
 		 * may read security.foo directly
 		 */
-		if (current_user_ns() == &init_user_ns ||
+		if (userns == &init_user_ns ||
 		    __kuid_val(root_uid) == 0)
 			goto out_copy;
 
@@ -485,7 +516,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 	 * We must have fullname[len] == '@'.
 	 */
 	error = xattr_parse_uid_make_kuid(&fullname[len],
-					  current_user_ns(),
+					  userns,
 					  &n_uidstr);
 	if (error)
 		return ERR_PTR(error);
@@ -540,7 +571,7 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
@@ -703,18 +734,24 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	const struct xattr_handler *handler;
 	char *nsuffix;
 	const char *fullname = name;
-	int ret;
+	int ret, userns_supt_xattr;
+	struct user_namespace *userns = current_user_ns();
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
-	if (IS_ERR(nsuffix))
-		return PTR_ERR(nsuffix);
-	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
-	kfree(nsuffix);
+	userns_supt_xattr = (xattr_is_userns_supported(fullname, false) >= 0);
+	do {
+		nsuffix = xattr_userns_name(fullname, name, userns);
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = handler->get(handler, dentry, inode, nsuffix, value,
+				   size);
+		kfree(nsuffix);
+		userns = userns->parent;
+	} while ((ret == -ENODATA) && userns && userns_supt_xattr);
 	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
@@ -737,7 +774,8 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 				XATTR_SECURITY_PREFIX_LEN)) {
 		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		char *nsuffix = xattr_userns_name(name, suffix);
+		char *nsuffix = xattr_userns_name(name, suffix,
+						  current_user_ns());
 
 		if (IS_ERR(nsuffix))
 			return PTR_ERR(nsuffix);
@@ -794,7 +832,7 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 2/3] Enable capabilities of files from shared filesystem
@ 2017-06-22 18:59     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, stefanb, vgoyal, amir73il,
	linux-security-module

The previous patch changed existing behavior in so far as the
capabilities of files from a shared filesystem or bind-mounted files
were hidden from a user namespace. This patch makes these capabilties
visible to the user namespace again, unless the user namespace has set
its own capabilities, which will hide them until those set by the
user namespace are removed.

Also the listing of xattrs is adjusted. To avoid double listing of
names of extended attributes, which can happen if the container and the
host for example have security.capability, we now check that we do not
add the same extended attribute to the list twice.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c | 90 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 64c4b40..045be85 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -339,6 +339,26 @@ xattr_rewrite_userns_xattr(char *name)
 }
 
 /*
+ * xattr_list_contains - check whether an xattr list already contains a needle
+ *
+ * @list    : 0-byte separated strings
+ * @listlen : length of the list
+ * @needle  : the needle to search for
+ */
+static int
+xattr_list_contains(const char *list, size_t listlen, const char *needle)
+{
+	size_t o = 0;
+
+	while (o < listlen) {
+		if (!strcmp(&list[o], needle))
+			return true;
+		o += strlen(&list[o]) + 1;
+	}
+	return false;
+}
+
+/*
  * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
  *                             or determine needed size for attribute list
  *                             in case size == 0
@@ -377,12 +397,16 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
 		if (!len)
 			break;
 
-		newname = xattr_rewrite_userns_xattr(name);
-		if (IS_ERR(newname)) {
-			d_off = PTR_ERR(newname);
-			goto out_free;
+		if (xattr_is_userns_supported(name, false) >= 0)
+			newname = name;
+		else {
+			newname = xattr_rewrite_userns_xattr(name);
+			if (IS_ERR(newname)) {
+				d_off = PTR_ERR(newname);
+				goto out_free;
+			}
 		}
-		if (newname) {
+		if (newname && !xattr_list_contains(nlist, d_off, newname)) {
 			nlen = strlen(newname);
 
 			if (nlist) {
@@ -413,13 +437,19 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * security.foo to protect these extended attributes.
  *
  * Reading:
- * 1) Reading security.foo from a user namespace will read
- *    security.foo@uid=<uid> of the parent user namespace instead with uid
- *    being the mapping of root in that parent user namespace. An
- *    exception is if root is mapped to uid 0 on the host, and in this case
- *    we will read security.foo directly.
- *    -> reading security.foo will read security.foo@uid=1000 for a uid
- *       mapping of root to 1000.
+ * 1a) Reading security.foo from a user namespace will read
+ *     security.foo@uid=<uid> of the parent user namespace instead with uid
+ *     being the mapping of root in that parent user namespace. An
+ *     exception is if root is mapped to uid 0 on the host, and in this case
+ *     we will read security.foo directly.
+ *     -> reading security.foo will read security.foo@uid=1000 for a uid
+ *        mapping of root to 1000.
+ *
+ * 1b) If security.foo@uid=<uid> is not available, the security.foo of the
+ *     parent namespace is tried to be read. This procedure is repeated up to
+ *     the init user namespace. This step only applies for reading of extended
+ *     attributes and provides the same behavior as older systems where the
+ *     host's extended attributes applied to user namespaces.
  *
  * 2) All security.foo@uid=<uid> with valid uid mappings in the user namespace
  *    an be read. The uid within the user namespace will be mapped to the
@@ -434,7 +464,7 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * 3) No other security.foo* can be read.
  *
  * Writing and removing:
- * The same rules for reading apply to writing and removing.
+ * The same rules for reading apply to writing and removing, except for 1b).
  *
  * This function returns a buffer with either the original name or the
  * user namespace adjusted name of the extended attribute.
@@ -444,11 +474,12 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * @is_write: whether this is for writing an xattr
  */
 char *
-xattr_userns_name(const char *fullname, const char *suffix)
+xattr_userns_name(const char *fullname, const char *suffix,
+		  struct user_namespace *userns)
 {
 	size_t buflen;
 	char *buffer, *ptr, *n_uidstr;
-	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	kuid_t root_uid = make_kuid(userns, 0);
 	int idx, error;
 	size_t len = 0, slen;
 
@@ -467,7 +498,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 		 * init_user_ns or userns with root mapped to uid 0
 		 * may read security.foo directly
 		 */
-		if (current_user_ns() == &init_user_ns ||
+		if (userns == &init_user_ns ||
 		    __kuid_val(root_uid) == 0)
 			goto out_copy;
 
@@ -485,7 +516,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 	 * We must have fullname[len] == '@'.
 	 */
 	error = xattr_parse_uid_make_kuid(&fullname[len],
-					  current_user_ns(),
+					  userns,
 					  &n_uidstr);
 	if (error)
 		return ERR_PTR(error);
@@ -540,7 +571,7 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
@@ -703,18 +734,24 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	const struct xattr_handler *handler;
 	char *nsuffix;
 	const char *fullname = name;
-	int ret;
+	int ret, userns_supt_xattr;
+	struct user_namespace *userns = current_user_ns();
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
-	if (IS_ERR(nsuffix))
-		return PTR_ERR(nsuffix);
-	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
-	kfree(nsuffix);
+	userns_supt_xattr = (xattr_is_userns_supported(fullname, false) >= 0);
+	do {
+		nsuffix = xattr_userns_name(fullname, name, userns);
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = handler->get(handler, dentry, inode, nsuffix, value,
+				   size);
+		kfree(nsuffix);
+		userns = userns->parent;
+	} while ((ret == -ENODATA) && userns && userns_supt_xattr);
 	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
@@ -737,7 +774,8 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 				XATTR_SECURITY_PREFIX_LEN)) {
 		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		char *nsuffix = xattr_userns_name(name, suffix);
+		char *nsuffix = xattr_userns_name(name, suffix,
+						  current_user_ns());
 
 		if (IS_ERR(nsuffix))
 			return PTR_ERR(nsuffix);
@@ -794,7 +832,7 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 2/3] Enable capabilities of files from shared filesystem
@ 2017-06-22 18:59     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: linux-security-module

The previous patch changed existing behavior in so far as the
capabilities of files from a shared filesystem or bind-mounted files
were hidden from a user namespace. This patch makes these capabilties
visible to the user namespace again, unless the user namespace has set
its own capabilities, which will hide them until those set by the
user namespace are removed.

Also the listing of xattrs is adjusted. To avoid double listing of
names of extended attributes, which can happen if the container and the
host for example have security.capability, we now check that we do not
add the same extended attribute to the list twice.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c | 90 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 64c4b40..045be85 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -339,6 +339,26 @@ xattr_rewrite_userns_xattr(char *name)
 }
 
 /*
+ * xattr_list_contains - check whether an xattr list already contains a needle
+ *
+ * @list    : 0-byte separated strings
+ * @listlen : length of the list
+ * @needle  : the needle to search for
+ */
+static int
+xattr_list_contains(const char *list, size_t listlen, const char *needle)
+{
+	size_t o = 0;
+
+	while (o < listlen) {
+		if (!strcmp(&list[o], needle))
+			return true;
+		o += strlen(&list[o]) + 1;
+	}
+	return false;
+}
+
+/*
  * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
  *                             or determine needed size for attribute list
  *                             in case size == 0
@@ -377,12 +397,16 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
 		if (!len)
 			break;
 
-		newname = xattr_rewrite_userns_xattr(name);
-		if (IS_ERR(newname)) {
-			d_off = PTR_ERR(newname);
-			goto out_free;
+		if (xattr_is_userns_supported(name, false) >= 0)
+			newname = name;
+		else {
+			newname = xattr_rewrite_userns_xattr(name);
+			if (IS_ERR(newname)) {
+				d_off = PTR_ERR(newname);
+				goto out_free;
+			}
 		}
-		if (newname) {
+		if (newname && !xattr_list_contains(nlist, d_off, newname)) {
 			nlen = strlen(newname);
 
 			if (nlist) {
@@ -413,13 +437,19 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * security.foo to protect these extended attributes.
  *
  * Reading:
- * 1) Reading security.foo from a user namespace will read
- *    security.foo at uid=<uid> of the parent user namespace instead with uid
- *    being the mapping of root in that parent user namespace. An
- *    exception is if root is mapped to uid 0 on the host, and in this case
- *    we will read security.foo directly.
- *    -> reading security.foo will read security.foo at uid=1000 for a uid
- *       mapping of root to 1000.
+ * 1a) Reading security.foo from a user namespace will read
+ *     security.foo at uid=<uid> of the parent user namespace instead with uid
+ *     being the mapping of root in that parent user namespace. An
+ *     exception is if root is mapped to uid 0 on the host, and in this case
+ *     we will read security.foo directly.
+ *     -> reading security.foo will read security.foo at uid=1000 for a uid
+ *        mapping of root to 1000.
+ *
+ * 1b) If security.foo at uid=<uid> is not available, the security.foo of the
+ *     parent namespace is tried to be read. This procedure is repeated up to
+ *     the init user namespace. This step only applies for reading of extended
+ *     attributes and provides the same behavior as older systems where the
+ *     host's extended attributes applied to user namespaces.
  *
  * 2) All security.foo at uid=<uid> with valid uid mappings in the user namespace
  *    an be read. The uid within the user namespace will be mapped to the
@@ -434,7 +464,7 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * 3) No other security.foo* can be read.
  *
  * Writing and removing:
- * The same rules for reading apply to writing and removing.
+ * The same rules for reading apply to writing and removing, except for 1b).
  *
  * This function returns a buffer with either the original name or the
  * user namespace adjusted name of the extended attribute.
@@ -444,11 +474,12 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * @is_write: whether this is for writing an xattr
  */
 char *
-xattr_userns_name(const char *fullname, const char *suffix)
+xattr_userns_name(const char *fullname, const char *suffix,
+		  struct user_namespace *userns)
 {
 	size_t buflen;
 	char *buffer, *ptr, *n_uidstr;
-	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	kuid_t root_uid = make_kuid(userns, 0);
 	int idx, error;
 	size_t len = 0, slen;
 
@@ -467,7 +498,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 		 * init_user_ns or userns with root mapped to uid 0
 		 * may read security.foo directly
 		 */
-		if (current_user_ns() == &init_user_ns ||
+		if (userns == &init_user_ns ||
 		    __kuid_val(root_uid) == 0)
 			goto out_copy;
 
@@ -485,7 +516,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 	 * We must have fullname[len] == '@'.
 	 */
 	error = xattr_parse_uid_make_kuid(&fullname[len],
-					  current_user_ns(),
+					  userns,
 					  &n_uidstr);
 	if (error)
 		return ERR_PTR(error);
@@ -540,7 +571,7 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
@@ -703,18 +734,24 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	const struct xattr_handler *handler;
 	char *nsuffix;
 	const char *fullname = name;
-	int ret;
+	int ret, userns_supt_xattr;
+	struct user_namespace *userns = current_user_ns();
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
-	if (IS_ERR(nsuffix))
-		return PTR_ERR(nsuffix);
-	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
-	kfree(nsuffix);
+	userns_supt_xattr = (xattr_is_userns_supported(fullname, false) >= 0);
+	do {
+		nsuffix = xattr_userns_name(fullname, name, userns);
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = handler->get(handler, dentry, inode, nsuffix, value,
+				   size);
+		kfree(nsuffix);
+		userns = userns->parent;
+	} while ((ret == -ENODATA) && userns && userns_supt_xattr);
 	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
@@ -737,7 +774,8 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 				XATTR_SECURITY_PREFIX_LEN)) {
 		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		char *nsuffix = xattr_userns_name(name, suffix);
+		char *nsuffix = xattr_userns_name(name, suffix,
+						  current_user_ns());
 
 		if (IS_ERR(nsuffix))
 			return PTR_ERR(nsuffix);
@@ -794,7 +832,7 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 2/3] Enable capabilities of files from shared filesystem
@ 2017-06-22 18:59     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 7892 bytes --]

The previous patch changed existing behavior in so far as the
capabilities of files from a shared filesystem or bind-mounted files
were hidden from a user namespace. This patch makes these capabilties
visible to the user namespace again, unless the user namespace has set
its own capabilities, which will hide them until those set by the
user namespace are removed.

Also the listing of xattrs is adjusted. To avoid double listing of
names of extended attributes, which can happen if the container and the
host for example have security.capability, we now check that we do not
add the same extended attribute to the list twice.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
---
 fs/xattr.c | 90 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 64 insertions(+), 26 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 64c4b40..045be85 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -339,6 +339,26 @@ xattr_rewrite_userns_xattr(char *name)
 }
 
 /*
+ * xattr_list_contains - check whether an xattr list already contains a needle
+ *
+ * @list    : 0-byte separated strings
+ * @listlen : length of the list
+ * @needle  : the needle to search for
+ */
+static int
+xattr_list_contains(const char *list, size_t listlen, const char *needle)
+{
+	size_t o = 0;
+
+	while (o < listlen) {
+		if (!strcmp(&list[o], needle))
+			return true;
+		o += strlen(&list[o]) + 1;
+	}
+	return false;
+}
+
+/*
  * xattr_list_userns_rewrite - Rewrite list of xattr names for user namespaces
  *                             or determine needed size for attribute list
  *                             in case size == 0
@@ -377,12 +397,16 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
 		if (!len)
 			break;
 
-		newname = xattr_rewrite_userns_xattr(name);
-		if (IS_ERR(newname)) {
-			d_off = PTR_ERR(newname);
-			goto out_free;
+		if (xattr_is_userns_supported(name, false) >= 0)
+			newname = name;
+		else {
+			newname = xattr_rewrite_userns_xattr(name);
+			if (IS_ERR(newname)) {
+				d_off = PTR_ERR(newname);
+				goto out_free;
+			}
 		}
-		if (newname) {
+		if (newname && !xattr_list_contains(nlist, d_off, newname)) {
 			nlen = strlen(newname);
 
 			if (nlist) {
@@ -413,13 +437,19 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * security.foo to protect these extended attributes.
  *
  * Reading:
- * 1) Reading security.foo from a user namespace will read
- *    security.foo(a)uid=<uid> of the parent user namespace instead with uid
- *    being the mapping of root in that parent user namespace. An
- *    exception is if root is mapped to uid 0 on the host, and in this case
- *    we will read security.foo directly.
- *    -> reading security.foo will read security.foo(a)uid=1000 for a uid
- *       mapping of root to 1000.
+ * 1a) Reading security.foo from a user namespace will read
+ *     security.foo(a)uid=<uid> of the parent user namespace instead with uid
+ *     being the mapping of root in that parent user namespace. An
+ *     exception is if root is mapped to uid 0 on the host, and in this case
+ *     we will read security.foo directly.
+ *     -> reading security.foo will read security.foo(a)uid=1000 for a uid
+ *        mapping of root to 1000.
+ *
+ * 1b) If security.foo(a)uid=<uid> is not available, the security.foo of the
+ *     parent namespace is tried to be read. This procedure is repeated up to
+ *     the init user namespace. This step only applies for reading of extended
+ *     attributes and provides the same behavior as older systems where the
+ *     host's extended attributes applied to user namespaces.
  *
  * 2) All security.foo(a)uid=<uid> with valid uid mappings in the user namespace
  *    an be read. The uid within the user namespace will be mapped to the
@@ -434,7 +464,7 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * 3) No other security.foo* can be read.
  *
  * Writing and removing:
- * The same rules for reading apply to writing and removing.
+ * The same rules for reading apply to writing and removing, except for 1b).
  *
  * This function returns a buffer with either the original name or the
  * user namespace adjusted name of the extended attribute.
@@ -444,11 +474,12 @@ xattr_list_userns_rewrite(char *list, ssize_t size, size_t list_maxlen)
  * @is_write: whether this is for writing an xattr
  */
 char *
-xattr_userns_name(const char *fullname, const char *suffix)
+xattr_userns_name(const char *fullname, const char *suffix,
+		  struct user_namespace *userns)
 {
 	size_t buflen;
 	char *buffer, *ptr, *n_uidstr;
-	kuid_t root_uid = make_kuid(current_user_ns(), 0);
+	kuid_t root_uid = make_kuid(userns, 0);
 	int idx, error;
 	size_t len = 0, slen;
 
@@ -467,7 +498,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 		 * init_user_ns or userns with root mapped to uid 0
 		 * may read security.foo directly
 		 */
-		if (current_user_ns() == &init_user_ns ||
+		if (userns == &init_user_ns ||
 		    __kuid_val(root_uid) == 0)
 			goto out_copy;
 
@@ -485,7 +516,7 @@ xattr_userns_name(const char *fullname, const char *suffix)
 	 * We must have fullname[len] == '@'.
 	 */
 	error = xattr_parse_uid_make_kuid(&fullname[len],
-					  current_user_ns(),
+					  userns,
 					  &n_uidstr);
 	if (error)
 		return ERR_PTR(error);
@@ -540,7 +571,7 @@ __vfs_setxattr(struct dentry *dentry, struct inode *inode, const char *name,
 		return -EOPNOTSUPP;
 	if (size == 0)
 		value = "";  /* empty EA, do not remove */
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, value, size, flags);
@@ -703,18 +734,24 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	const struct xattr_handler *handler;
 	char *nsuffix;
 	const char *fullname = name;
-	int ret;
+	int ret, userns_supt_xattr;
+	struct user_namespace *userns = current_user_ns();
 
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
 	if (!handler->get)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
-	if (IS_ERR(nsuffix))
-		return PTR_ERR(nsuffix);
-	ret = handler->get(handler, dentry, inode, nsuffix, value, size);
-	kfree(nsuffix);
+	userns_supt_xattr = (xattr_is_userns_supported(fullname, false) >= 0);
+	do {
+		nsuffix = xattr_userns_name(fullname, name, userns);
+		if (IS_ERR(nsuffix))
+			return PTR_ERR(nsuffix);
+		ret = handler->get(handler, dentry, inode, nsuffix, value,
+				   size);
+		kfree(nsuffix);
+		userns = userns->parent;
+	} while ((ret == -ENODATA) && userns && userns_supt_xattr);
 	return ret;
 }
 EXPORT_SYMBOL(__vfs_getxattr);
@@ -737,7 +774,8 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 				XATTR_SECURITY_PREFIX_LEN)) {
 		int ret;
 		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		char *nsuffix = xattr_userns_name(name, suffix);
+		char *nsuffix = xattr_userns_name(name, suffix,
+						  current_user_ns());
 
 		if (IS_ERR(nsuffix))
 			return PTR_ERR(nsuffix);
@@ -794,7 +832,7 @@ __vfs_removexattr(struct dentry *dentry, const char *name)
 		return PTR_ERR(handler);
 	if (!handler->set)
 		return -EOPNOTSUPP;
-	nsuffix = xattr_userns_name(fullname, name);
+	nsuffix = xattr_userns_name(fullname, name, current_user_ns());
 	if (IS_ERR(nsuffix))
 		return PTR_ERR(nsuffix);
 	ret = handler->set(handler, dentry, inode, nsuffix, NULL, 0,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
       [not found] ` <1498157989-11814-1-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2017-06-22 18:59   ` [PATCH 1/3] xattr: Enable security.capability in user namespaces Stefan Berger
  2017-06-22 18:59     ` Stefan Berger
@ 2017-06-22 18:59   ` Stefan Berger
  2017-06-22 19:59   ` [PATCH 0/3] Enable namespaced file capabilities Casey Schaufler
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

Before the current modifications, SELinux extended attributes were
visible inside the user namespace but changes in patch 1 hid them.
This patch enables security.selinux in user namespaces and allows
them to be written to in the same way as security.capability.

Signed-off-by: Stefan Berger <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 fs/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xattr.c b/fs/xattr.c
index 045be85..37686ee 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char *name, int mask)
  */
 static const char *const userns_xattrs[] = {
 	XATTR_NAME_CAPS,
+	XATTR_NAME_SELINUX,
 	NULL
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
  2017-06-22 18:59 ` Stefan Berger
  (?)
@ 2017-06-22 18:59   ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, stefanb, vgoyal, amir73il,
	linux-security-module

Before the current modifications, SELinux extended attributes were
visible inside the user namespace but changes in patch 1 hid them.
This patch enables security.selinux in user namespaces and allows
them to be written to in the same way as security.capability.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
---
 fs/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xattr.c b/fs/xattr.c
index 045be85..37686ee 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char *name, int mask)
  */
 static const char *const userns_xattrs[] = {
 	XATTR_NAME_CAPS,
+	XATTR_NAME_SELINUX,
 	NULL
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-22 18:59   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: linux-security-module

Before the current modifications, SELinux extended attributes were
visible inside the user namespace but changes in patch 1 hid them.
This patch enables security.selinux in user namespaces and allows
them to be written to in the same way as security.capability.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
---
 fs/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xattr.c b/fs/xattr.c
index 045be85..37686ee 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char *name, int mask)
  */
 static const char *const userns_xattrs[] = {
 	XATTR_NAME_CAPS,
+	XATTR_NAME_SELINUX,
 	NULL
 };
 
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-22 18:59   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

Before the current modifications, SELinux extended attributes were
visible inside the user namespace but changes in patch 1 hid them.
This patch enables security.selinux in user namespaces and allows
them to be written to in the same way as security.capability.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
---
 fs/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xattr.c b/fs/xattr.c
index 045be85..37686ee 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char *name, int mask)
  */
 static const char *const userns_xattrs[] = {
 	XATTR_NAME_CAPS,
+	XATTR_NAME_SELINUX,
 	NULL
 };
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found] ` <1498157989-11814-1-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-06-22 18:59   ` [PATCH 3/3] Enable security.selinux in user namespaces Stefan Berger
@ 2017-06-22 19:59   ` Casey Schaufler
  2017-06-22 23:29   ` James Bottomley
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 19:59 UTC (permalink / raw)
  To: Stefan Berger, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 6/22/2017 11:59 AM, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.

You need to identify the instance of the user namespace for
this to work right on a system with multiple user namespaces.
If I have a shared filesystem mounted in two different user
namespaces a change by one will affect the other.

... unless I'm missing something obvious about namespace behavior.

> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.
>
> To maintain compatibility with existing behavior, the value of
> security.capability of the host is shown inside the user namespace
> once the security.capability of the user namespace has been removed
> (which really removes security.capability@uid=1000). Writing to
> an extended attribute inside a user namespace effectively hides the
> extended attribute of the host.
>
> The general framework that is established with these patches can
> be applied to other extended attributes as well, such as security.ima
> or the 'trusted.' prefix . Another extended attribute that needed to
> be enabled here is 'security.selinux,' since otherwise this extended
> attribute would not be shown anymore inside a user namespace.
>
> Regards,
>    Stefan & Serge
>
>
> Stefan Berger (3):
>   xattr: Enable security.capability in user namespaces
>   Enable capabilities of files from shared filesystem
>   Enable security.selinux in user namespaces
>
>  fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>  security/commoncap.c     |  36 +++-
>  security/selinux/hooks.c |   9 +-
>  3 files changed, 501 insertions(+), 16 deletions(-)
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 18:59 ` Stefan Berger
  (?)
@ 2017-06-22 19:59   ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 19:59 UTC (permalink / raw)
  To: Stefan Berger, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module

On 6/22/2017 11:59 AM, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.

You need to identify the instance of the user namespace for
this to work right on a system with multiple user namespaces.
If I have a shared filesystem mounted in two different user
namespaces a change by one will affect the other.

... unless I'm missing something obvious about namespace behavior.

> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.
>
> To maintain compatibility with existing behavior, the value of
> security.capability of the host is shown inside the user namespace
> once the security.capability of the user namespace has been removed
> (which really removes security.capability@uid=1000). Writing to
> an extended attribute inside a user namespace effectively hides the
> extended attribute of the host.
>
> The general framework that is established with these patches can
> be applied to other extended attributes as well, such as security.ima
> or the 'trusted.' prefix . Another extended attribute that needed to
> be enabled here is 'security.selinux,' since otherwise this extended
> attribute would not be shown anymore inside a user namespace.
>
> Regards,
>    Stefan & Serge
>
>
> Stefan Berger (3):
>   xattr: Enable security.capability in user namespaces
>   Enable capabilities of files from shared filesystem
>   Enable security.selinux in user namespaces
>
>  fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>  security/commoncap.c     |  36 +++-
>  security/selinux/hooks.c |   9 +-
>  3 files changed, 501 insertions(+), 16 deletions(-)
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 19:59   ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 19:59 UTC (permalink / raw)
  To: linux-security-module

On 6/22/2017 11:59 AM, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability at uid=1000 for root mapped to uid 1000 on the host.

You need to identify the instance of the user namespace for
this to work right on a system with multiple user namespaces.
If I have a shared filesystem mounted in two different user
namespaces a change by one will affect the other.

... unless I'm missing something obvious about namespace behavior.

> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability at uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability at uid=1000, is visible.
>
> To maintain compatibility with existing behavior, the value of
> security.capability of the host is shown inside the user namespace
> once the security.capability of the user namespace has been removed
> (which really removes security.capability at uid=1000). Writing to
> an extended attribute inside a user namespace effectively hides the
> extended attribute of the host.
>
> The general framework that is established with these patches can
> be applied to other extended attributes as well, such as security.ima
> or the 'trusted.' prefix . Another extended attribute that needed to
> be enabled here is 'security.selinux,' since otherwise this extended
> attribute would not be shown anymore inside a user namespace.
>
> Regards,
>    Stefan & Serge
>
>
> Stefan Berger (3):
>   xattr: Enable security.capability in user namespaces
>   Enable capabilities of files from shared filesystem
>   Enable security.selinux in user namespaces
>
>  fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>  security/commoncap.c     |  36 +++-
>  security/selinux/hooks.c |   9 +-
>  3 files changed, 501 insertions(+), 16 deletions(-)
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 19:59   ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 19:59 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2490 bytes --]

On 6/22/2017 11:59 AM, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.

You need to identify the instance of the user namespace for
this to work right on a system with multiple user namespaces.
If I have a shared filesystem mounted in two different user
namespaces a change by one will affect the other.

... unless I'm missing something obvious about namespace behavior.

> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability(a)uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability(a)uid=1000, is visible.
>
> To maintain compatibility with existing behavior, the value of
> security.capability of the host is shown inside the user namespace
> once the security.capability of the user namespace has been removed
> (which really removes security.capability(a)uid=1000). Writing to
> an extended attribute inside a user namespace effectively hides the
> extended attribute of the host.
>
> The general framework that is established with these patches can
> be applied to other extended attributes as well, such as security.ima
> or the 'trusted.' prefix . Another extended attribute that needed to
> be enabled here is 'security.selinux,' since otherwise this extended
> attribute would not be shown anymore inside a user namespace.
>
> Regards,
>    Stefan & Serge
>
>
> Stefan Berger (3):
>   xattr: Enable security.capability in user namespaces
>   Enable capabilities of files from shared filesystem
>   Enable security.selinux in user namespaces
>
>  fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>  security/commoncap.c     |  36 +++-
>  security/selinux/hooks.c |   9 +-
>  3 files changed, 501 insertions(+), 16 deletions(-)
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]   ` <70a09f1b-e82c-a25c-9325-d5d757b1b695-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
@ 2017-06-22 20:12     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 20:12 UTC (permalink / raw)
  To: Casey Schaufler, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>> This series of patches primary goal is to enable file capabilities
>> in user namespaces without affecting the file capabilities that are
>> effective on the host. This is to prevent that any unprivileged user
>> on the host maps his own uid to root in a private namespace, writes
>> the xattr, and executes the file with privilege on the host.
>>
>> We achieve this goal by writing extended attributes with a different
>> name when a user namespace is used. If for example the root user
>> in a user namespace writes the security.capability xattr, the name
>> of the xattr that is actually written is encoded as
>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> You need to identify the instance of the user namespace for
> this to work right on a system with multiple user namespaces.
> If I have a shared filesystem mounted in two different user
> namespaces a change by one will affect the other.

Two different user namespaces with different uid mappings will not 
affect each other.

If root in userns1 mapped to uid 1000 (size 1000) writes 
security.capability, it will write security.capability@uid=1000 into the fs.
If root in userns2 mapped to uid 2000 (size 1000) writes 
security.capability, it will write security.capability@uid=2000 into the fs.

Neither of the two will see each other's security.capability, but each 
will see their own 'security.capability'.

Assume now userns1 has a size of 2000, so overlapping with userns2, it 
will now see userns2's security.capability@uid=1000 as well as its own 
'security.capability'. security.capability@uid=1000 (of userns2) in 
userns1 will not have an effect on effective file capabilities.

> ... unless I'm missing something obvious about namespace behavior.
>
>> When listing the xattrs on the host, the existing security.capability
>> as well as the security.capability@uid=1000 will be shown. Inside the
>> namespace only 'security.capability', with the value of
>> security.capability@uid=1000, is visible.
>>
>> To maintain compatibility with existing behavior, the value of
>> security.capability of the host is shown inside the user namespace
>> once the security.capability of the user namespace has been removed
>> (which really removes security.capability@uid=1000). Writing to
>> an extended attribute inside a user namespace effectively hides the
>> extended attribute of the host.
>>
>> The general framework that is established with these patches can
>> be applied to other extended attributes as well, such as security.ima
>> or the 'trusted.' prefix . Another extended attribute that needed to
>> be enabled here is 'security.selinux,' since otherwise this extended
>> attribute would not be shown anymore inside a user namespace.
>>
>> Regards,
>>     Stefan & Serge
>>
>>
>> Stefan Berger (3):
>>    xattr: Enable security.capability in user namespaces
>>    Enable capabilities of files from shared filesystem
>>    Enable security.selinux in user namespaces
>>
>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   security/commoncap.c     |  36 +++-
>>   security/selinux/hooks.c |   9 +-
>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 19:59   ` Casey Schaufler
  (?)
@ 2017-06-22 20:12     ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 20:12 UTC (permalink / raw)
  To: Casey Schaufler, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module

On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>> This series of patches primary goal is to enable file capabilities
>> in user namespaces without affecting the file capabilities that are
>> effective on the host. This is to prevent that any unprivileged user
>> on the host maps his own uid to root in a private namespace, writes
>> the xattr, and executes the file with privilege on the host.
>>
>> We achieve this goal by writing extended attributes with a different
>> name when a user namespace is used. If for example the root user
>> in a user namespace writes the security.capability xattr, the name
>> of the xattr that is actually written is encoded as
>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> You need to identify the instance of the user namespace for
> this to work right on a system with multiple user namespaces.
> If I have a shared filesystem mounted in two different user
> namespaces a change by one will affect the other.

Two different user namespaces with different uid mappings will not 
affect each other.

If root in userns1 mapped to uid 1000 (size 1000) writes 
security.capability, it will write security.capability@uid=1000 into the fs.
If root in userns2 mapped to uid 2000 (size 1000) writes 
security.capability, it will write security.capability@uid=2000 into the fs.

Neither of the two will see each other's security.capability, but each 
will see their own 'security.capability'.

Assume now userns1 has a size of 2000, so overlapping with userns2, it 
will now see userns2's security.capability@uid=1000 as well as its own 
'security.capability'. security.capability@uid=1000 (of userns2) in 
userns1 will not have an effect on effective file capabilities.

> ... unless I'm missing something obvious about namespace behavior.
>
>> When listing the xattrs on the host, the existing security.capability
>> as well as the security.capability@uid=1000 will be shown. Inside the
>> namespace only 'security.capability', with the value of
>> security.capability@uid=1000, is visible.
>>
>> To maintain compatibility with existing behavior, the value of
>> security.capability of the host is shown inside the user namespace
>> once the security.capability of the user namespace has been removed
>> (which really removes security.capability@uid=1000). Writing to
>> an extended attribute inside a user namespace effectively hides the
>> extended attribute of the host.
>>
>> The general framework that is established with these patches can
>> be applied to other extended attributes as well, such as security.ima
>> or the 'trusted.' prefix . Another extended attribute that needed to
>> be enabled here is 'security.selinux,' since otherwise this extended
>> attribute would not be shown anymore inside a user namespace.
>>
>> Regards,
>>     Stefan & Serge
>>
>>
>> Stefan Berger (3):
>>    xattr: Enable security.capability in user namespaces
>>    Enable capabilities of files from shared filesystem
>>    Enable security.selinux in user namespaces
>>
>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   security/commoncap.c     |  36 +++-
>>   security/selinux/hooks.c |   9 +-
>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 20:12     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 20:12 UTC (permalink / raw)
  To: linux-security-module

On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>> This series of patches primary goal is to enable file capabilities
>> in user namespaces without affecting the file capabilities that are
>> effective on the host. This is to prevent that any unprivileged user
>> on the host maps his own uid to root in a private namespace, writes
>> the xattr, and executes the file with privilege on the host.
>>
>> We achieve this goal by writing extended attributes with a different
>> name when a user namespace is used. If for example the root user
>> in a user namespace writes the security.capability xattr, the name
>> of the xattr that is actually written is encoded as
>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> You need to identify the instance of the user namespace for
> this to work right on a system with multiple user namespaces.
> If I have a shared filesystem mounted in two different user
> namespaces a change by one will affect the other.

Two different user namespaces with different uid mappings will not 
affect each other.

If root in userns1 mapped to uid 1000 (size 1000) writes 
security.capability, it will write security.capability at uid=1000 into the fs.
If root in userns2 mapped to uid 2000 (size 1000) writes 
security.capability, it will write security.capability at uid=2000 into the fs.

Neither of the two will see each other's security.capability, but each 
will see their own 'security.capability'.

Assume now userns1 has a size of 2000, so overlapping with userns2, it 
will now see userns2's security.capability at uid=1000 as well as its own 
'security.capability'. security.capability at uid=1000 (of userns2) in 
userns1 will not have an effect on effective file capabilities.

> ... unless I'm missing something obvious about namespace behavior.
>
>> When listing the xattrs on the host, the existing security.capability
>> as well as the security.capability at uid=1000 will be shown. Inside the
>> namespace only 'security.capability', with the value of
>> security.capability at uid=1000, is visible.
>>
>> To maintain compatibility with existing behavior, the value of
>> security.capability of the host is shown inside the user namespace
>> once the security.capability of the user namespace has been removed
>> (which really removes security.capability at uid=1000). Writing to
>> an extended attribute inside a user namespace effectively hides the
>> extended attribute of the host.
>>
>> The general framework that is established with these patches can
>> be applied to other extended attributes as well, such as security.ima
>> or the 'trusted.' prefix . Another extended attribute that needed to
>> be enabled here is 'security.selinux,' since otherwise this extended
>> attribute would not be shown anymore inside a user namespace.
>>
>> Regards,
>>     Stefan & Serge
>>
>>
>> Stefan Berger (3):
>>    xattr: Enable security.capability in user namespaces
>>    Enable capabilities of files from shared filesystem
>>    Enable security.selinux in user namespaces
>>
>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   security/commoncap.c     |  36 +++-
>>   security/selinux/hooks.c |   9 +-
>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 20:12     ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 20:12 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3376 bytes --]

On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>> This series of patches primary goal is to enable file capabilities
>> in user namespaces without affecting the file capabilities that are
>> effective on the host. This is to prevent that any unprivileged user
>> on the host maps his own uid to root in a private namespace, writes
>> the xattr, and executes the file with privilege on the host.
>>
>> We achieve this goal by writing extended attributes with a different
>> name when a user namespace is used. If for example the root user
>> in a user namespace writes the security.capability xattr, the name
>> of the xattr that is actually written is encoded as
>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
> You need to identify the instance of the user namespace for
> this to work right on a system with multiple user namespaces.
> If I have a shared filesystem mounted in two different user
> namespaces a change by one will affect the other.

Two different user namespaces with different uid mappings will not 
affect each other.

If root in userns1 mapped to uid 1000 (size 1000) writes 
security.capability, it will write security.capability(a)uid=1000 into the fs.
If root in userns2 mapped to uid 2000 (size 1000) writes 
security.capability, it will write security.capability(a)uid=2000 into the fs.

Neither of the two will see each other's security.capability, but each 
will see their own 'security.capability'.

Assume now userns1 has a size of 2000, so overlapping with userns2, it 
will now see userns2's security.capability(a)uid=1000 as well as its own 
'security.capability'. security.capability(a)uid=1000 (of userns2) in 
userns1 will not have an effect on effective file capabilities.

> ... unless I'm missing something obvious about namespace behavior.
>
>> When listing the xattrs on the host, the existing security.capability
>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>> namespace only 'security.capability', with the value of
>> security.capability(a)uid=1000, is visible.
>>
>> To maintain compatibility with existing behavior, the value of
>> security.capability of the host is shown inside the user namespace
>> once the security.capability of the user namespace has been removed
>> (which really removes security.capability(a)uid=1000). Writing to
>> an extended attribute inside a user namespace effectively hides the
>> extended attribute of the host.
>>
>> The general framework that is established with these patches can
>> be applied to other extended attributes as well, such as security.ima
>> or the 'trusted.' prefix . Another extended attribute that needed to
>> be enabled here is 'security.selinux,' since otherwise this extended
>> attribute would not be shown anymore inside a user namespace.
>>
>> Regards,
>>     Stefan & Serge
>>
>>
>> Stefan Berger (3):
>>    xattr: Enable security.capability in user namespaces
>>    Enable capabilities of files from shared filesystem
>>    Enable security.selinux in user namespaces
>>
>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   security/commoncap.c     |  36 +++-
>>   security/selinux/hooks.c |   9 +-
>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]     ` <10fb9c1b-e9af-336c-9a1b-cf95259cfaf3-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2017-06-22 20:33       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 20:33 UTC (permalink / raw)
  To: Stefan Berger, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 6/22/2017 1:12 PM, Stefan Berger wrote:
> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>> You need to identify the instance of the user namespace for
>> this to work right on a system with multiple user namespaces.
>> If I have a shared filesystem mounted in two different user
>> namespaces a change by one will affect the other.
>
> Two different user namespaces with different uid mappings will not affect each other.

But two namespaces with the same uid mapping will, and I
don't think this meets the principle of least astonishment.
I also object to associating capabilities with UIDs. The
whole point of capabilities is to disassociate UID 0 from
privilege. What you've done is explicitly associate a UID
with the ability to have privilege. That's an architectural
regression.

>
> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability@uid=1000 into the fs.
> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability@uid=2000 into the fs.
>
> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>
> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability@uid=1000 as well as its own 'security.capability'. security.capability@uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>
>> ... unless I'm missing something obvious about namespace behavior.
>>
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability@uid=1000, is visible.
>>>
>>> To maintain compatibility with existing behavior, the value of
>>> security.capability of the host is shown inside the user namespace
>>> once the security.capability of the user namespace has been removed
>>> (which really removes security.capability@uid=1000). Writing to
>>> an extended attribute inside a user namespace effectively hides the
>>> extended attribute of the host.
>>>
>>> The general framework that is established with these patches can
>>> be applied to other extended attributes as well, such as security.ima
>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>> be enabled here is 'security.selinux,' since otherwise this extended
>>> attribute would not be shown anymore inside a user namespace.
>>>
>>> Regards,
>>>     Stefan & Serge
>>>
>>>
>>> Stefan Berger (3):
>>>    xattr: Enable security.capability in user namespaces
>>>    Enable capabilities of files from shared filesystem
>>>    Enable security.selinux in user namespaces
>>>
>>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>   security/commoncap.c     |  36 +++-
>>>   security/selinux/hooks.c |   9 +-
>>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>>
>
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 20:12     ` Stefan Berger
  (?)
@ 2017-06-22 20:33       ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 20:33 UTC (permalink / raw)
  To: Stefan Berger, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module

On 6/22/2017 1:12 PM, Stefan Berger wrote:
> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>> You need to identify the instance of the user namespace for
>> this to work right on a system with multiple user namespaces.
>> If I have a shared filesystem mounted in two different user
>> namespaces a change by one will affect the other.
>
> Two different user namespaces with different uid mappings will not affect each other.

But two namespaces with the same uid mapping will, and I
don't think this meets the principle of least astonishment.
I also object to associating capabilities with UIDs. The
whole point of capabilities is to disassociate UID 0 from
privilege. What you've done is explicitly associate a UID
with the ability to have privilege. That's an architectural
regression.

>
> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability@uid=1000 into the fs.
> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability@uid=2000 into the fs.
>
> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>
> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability@uid=1000 as well as its own 'security.capability'. security.capability@uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>
>> ... unless I'm missing something obvious about namespace behavior.
>>
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability@uid=1000, is visible.
>>>
>>> To maintain compatibility with existing behavior, the value of
>>> security.capability of the host is shown inside the user namespace
>>> once the security.capability of the user namespace has been removed
>>> (which really removes security.capability@uid=1000). Writing to
>>> an extended attribute inside a user namespace effectively hides the
>>> extended attribute of the host.
>>>
>>> The general framework that is established with these patches can
>>> be applied to other extended attributes as well, such as security.ima
>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>> be enabled here is 'security.selinux,' since otherwise this extended
>>> attribute would not be shown anymore inside a user namespace.
>>>
>>> Regards,
>>>     Stefan & Serge
>>>
>>>
>>> Stefan Berger (3):
>>>    xattr: Enable security.capability in user namespaces
>>>    Enable capabilities of files from shared filesystem
>>>    Enable security.selinux in user namespaces
>>>
>>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>   security/commoncap.c     |  36 +++-
>>>   security/selinux/hooks.c |   9 +-
>>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>>
>
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 20:33       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 20:33 UTC (permalink / raw)
  To: linux-security-module

On 6/22/2017 1:12 PM, Stefan Berger wrote:
> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>> You need to identify the instance of the user namespace for
>> this to work right on a system with multiple user namespaces.
>> If I have a shared filesystem mounted in two different user
>> namespaces a change by one will affect the other.
>
> Two different user namespaces with different uid mappings will not affect each other.

But two namespaces with the same uid mapping will, and I
don't think this meets the principle of least astonishment.
I also object to associating capabilities with UIDs. The
whole point of capabilities is to disassociate UID 0 from
privilege. What you've done is explicitly associate a UID
with the ability to have privilege. That's an architectural
regression.

>
> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability at uid=1000 into the fs.
> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability at uid=2000 into the fs.
>
> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>
> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability at uid=1000 as well as its own 'security.capability'. security.capability at uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>
>> ... unless I'm missing something obvious about namespace behavior.
>>
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability at uid=1000, is visible.
>>>
>>> To maintain compatibility with existing behavior, the value of
>>> security.capability of the host is shown inside the user namespace
>>> once the security.capability of the user namespace has been removed
>>> (which really removes security.capability at uid=1000). Writing to
>>> an extended attribute inside a user namespace effectively hides the
>>> extended attribute of the host.
>>>
>>> The general framework that is established with these patches can
>>> be applied to other extended attributes as well, such as security.ima
>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>> be enabled here is 'security.selinux,' since otherwise this extended
>>> attribute would not be shown anymore inside a user namespace.
>>>
>>> Regards,
>>>     Stefan & Serge
>>>
>>>
>>> Stefan Berger (3):
>>>    xattr: Enable security.capability in user namespaces
>>>    Enable capabilities of files from shared filesystem
>>>    Enable security.selinux in user namespaces
>>>
>>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>   security/commoncap.c     |  36 +++-
>>>   security/selinux/hooks.c |   9 +-
>>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>>
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 20:33       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 20:33 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3852 bytes --]

On 6/22/2017 1:12 PM, Stefan Berger wrote:
> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>> You need to identify the instance of the user namespace for
>> this to work right on a system with multiple user namespaces.
>> If I have a shared filesystem mounted in two different user
>> namespaces a change by one will affect the other.
>
> Two different user namespaces with different uid mappings will not affect each other.

But two namespaces with the same uid mapping will, and I
don't think this meets the principle of least astonishment.
I also object to associating capabilities with UIDs. The
whole point of capabilities is to disassociate UID 0 from
privilege. What you've done is explicitly associate a UID
with the ability to have privilege. That's an architectural
regression.

>
> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability(a)uid=1000 into the fs.
> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability(a)uid=2000 into the fs.
>
> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>
> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability(a)uid=1000 as well as its own 'security.capability'. security.capability(a)uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>
>> ... unless I'm missing something obvious about namespace behavior.
>>
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability(a)uid=1000, is visible.
>>>
>>> To maintain compatibility with existing behavior, the value of
>>> security.capability of the host is shown inside the user namespace
>>> once the security.capability of the user namespace has been removed
>>> (which really removes security.capability(a)uid=1000). Writing to
>>> an extended attribute inside a user namespace effectively hides the
>>> extended attribute of the host.
>>>
>>> The general framework that is established with these patches can
>>> be applied to other extended attributes as well, such as security.ima
>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>> be enabled here is 'security.selinux,' since otherwise this extended
>>> attribute would not be shown anymore inside a user namespace.
>>>
>>> Regards,
>>>     Stefan & Serge
>>>
>>>
>>> Stefan Berger (3):
>>>    xattr: Enable security.capability in user namespaces
>>>    Enable capabilities of files from shared filesystem
>>>    Enable security.selinux in user namespaces
>>>
>>>   fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>   security/commoncap.c     |  36 +++-
>>>   security/selinux/hooks.c |   9 +-
>>>   3 files changed, 501 insertions(+), 16 deletions(-)
>>>
>
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 20:33       ` Casey Schaufler
  (?)
  (?)
@ 2017-06-22 21:03           ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 21:03 UTC (permalink / raw)
  To: Casey Schaufler, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 06/22/2017 04:33 PM, Casey Schaufler wrote:
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>> You need to identify the instance of the user namespace for
>>> this to work right on a system with multiple user namespaces.
>>> If I have a shared filesystem mounted in two different user
>>> namespaces a change by one will affect the other.
>> Two different user namespaces with different uid mappings will not affect each other.
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.
> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

It has privilege within the bounding set of the capabilities that it is 
given. Afaik, a process cannot gain additional capabilities through file 
capabilities. Allowing to set a process's file capabilities allows one 
to _restrict_ what it can do, which is useful for shared filesystems 
where I can now set my ping capabilities to cap_net_raw, overriding the 
ones one the host which could be cap_net_admin+cap_net_raw. So I don't 
need to extend my bounding set with cap_net_admin or mess with xattrs on 
the host.


>
>> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability@uid=1000 into the fs.
>> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability@uid=2000 into the fs.
>>
>> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>>
>> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability@uid=1000 as well as its own 'security.capability'. security.capability@uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>>
>>> ... unless I'm missing something obvious about namespace behavior.
>>>
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>>> To maintain compatibility with existing behavior, the value of
>>>> security.capability of the host is shown inside the user namespace
>>>> once the security.capability of the user namespace has been removed
>>>> (which really removes security.capability@uid=1000). Writing to
>>>> an extended attribute inside a user namespace effectively hides the
>>>> extended attribute of the host.
>>>>
>>>> The general framework that is established with these patches can
>>>> be applied to other extended attributes as well, such as security.ima
>>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>>> be enabled here is 'security.selinux,' since otherwise this extended
>>>> attribute would not be shown anymore inside a user namespace.
>>>>
>>>> Regards,
>>>>      Stefan & Serge
>>>>
>>>>
>>>> Stefan Berger (3):
>>>>     xattr: Enable security.capability in user namespaces
>>>>     Enable capabilities of files from shared filesystem
>>>>     Enable security.selinux in user namespaces
>>>>
>>>>    fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    security/commoncap.c     |  36 +++-
>>>>    security/selinux/hooks.c |   9 +-
>>>>    3 files changed, 501 insertions(+), 16 deletions(-)
>>>>
>>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 21:03           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 21:03 UTC (permalink / raw)
  To: Casey Schaufler, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module

On 06/22/2017 04:33 PM, Casey Schaufler wrote:
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>> You need to identify the instance of the user namespace for
>>> this to work right on a system with multiple user namespaces.
>>> If I have a shared filesystem mounted in two different user
>>> namespaces a change by one will affect the other.
>> Two different user namespaces with different uid mappings will not affect each other.
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.
> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

It has privilege within the bounding set of the capabilities that it is 
given. Afaik, a process cannot gain additional capabilities through file 
capabilities. Allowing to set a process's file capabilities allows one 
to _restrict_ what it can do, which is useful for shared filesystems 
where I can now set my ping capabilities to cap_net_raw, overriding the 
ones one the host which could be cap_net_admin+cap_net_raw. So I don't 
need to extend my bounding set with cap_net_admin or mess with xattrs on 
the host.


>
>> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability@uid=1000 into the fs.
>> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability@uid=2000 into the fs.
>>
>> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>>
>> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability@uid=1000 as well as its own 'security.capability'. security.capability@uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>>
>>> ... unless I'm missing something obvious about namespace behavior.
>>>
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>>> To maintain compatibility with existing behavior, the value of
>>>> security.capability of the host is shown inside the user namespace
>>>> once the security.capability of the user namespace has been removed
>>>> (which really removes security.capability@uid=1000). Writing to
>>>> an extended attribute inside a user namespace effectively hides the
>>>> extended attribute of the host.
>>>>
>>>> The general framework that is established with these patches can
>>>> be applied to other extended attributes as well, such as security.ima
>>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>>> be enabled here is 'security.selinux,' since otherwise this extended
>>>> attribute would not be shown anymore inside a user namespace.
>>>>
>>>> Regards,
>>>>      Stefan & Serge
>>>>
>>>>
>>>> Stefan Berger (3):
>>>>     xattr: Enable security.capability in user namespaces
>>>>     Enable capabilities of files from shared filesystem
>>>>     Enable security.selinux in user namespaces
>>>>
>>>>    fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    security/commoncap.c     |  36 +++-
>>>>    security/selinux/hooks.c |   9 +-
>>>>    3 files changed, 501 insertions(+), 16 deletions(-)
>>>>
>>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 21:03           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 21:03 UTC (permalink / raw)
  To: linux-security-module

On 06/22/2017 04:33 PM, Casey Schaufler wrote:
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>> You need to identify the instance of the user namespace for
>>> this to work right on a system with multiple user namespaces.
>>> If I have a shared filesystem mounted in two different user
>>> namespaces a change by one will affect the other.
>> Two different user namespaces with different uid mappings will not affect each other.
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.
> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

It has privilege within the bounding set of the capabilities that it is 
given. Afaik, a process cannot gain additional capabilities through file 
capabilities. Allowing to set a process's file capabilities allows one 
to _restrict_ what it can do, which is useful for shared filesystems 
where I can now set my ping capabilities to cap_net_raw, overriding the 
ones one the host which could be cap_net_admin+cap_net_raw. So I don't 
need to extend my bounding set with cap_net_admin or mess with xattrs on 
the host.


>
>> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability at uid=1000 into the fs.
>> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability at uid=2000 into the fs.
>>
>> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>>
>> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability at uid=1000 as well as its own 'security.capability'. security.capability at uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>>
>>> ... unless I'm missing something obvious about namespace behavior.
>>>
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability at uid=1000, is visible.
>>>>
>>>> To maintain compatibility with existing behavior, the value of
>>>> security.capability of the host is shown inside the user namespace
>>>> once the security.capability of the user namespace has been removed
>>>> (which really removes security.capability at uid=1000). Writing to
>>>> an extended attribute inside a user namespace effectively hides the
>>>> extended attribute of the host.
>>>>
>>>> The general framework that is established with these patches can
>>>> be applied to other extended attributes as well, such as security.ima
>>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>>> be enabled here is 'security.selinux,' since otherwise this extended
>>>> attribute would not be shown anymore inside a user namespace.
>>>>
>>>> Regards,
>>>>      Stefan & Serge
>>>>
>>>>
>>>> Stefan Berger (3):
>>>>     xattr: Enable security.capability in user namespaces
>>>>     Enable capabilities of files from shared filesystem
>>>>     Enable security.selinux in user namespaces
>>>>
>>>>    fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    security/commoncap.c     |  36 +++-
>>>>    security/selinux/hooks.c |   9 +-
>>>>    3 files changed, 501 insertions(+), 16 deletions(-)
>>>>
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 21:03           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 21:03 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 4506 bytes --]

On 06/22/2017 04:33 PM, Casey Schaufler wrote:
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>> You need to identify the instance of the user namespace for
>>> this to work right on a system with multiple user namespaces.
>>> If I have a shared filesystem mounted in two different user
>>> namespaces a change by one will affect the other.
>> Two different user namespaces with different uid mappings will not affect each other.
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.
> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

It has privilege within the bounding set of the capabilities that it is 
given. Afaik, a process cannot gain additional capabilities through file 
capabilities. Allowing to set a process's file capabilities allows one 
to _restrict_ what it can do, which is useful for shared filesystems 
where I can now set my ping capabilities to cap_net_raw, overriding the 
ones one the host which could be cap_net_admin+cap_net_raw. So I don't 
need to extend my bounding set with cap_net_admin or mess with xattrs on 
the host.


>
>> If root in userns1 mapped to uid 1000 (size 1000) writes security.capability, it will write security.capability(a)uid=1000 into the fs.
>> If root in userns2 mapped to uid 2000 (size 1000) writes security.capability, it will write security.capability(a)uid=2000 into the fs.
>>
>> Neither of the two will see each other's security.capability, but each will see their own 'security.capability'.
>>
>> Assume now userns1 has a size of 2000, so overlapping with userns2, it will now see userns2's security.capability(a)uid=1000 as well as its own 'security.capability'. security.capability(a)uid=1000 (of userns2) in userns1 will not have an effect on effective file capabilities.
>>
>>> ... unless I'm missing something obvious about namespace behavior.
>>>
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability(a)uid=1000, is visible.
>>>>
>>>> To maintain compatibility with existing behavior, the value of
>>>> security.capability of the host is shown inside the user namespace
>>>> once the security.capability of the user namespace has been removed
>>>> (which really removes security.capability(a)uid=1000). Writing to
>>>> an extended attribute inside a user namespace effectively hides the
>>>> extended attribute of the host.
>>>>
>>>> The general framework that is established with these patches can
>>>> be applied to other extended attributes as well, such as security.ima
>>>> or the 'trusted.' prefix . Another extended attribute that needed to
>>>> be enabled here is 'security.selinux,' since otherwise this extended
>>>> attribute would not be shown anymore inside a user namespace.
>>>>
>>>> Regards,
>>>>      Stefan & Serge
>>>>
>>>>
>>>> Stefan Berger (3):
>>>>     xattr: Enable security.capability in user namespaces
>>>>     Enable capabilities of files from shared filesystem
>>>>     Enable security.selinux in user namespaces
>>>>
>>>>    fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    security/commoncap.c     |  36 +++-
>>>>    security/selinux/hooks.c |   9 +-
>>>>    3 files changed, 501 insertions(+), 16 deletions(-)
>>>>
>>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 20:33       ` Casey Schaufler
  (?)
@ 2017-06-22 21:09           ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 21:09 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> > On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >> You need to identify the instance of the user namespace for
> >> this to work right on a system with multiple user namespaces.
> >> If I have a shared filesystem mounted in two different user
> >> namespaces a change by one will affect the other.
> >
> > Two different user namespaces with different uid mappings will not affect each other.
> 
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.

It does.  If you have one filesystem shared among multiple
containers, then it needs to be either read-only, or you
need to know what you're doing.

> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

IMO this is looking at it the wrong way.  From inside the container's
viewpoint, the capabilities are not associated with a uid.  Any
task, regardles off uid, in the container, which executes the file,
gets the privilege.  IMO that satisfies the intent of file capabilities.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 21:09           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 21:09 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, serge, tycho, James.Bottomley,
	christian.brauner, vgoyal, amir73il, linux-security-module

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> > On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >> You need to identify the instance of the user namespace for
> >> this to work right on a system with multiple user namespaces.
> >> If I have a shared filesystem mounted in two different user
> >> namespaces a change by one will affect the other.
> >
> > Two different user namespaces with different uid mappings will not affect each other.
> 
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.

It does.  If you have one filesystem shared among multiple
containers, then it needs to be either read-only, or you
need to know what you're doing.

> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

IMO this is looking at it the wrong way.  From inside the container's
viewpoint, the capabilities are not associated with a uid.  Any
task, regardles off uid, in the container, which executes the file,
gets the privilege.  IMO that satisfies the intent of file capabilities.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 21:09           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 21:09 UTC (permalink / raw)
  To: linux-security-module

Quoting Casey Schaufler (casey at schaufler-ca.com):
> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> > On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> >> You need to identify the instance of the user namespace for
> >> this to work right on a system with multiple user namespaces.
> >> If I have a shared filesystem mounted in two different user
> >> namespaces a change by one will affect the other.
> >
> > Two different user namespaces with different uid mappings will not affect each other.
> 
> But two namespaces with the same uid mapping will, and I
> don't think this meets the principle of least astonishment.

It does.  If you have one filesystem shared among multiple
containers, then it needs to be either read-only, or you
need to know what you're doing.

> I also object to associating capabilities with UIDs. The
> whole point of capabilities is to disassociate UID 0 from
> privilege. What you've done is explicitly associate a UID
> with the ability to have privilege. That's an architectural
> regression.

IMO this is looking at it the wrong way.  From inside the container's
viewpoint, the capabilities are not associated with a uid.  Any
task, regardles off uid, in the container, which executes the file,
gets the privilege.  IMO that satisfies the intent of file capabilities.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]           ` <20170622210925.GA32691-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-22 22:40             ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 22:40 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
>> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> You need to identify the instance of the user namespace for
>>>> this to work right on a system with multiple user namespaces.
>>>> If I have a shared filesystem mounted in two different user
>>>> namespaces a change by one will affect the other.
>>> Two different user namespaces with different uid mappings will not affect each other.
>> But two namespaces with the same uid mapping will, and I
>> don't think this meets the principle of least astonishment.
> It does.  If you have one filesystem shared among multiple
> containers, then it needs to be either read-only, or you
> need to know what you're doing.

Joe's a junior devop who has been given a container
template which he tweaks for various nefarious purposes.
He doesn't know much about what he's doing. He isn't
changing the UIDs the template uses because, quite frankly,
he doesn't know a UID from an entrenching tool. He has
changed a filesystem from RO to RW because he read on a
forum somewhere that doing so would fix a problem he had
once. He doesn't want to have that problem again, so he
left the change in the template.

Containers are being sold as a way to make things easier.
This sort of side effect is dangerous in an environment
where users are being told that they don't have to worry
so much, the environment will take care of them. 

>> I also object to associating capabilities with UIDs. The
>> whole point of capabilities is to disassociate UID 0 from
>> privilege. What you've done is explicitly associate a UID
>> with the ability to have privilege. That's an architectural
>> regression.
> IMO this is looking at it the wrong way.

The right way to look at the problem is to identify the
capabilities the program ought to have and set the file
capabilities and UID/GID properly on the program on the
base system. If you have to fix the program so it works
right under those conditions, so much the better for
everyone. If you're running with different capabilities
in a container to prevent the program from doing damage
to the base system, maybe the program needs fixing instead.

> From inside the container's
> viewpoint, the capabilities are not associated with a uid.  Any
> task, regardles off uid, in the container, which executes the file,
> gets the privilege.  IMO that satisfies the intent of file capabilities.

The UID is the wrong association. The namespace is the correct association.
You're using the UID because it's something that's different in the
namespace than in the base system. You can detect it. What you need is a
non-volatile namespace id to attach to the file rather than using the
UID mapping (which may not be unique) that the namespace uses.

> -serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 21:09           ` Serge E. Hallyn
  (?)
@ 2017-06-22 22:40             ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 22:40 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, tycho, James.Bottomley, christian.brauner,
	vgoyal, amir73il, linux-security-module

On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey@schaufler-ca.com):
>> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> You need to identify the instance of the user namespace for
>>>> this to work right on a system with multiple user namespaces.
>>>> If I have a shared filesystem mounted in two different user
>>>> namespaces a change by one will affect the other.
>>> Two different user namespaces with different uid mappings will not affect each other.
>> But two namespaces with the same uid mapping will, and I
>> don't think this meets the principle of least astonishment.
> It does.  If you have one filesystem shared among multiple
> containers, then it needs to be either read-only, or you
> need to know what you're doing.

Joe's a junior devop who has been given a container
template which he tweaks for various nefarious purposes.
He doesn't know much about what he's doing. He isn't
changing the UIDs the template uses because, quite frankly,
he doesn't know a UID from an entrenching tool. He has
changed a filesystem from RO to RW because he read on a
forum somewhere that doing so would fix a problem he had
once. He doesn't want to have that problem again, so he
left the change in the template.

Containers are being sold as a way to make things easier.
This sort of side effect is dangerous in an environment
where users are being told that they don't have to worry
so much, the environment will take care of them. 

>> I also object to associating capabilities with UIDs. The
>> whole point of capabilities is to disassociate UID 0 from
>> privilege. What you've done is explicitly associate a UID
>> with the ability to have privilege. That's an architectural
>> regression.
> IMO this is looking at it the wrong way.

The right way to look at the problem is to identify the
capabilities the program ought to have and set the file
capabilities and UID/GID properly on the program on the
base system. If you have to fix the program so it works
right under those conditions, so much the better for
everyone. If you're running with different capabilities
in a container to prevent the program from doing damage
to the base system, maybe the program needs fixing instead.

> From inside the container's
> viewpoint, the capabilities are not associated with a uid.  Any
> task, regardles off uid, in the container, which executes the file,
> gets the privilege.  IMO that satisfies the intent of file capabilities.

The UID is the wrong association. The namespace is the correct association.
You're using the UID because it's something that's different in the
namespace than in the base system. You can detect it. What you need is a
non-volatile namespace id to attach to the file rather than using the
UID mapping (which may not be unique) that the namespace uses.

> -serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 22:40             ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 22:40 UTC (permalink / raw)
  To: linux-security-module

On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey at schaufler-ca.com):
>> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>> You need to identify the instance of the user namespace for
>>>> this to work right on a system with multiple user namespaces.
>>>> If I have a shared filesystem mounted in two different user
>>>> namespaces a change by one will affect the other.
>>> Two different user namespaces with different uid mappings will not affect each other.
>> But two namespaces with the same uid mapping will, and I
>> don't think this meets the principle of least astonishment.
> It does.  If you have one filesystem shared among multiple
> containers, then it needs to be either read-only, or you
> need to know what you're doing.

Joe's a junior devop who has been given a container
template which he tweaks for various nefarious purposes.
He doesn't know much about what he's doing. He isn't
changing the UIDs the template uses because, quite frankly,
he doesn't know a UID from an entrenching tool. He has
changed a filesystem from RO to RW because he read on a
forum somewhere that doing so would fix a problem he had
once. He doesn't want to have that problem again, so he
left the change in the template.

Containers are being sold as a way to make things easier.
This sort of side effect is dangerous in an environment
where users are being told that they don't have to worry
so much, the environment will take care of them. 

>> I also object to associating capabilities with UIDs. The
>> whole point of capabilities is to disassociate UID 0 from
>> privilege. What you've done is explicitly associate a UID
>> with the ability to have privilege. That's an architectural
>> regression.
> IMO this is looking at it the wrong way.

The right way to look at the problem is to identify the
capabilities the program ought to have and set the file
capabilities and UID/GID properly on the program on the
base system. If you have to fix the program so it works
right under those conditions, so much the better for
everyone. If you're running with different capabilities
in a container to prevent the program from doing damage
to the base system, maybe the program needs fixing instead.

> From inside the container's
> viewpoint, the capabilities are not associated with a uid.  Any
> task, regardles off uid, in the container, which executes the file,
> gets the privilege.  IMO that satisfies the intent of file capabilities.

The UID is the wrong association. The namespace is the correct association.
You're using the UID because it's something that's different in the
namespace than in the base system. You can detect it. What you need is a
non-volatile namespace id to attach to the file rather than using the
UID mapping (which may not be unique) that the namespace uses.

> -serge

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 22:40             ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-22 22:40 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3719 bytes --]

On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey(a)schaufler-ca.com):
>> On 6/22/2017 1:12 PM, Stefan Berger wrote:
>>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
>>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>> You need to identify the instance of the user namespace for
>>>> this to work right on a system with multiple user namespaces.
>>>> If I have a shared filesystem mounted in two different user
>>>> namespaces a change by one will affect the other.
>>> Two different user namespaces with different uid mappings will not affect each other.
>> But two namespaces with the same uid mapping will, and I
>> don't think this meets the principle of least astonishment.
> It does.  If you have one filesystem shared among multiple
> containers, then it needs to be either read-only, or you
> need to know what you're doing.

Joe's a junior devop who has been given a container
template which he tweaks for various nefarious purposes.
He doesn't know much about what he's doing. He isn't
changing the UIDs the template uses because, quite frankly,
he doesn't know a UID from an entrenching tool. He has
changed a filesystem from RO to RW because he read on a
forum somewhere that doing so would fix a problem he had
once. He doesn't want to have that problem again, so he
left the change in the template.

Containers are being sold as a way to make things easier.
This sort of side effect is dangerous in an environment
where users are being told that they don't have to worry
so much, the environment will take care of them. 

>> I also object to associating capabilities with UIDs. The
>> whole point of capabilities is to disassociate UID 0 from
>> privilege. What you've done is explicitly associate a UID
>> with the ability to have privilege. That's an architectural
>> regression.
> IMO this is looking at it the wrong way.

The right way to look at the problem is to identify the
capabilities the program ought to have and set the file
capabilities and UID/GID properly on the program on the
base system. If you have to fix the program so it works
right under those conditions, so much the better for
everyone. If you're running with different capabilities
in a container to prevent the program from doing damage
to the base system, maybe the program needs fixing instead.

> From inside the container's
> viewpoint, the capabilities are not associated with a uid.  Any
> task, regardles off uid, in the container, which executes the file,
> gets the privilege.  IMO that satisfies the intent of file capabilities.

The UID is the wrong association. The namespace is the correct association.
You're using the UID because it's something that's different in the
namespace than in the base system. You can detect it. What you need is a
non-volatile namespace id to attach to the file rather than using the
UID mapping (which may not be unique) that the namespace uses.

> -serge


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]             ` <45e59e2e-0e00-cb9a-2f85-dc4606338a08-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
@ 2017-06-22 23:07               ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:07 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> >> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> >>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>>>> This series of patches primary goal is to enable file capabilities
> >>>>> in user namespaces without affecting the file capabilities that are
> >>>>> effective on the host. This is to prevent that any unprivileged user
> >>>>> on the host maps his own uid to root in a private namespace, writes
> >>>>> the xattr, and executes the file with privilege on the host.
> >>>>>
> >>>>> We achieve this goal by writing extended attributes with a different
> >>>>> name when a user namespace is used. If for example the root user
> >>>>> in a user namespace writes the security.capability xattr, the name
> >>>>> of the xattr that is actually written is encoded as
> >>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>>> You need to identify the instance of the user namespace for
> >>>> this to work right on a system with multiple user namespaces.
> >>>> If I have a shared filesystem mounted in two different user
> >>>> namespaces a change by one will affect the other.
> >>> Two different user namespaces with different uid mappings will not affect each other.
> >> But two namespaces with the same uid mapping will, and I
> >> don't think this meets the principle of least astonishment.
> > It does.  If you have one filesystem shared among multiple
> > containers, then it needs to be either read-only, or you
> > need to know what you're doing.
> 
> Joe's a junior devop who has been given a container
> template which he tweaks for various nefarious purposes.
> He doesn't know much about what he's doing. He isn't
> changing the UIDs the template uses because, quite frankly,
> he doesn't know a UID from an entrenching tool. He has
> changed a filesystem from RO to RW because he read on a
> forum somewhere that doing so would fix a problem he had
> once. He doesn't want to have that problem again, so he
> left the change in the template.
> 
> Containers are being sold as a way to make things easier.
> This sort of side effect is dangerous in an environment
> where users are being told that they don't have to worry
> so much, the environment will take care of them. 
> 
> >> I also object to associating capabilities with UIDs. The
> >> whole point of capabilities is to disassociate UID 0 from
> >> privilege. What you've done is explicitly associate a UID
> >> with the ability to have privilege. That's an architectural
> >> regression.
> > IMO this is looking at it the wrong way.
> 
> The right way to look at the problem is to identify the
> capabilities the program ought to have and set the file
> capabilities and UID/GID properly on the program on the
> base system.

No.

Absolutely not.

That would require me to be given CAP_SETFCAP on the host in
order to control the resources I've been delegated in a user
namespace.  That's not how it works.

Using only /usr/bin/newuidmap and /usr/bin/newgidmap, which
allow me to map the subuids which I have been delegated through
/etc/subuid and /etc/subgid, I can, as an unprivileged user, and
with no other privilege, create a full container image, start it
up, and administer it.

The fact that I cannot also install software with file capabilities
is a shortcoming.

>  If you have to fix the program so it works
> right under those conditions, so much the better for
> everyone. If you're running with different capabilities
> in a container to prevent the program from doing damage
> to the base system, maybe the program needs fixing instead.

That is not the reason to do this.

Root in the container is assigning file capabilities for the
usual reason - to allow the file to be executed, by anyone,
regardless of uid (mapped into the namespace), with certain privilege.

The privilege which root in the container is allowed to
delegate is only the privilege which it *has* in the container.

If we allow root in the container to assign a 'global'
security.capability, then we are allow root in the container
to hand privilege to an unprivileged user on the host, against
host resources.

> > From inside the container's
> > viewpoint, the capabilities are not associated with a uid.  Any
> > task, regardles off uid, in the container, which executes the file,
> > gets the privilege.  IMO that satisfies the intent of file capabilities.
> 
> The UID is the wrong association. The namespace is the correct association.

That's a pleasant but impractical thought.  Namespaces do not have any
persistent ids.  (And if we tried, we'd be told no because it would
require a namespace of namespaces).

> You're using the UID because it's something that's different in the
> namespace than in the base system.

I'm using the uid because that is the subject which was granted privilege
over all other ids mapped into its user namespace.

>  You can detect it. What you need is a
> non-volatile namespace id to attach to the file rather than using the
> UID mapping (which may not be unique) that the namespace uses.

That's what we were trying to do in 2010.  It didn't work.  Which is
how we have the uid namespace as it exists.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 22:40             ` Casey Schaufler
@ 2017-06-22 23:07               ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:07 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Serge E. Hallyn, Stefan Berger, ebiederm, containers, lkp,
	xiaolong.ye, linux-kernel, zohar, tycho, James.Bottomley,
	christian.brauner, vgoyal, amir73il, linux-security-module

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> >> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> >>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>>>> This series of patches primary goal is to enable file capabilities
> >>>>> in user namespaces without affecting the file capabilities that are
> >>>>> effective on the host. This is to prevent that any unprivileged user
> >>>>> on the host maps his own uid to root in a private namespace, writes
> >>>>> the xattr, and executes the file with privilege on the host.
> >>>>>
> >>>>> We achieve this goal by writing extended attributes with a different
> >>>>> name when a user namespace is used. If for example the root user
> >>>>> in a user namespace writes the security.capability xattr, the name
> >>>>> of the xattr that is actually written is encoded as
> >>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>>> You need to identify the instance of the user namespace for
> >>>> this to work right on a system with multiple user namespaces.
> >>>> If I have a shared filesystem mounted in two different user
> >>>> namespaces a change by one will affect the other.
> >>> Two different user namespaces with different uid mappings will not affect each other.
> >> But two namespaces with the same uid mapping will, and I
> >> don't think this meets the principle of least astonishment.
> > It does.  If you have one filesystem shared among multiple
> > containers, then it needs to be either read-only, or you
> > need to know what you're doing.
> 
> Joe's a junior devop who has been given a container
> template which he tweaks for various nefarious purposes.
> He doesn't know much about what he's doing. He isn't
> changing the UIDs the template uses because, quite frankly,
> he doesn't know a UID from an entrenching tool. He has
> changed a filesystem from RO to RW because he read on a
> forum somewhere that doing so would fix a problem he had
> once. He doesn't want to have that problem again, so he
> left the change in the template.
> 
> Containers are being sold as a way to make things easier.
> This sort of side effect is dangerous in an environment
> where users are being told that they don't have to worry
> so much, the environment will take care of them. 
> 
> >> I also object to associating capabilities with UIDs. The
> >> whole point of capabilities is to disassociate UID 0 from
> >> privilege. What you've done is explicitly associate a UID
> >> with the ability to have privilege. That's an architectural
> >> regression.
> > IMO this is looking at it the wrong way.
> 
> The right way to look at the problem is to identify the
> capabilities the program ought to have and set the file
> capabilities and UID/GID properly on the program on the
> base system.

No.

Absolutely not.

That would require me to be given CAP_SETFCAP on the host in
order to control the resources I've been delegated in a user
namespace.  That's not how it works.

Using only /usr/bin/newuidmap and /usr/bin/newgidmap, which
allow me to map the subuids which I have been delegated through
/etc/subuid and /etc/subgid, I can, as an unprivileged user, and
with no other privilege, create a full container image, start it
up, and administer it.

The fact that I cannot also install software with file capabilities
is a shortcoming.

>  If you have to fix the program so it works
> right under those conditions, so much the better for
> everyone. If you're running with different capabilities
> in a container to prevent the program from doing damage
> to the base system, maybe the program needs fixing instead.

That is not the reason to do this.

Root in the container is assigning file capabilities for the
usual reason - to allow the file to be executed, by anyone,
regardless of uid (mapped into the namespace), with certain privilege.

The privilege which root in the container is allowed to
delegate is only the privilege which it *has* in the container.

If we allow root in the container to assign a 'global'
security.capability, then we are allow root in the container
to hand privilege to an unprivileged user on the host, against
host resources.

> > From inside the container's
> > viewpoint, the capabilities are not associated with a uid.  Any
> > task, regardles off uid, in the container, which executes the file,
> > gets the privilege.  IMO that satisfies the intent of file capabilities.
> 
> The UID is the wrong association. The namespace is the correct association.

That's a pleasant but impractical thought.  Namespaces do not have any
persistent ids.  (And if we tried, we'd be told no because it would
require a namespace of namespaces).

> You're using the UID because it's something that's different in the
> namespace than in the base system.

I'm using the uid because that is the subject which was granted privilege
over all other ids mapped into its user namespace.

>  You can detect it. What you need is a
> non-volatile namespace id to attach to the file rather than using the
> UID mapping (which may not be unique) that the namespace uses.

That's what we were trying to do in 2010.  It didn't work.  Which is
how we have the uid namespace as it exists.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 23:07               ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:07 UTC (permalink / raw)
  To: linux-security-module

Quoting Casey Schaufler (casey at schaufler-ca.com):
> On 6/22/2017 2:09 PM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey at schaufler-ca.com):
> >> On 6/22/2017 1:12 PM, Stefan Berger wrote:
> >>> On 06/22/2017 03:59 PM, Casey Schaufler wrote:
> >>>> On 6/22/2017 11:59 AM, Stefan Berger wrote:
> >>>>> This series of patches primary goal is to enable file capabilities
> >>>>> in user namespaces without affecting the file capabilities that are
> >>>>> effective on the host. This is to prevent that any unprivileged user
> >>>>> on the host maps his own uid to root in a private namespace, writes
> >>>>> the xattr, and executes the file with privilege on the host.
> >>>>>
> >>>>> We achieve this goal by writing extended attributes with a different
> >>>>> name when a user namespace is used. If for example the root user
> >>>>> in a user namespace writes the security.capability xattr, the name
> >>>>> of the xattr that is actually written is encoded as
> >>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> >>>> You need to identify the instance of the user namespace for
> >>>> this to work right on a system with multiple user namespaces.
> >>>> If I have a shared filesystem mounted in two different user
> >>>> namespaces a change by one will affect the other.
> >>> Two different user namespaces with different uid mappings will not affect each other.
> >> But two namespaces with the same uid mapping will, and I
> >> don't think this meets the principle of least astonishment.
> > It does.  If you have one filesystem shared among multiple
> > containers, then it needs to be either read-only, or you
> > need to know what you're doing.
> 
> Joe's a junior devop who has been given a container
> template which he tweaks for various nefarious purposes.
> He doesn't know much about what he's doing. He isn't
> changing the UIDs the template uses because, quite frankly,
> he doesn't know a UID from an entrenching tool. He has
> changed a filesystem from RO to RW because he read on a
> forum somewhere that doing so would fix a problem he had
> once. He doesn't want to have that problem again, so he
> left the change in the template.
> 
> Containers are being sold as a way to make things easier.
> This sort of side effect is dangerous in an environment
> where users are being told that they don't have to worry
> so much, the environment will take care of them. 
> 
> >> I also object to associating capabilities with UIDs. The
> >> whole point of capabilities is to disassociate UID 0 from
> >> privilege. What you've done is explicitly associate a UID
> >> with the ability to have privilege. That's an architectural
> >> regression.
> > IMO this is looking at it the wrong way.
> 
> The right way to look at the problem is to identify the
> capabilities the program ought to have and set the file
> capabilities and UID/GID properly on the program on the
> base system.

No.

Absolutely not.

That would require me to be given CAP_SETFCAP on the host in
order to control the resources I've been delegated in a user
namespace.  That's not how it works.

Using only /usr/bin/newuidmap and /usr/bin/newgidmap, which
allow me to map the subuids which I have been delegated through
/etc/subuid and /etc/subgid, I can, as an unprivileged user, and
with no other privilege, create a full container image, start it
up, and administer it.

The fact that I cannot also install software with file capabilities
is a shortcoming.

>  If you have to fix the program so it works
> right under those conditions, so much the better for
> everyone. If you're running with different capabilities
> in a container to prevent the program from doing damage
> to the base system, maybe the program needs fixing instead.

That is not the reason to do this.

Root in the container is assigning file capabilities for the
usual reason - to allow the file to be executed, by anyone,
regardless of uid (mapped into the namespace), with certain privilege.

The privilege which root in the container is allowed to
delegate is only the privilege which it *has* in the container.

If we allow root in the container to assign a 'global'
security.capability, then we are allow root in the container
to hand privilege to an unprivileged user on the host, against
host resources.

> > From inside the container's
> > viewpoint, the capabilities are not associated with a uid.  Any
> > task, regardles off uid, in the container, which executes the file,
> > gets the privilege.  IMO that satisfies the intent of file capabilities.
> 
> The UID is the wrong association. The namespace is the correct association.

That's a pleasant but impractical thought.  Namespaces do not have any
persistent ids.  (And if we tried, we'd be told no because it would
require a namespace of namespaces).

> You're using the UID because it's something that's different in the
> namespace than in the base system.

I'm using the uid because that is the subject which was granted privilege
over all other ids mapped into its user namespace.

>  You can detect it. What you need is a
> non-volatile namespace id to attach to the file rather than using the
> UID mapping (which may not be unique) that the namespace uses.

That's what we were trying to do in 2010.  It didn't work.  Which is
how we have the uid namespace as it exists.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found] ` <1498157989-11814-1-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-06-22 19:59   ` [PATCH 0/3] Enable namespaced file capabilities Casey Schaufler
@ 2017-06-22 23:29   ` James Bottomley
  2017-06-23  7:01     ` Amir Goldstein
  2017-06-23 20:09     ` Vivek Goyal
  6 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-22 23:29 UTC (permalink / raw)
  To: Stefan Berger, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.

I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
capability but am dynamically mapping the namespaces (i.e. I know I
want unprivileged root, but I'm going to dynamically select the range
to map based on what's currently available on the orchestration
system).  If we stick with the @uid=X suffix, then dynamic mapping
won't work because X is potentially different each time and there'll be
a name mismatch in my xattrs.  Why not just make the suffix @uid, which
means if root is mapped to any unprivileged uid then we pick this up
otherwise we go with the unsuffixed property?

As far as I can see there's no real advantage to discriminating userns
specific xattrs based on where root is mapped to, unless there's a use
case I'm missing?

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 18:59 ` Stefan Berger
  (?)
@ 2017-06-22 23:29   ` James Bottomley
  -1 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-22 23:29 UTC (permalink / raw)
  To: Stefan Berger, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	christian.brauner, vgoyal, amir73il, linux-security-module

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.

I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
capability but am dynamically mapping the namespaces (i.e. I know I
want unprivileged root, but I'm going to dynamically select the range
to map based on what's currently available on the orchestration
system).  If we stick with the @uid=X suffix, then dynamic mapping
won't work because X is potentially different each time and there'll be
a name mismatch in my xattrs.  Why not just make the suffix @uid, which
means if root is mapped to any unprivileged uid then we pick this up
otherwise we go with the unsuffixed property?

As far as I can see there's no real advantage to discriminating userns
specific xattrs based on where root is mapped to, unless there's a use
case I'm missing?

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 23:29   ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-22 23:29 UTC (permalink / raw)
  To: linux-security-module

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability at uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability at uid=1000, is visible.

I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
capability but am dynamically mapping the namespaces (i.e. I know I
want unprivileged root, but I'm going to dynamically select the range
to map based on what's currently available on the orchestration
system).  If we stick with the @uid=X suffix, then dynamic mapping
won't work because X is potentially different each time and there'll be
a name mismatch in my xattrs.  Why not just make the suffix @uid, which
means if root is mapped to any unprivileged uid then we pick this up
otherwise we go with the unsuffixed property?

As far as I can see there's no real advantage to discriminating userns
specific xattrs based on where root is mapped to, unless there's a use
case I'm missing?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 23:29   ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-22 23:29 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 1792 bytes --]

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability(a)uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability(a)uid=1000, is visible.

I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
capability but am dynamically mapping the namespaces (i.e. I know I
want unprivileged root, but I'm going to dynamically select the range
to map based on what's currently available on the orchestration
system).  If we stick with the @uid=X suffix, then dynamic mapping
won't work because X is potentially different each time and there'll be
a name mismatch in my xattrs.  Why not just make the suffix @uid, which
means if root is mapped to any unprivileged uid then we pick this up
otherwise we go with the unsuffixed property?

As far as I can see there's no real advantage to discriminating userns
specific xattrs based on where root is mapped to, unless there's a use
case I'm missing?

James



^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]   ` <1498174161.7636.4.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-06-22 23:32     ` Serge E. Hallyn
  2017-06-22 23:36     ` Serge E. Hallyn
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:32 UTC (permalink / raw)
  To: James Bottomley
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be

Note that if you just set a 'security.capability' xattr, it will apply
to all namespaces.

Does that address your concern?

> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?
> 
> James
> 

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 23:29   ` James Bottomley
@ 2017-06-22 23:32     ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:32 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, serge, tycho, christian.brauner, vgoyal,
	amir73il, linux-security-module

Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be

Note that if you just set a 'security.capability' xattr, it will apply
to all namespaces.

Does that address your concern?

> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?
> 
> James
> 

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 23:32     ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:32 UTC (permalink / raw)
  To: linux-security-module

Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability at uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability at uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be

Note that if you just set a 'security.capability' xattr, it will apply
to all namespaces.

Does that address your concern?

> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?
> 
> James
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]   ` <1498174161.7636.4.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-06-22 23:32     ` Serge E. Hallyn
@ 2017-06-22 23:36     ` Serge E. Hallyn
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be
> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?

Yes, the use case is: to allow root in the container to set the
privilege itself, without endangering any resources not owned by
that root.  If you're going to have a root owned host-wide
orchestration system setting up the rootfs, then you don't
necessary need this at all.

As you say a @uid to say "any unprivileged userns" might be useful.
The implication is that root on the host doesn't trust the image
enough to write a real global file capability, but trusts it enough
to 'endanger' all containers on the host.  If that's the case, I have
no objection to adding this as a feature.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 23:29   ` James Bottomley
@ 2017-06-22 23:36     ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, serge, tycho, christian.brauner, vgoyal,
	amir73il, linux-security-module

Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be
> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?

Yes, the use case is: to allow root in the container to set the
privilege itself, without endangering any resources not owned by
that root.  If you're going to have a root owned host-wide
orchestration system setting up the rootfs, then you don't
necessary need this at all.

As you say a @uid to say "any unprivileged userns" might be useful.
The implication is that root on the host doesn't trust the image
enough to write a real global file capability, but trusts it enough
to 'endanger' all containers on the host.  If that's the case, I have
no objection to adding this as a feature.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 23:36     ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-22 23:36 UTC (permalink / raw)
  To: linux-security-module

Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability at uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability at uid=1000, is visible.
> 
> I'm a bit bothered by the @uid=1000 suffix.  What if I want to use this
> capability but am dynamically mapping the namespaces (i.e. I know I
> want unprivileged root, but I'm going to dynamically select the range
> to map based on what's currently available on the orchestration
> system).  If we stick with the @uid=X suffix, then dynamic mapping
> won't work because X is potentially different each time and there'll be
> a name mismatch in my xattrs.  Why not just make the suffix @uid, which
> means if root is mapped to any unprivileged uid then we pick this up
> otherwise we go with the unsuffixed property?
> 
> As far as I can see there's no real advantage to discriminating userns
> specific xattrs based on where root is mapped to, unless there's a use
> case I'm missing?

Yes, the use case is: to allow root in the container to set the
privilege itself, without endangering any resources not owned by
that root.  If you're going to have a root owned host-wide
orchestration system setting up the rootfs, then you don't
necessary need this at all.

As you say a @uid to say "any unprivileged userns" might be useful.
The implication is that root on the host doesn't trust the image
enough to write a real global file capability, but trusts it enough
to 'endanger' all containers on the host.  If that's the case, I have
no objection to adding this as a feature.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]     ` <20170622233619.GC2894-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-23  0:13       ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23  0:13 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: lkp-JC7UmRfGjtg,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file 
> > > capabilities in user namespaces without affecting the file 
> > > capabilities that are effective on the host. This is to prevent 
> > > that any unprivileged user on the host maps his own uid to root 
> > > in a private namespace, writes the xattr, and executes the file
> > > with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a 
> > > different name when a user namespace is used. If for example the 
> > > root user in a user namespace writes the security.capability 
> > > xattr, the name of the xattr that is actually written is encoded 
> > > as security.capability@uid=1000 for root mapped to uid 1000 on 
> > > the host. When listing the xattrs on the host, the existing
> > > security.capability as well as the security.capability@uid=1000 
> > > will be shown. Inside the namespace only 'security.capability', 
> > > with the value of security.capability@uid=1000, is visible.
> > 
> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
> > this capability but am dynamically mapping the namespaces (i.e. I 
> > know I want unprivileged root, but I'm going to dynamically select 
> > the range to map based on what's currently available on the 
> > orchestration system).  If we stick with the @uid=X suffix, then 
> > dynamic mapping won't work because X is potentially different each 
> > time and there'll be a name mismatch in my xattrs.  Why not just 
> > make the suffix @uid, which means if root is mapped to any 
> > unprivileged uid then we pick this up otherwise we go with the
> > unsuffixed property?
> > 
> > As far as I can see there's no real advantage to discriminating 
> > userns specific xattrs based on where root is mapped to, unless 
> > there's a use case I'm missing?
> 
> Yes, the use case is: to allow root in the container to set the
> privilege itself, without endangering any resources not owned by
> that root.

OK, so you envisage the same filesystem being mounted in different user
namespaces and being able to see their own value for the xattr.  It
still seems a bit weird that they'd be able to change file contents and
have that seen by the other userns but not xattrs.

>   If you're going to have a root owned host-wide
> orchestration system setting up the rootfs, then you don't
> necessary need this at all.

I wasn't thinking it would be root owned, just that it would have a
predefined range of allowed uids and be able to map multiple containers
to subsets of these.

> As you say a @uid to say "any unprivileged userns" might be useful.
> The implication is that root on the host doesn't trust the image
> enough to write a real global file capability, but trusts it enough
> to 'endanger' all containers on the host.  If that's the case, I have
> no objection to adding this as a feature.

Yes, precisely.  The filesystem is certified as permitted to override
the xattr whatever unprivileged mapping for root is in place.

How would we effect the switch?  I suppose some global flag because I
can't see we'd be mixing use cases in a physical system.

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 23:36     ` Serge E. Hallyn
  (?)
@ 2017-06-23  0:13       ` James Bottomley
  -1 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23  0:13 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: zohar, containers, linux-kernel, xiaolong.ye,
	linux-security-module, ebiederm, lkp

On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file 
> > > capabilities in user namespaces without affecting the file 
> > > capabilities that are effective on the host. This is to prevent 
> > > that any unprivileged user on the host maps his own uid to root 
> > > in a private namespace, writes the xattr, and executes the file
> > > with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a 
> > > different name when a user namespace is used. If for example the 
> > > root user in a user namespace writes the security.capability 
> > > xattr, the name of the xattr that is actually written is encoded 
> > > as security.capability@uid=1000 for root mapped to uid 1000 on 
> > > the host. When listing the xattrs on the host, the existing
> > > security.capability as well as the security.capability@uid=1000 
> > > will be shown. Inside the namespace only 'security.capability', 
> > > with the value of security.capability@uid=1000, is visible.
> > 
> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
> > this capability but am dynamically mapping the namespaces (i.e. I 
> > know I want unprivileged root, but I'm going to dynamically select 
> > the range to map based on what's currently available on the 
> > orchestration system).  If we stick with the @uid=X suffix, then 
> > dynamic mapping won't work because X is potentially different each 
> > time and there'll be a name mismatch in my xattrs.  Why not just 
> > make the suffix @uid, which means if root is mapped to any 
> > unprivileged uid then we pick this up otherwise we go with the
> > unsuffixed property?
> > 
> > As far as I can see there's no real advantage to discriminating 
> > userns specific xattrs based on where root is mapped to, unless 
> > there's a use case I'm missing?
> 
> Yes, the use case is: to allow root in the container to set the
> privilege itself, without endangering any resources not owned by
> that root.

OK, so you envisage the same filesystem being mounted in different user
namespaces and being able to see their own value for the xattr.  It
still seems a bit weird that they'd be able to change file contents and
have that seen by the other userns but not xattrs.

>   If you're going to have a root owned host-wide
> orchestration system setting up the rootfs, then you don't
> necessary need this at all.

I wasn't thinking it would be root owned, just that it would have a
predefined range of allowed uids and be able to map multiple containers
to subsets of these.

> As you say a @uid to say "any unprivileged userns" might be useful.
> The implication is that root on the host doesn't trust the image
> enough to write a real global file capability, but trusts it enough
> to 'endanger' all containers on the host.  If that's the case, I have
> no objection to adding this as a feature.

Yes, precisely.  The filesystem is certified as permitted to override
the xattr whatever unprivileged mapping for root is in place.

How would we effect the switch?  I suppose some global flag because I
can't see we'd be mixing use cases in a physical system.

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23  0:13       ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23  0:13 UTC (permalink / raw)
  To: linux-security-module

On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file 
> > > capabilities in user namespaces without affecting the file 
> > > capabilities that are effective on the host. This is to prevent 
> > > that any unprivileged user on the host maps his own uid to root 
> > > in a private namespace, writes the xattr, and executes the file
> > > with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a 
> > > different name when a user namespace is used. If for example the 
> > > root user in a user namespace writes the security.capability 
> > > xattr, the name of the xattr that is actually written is encoded 
> > > as security.capability at uid=1000 for root mapped to uid 1000 on 
> > > the host. When listing the xattrs on the host, the existing
> > > security.capability as well as the security.capability at uid=1000 
> > > will be shown. Inside the namespace only 'security.capability', 
> > > with the value of security.capability at uid=1000, is visible.
> > 
> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
> > this capability but am dynamically mapping the namespaces (i.e. I 
> > know I want unprivileged root, but I'm going to dynamically select 
> > the range to map based on what's currently available on the 
> > orchestration system).  If we stick with the @uid=X suffix, then 
> > dynamic mapping won't work because X is potentially different each 
> > time and there'll be a name mismatch in my xattrs.  Why not just 
> > make the suffix @uid, which means if root is mapped to any 
> > unprivileged uid then we pick this up otherwise we go with the
> > unsuffixed property?
> > 
> > As far as I can see there's no real advantage to discriminating 
> > userns specific xattrs based on where root is mapped to, unless 
> > there's a use case I'm missing?
> 
> Yes, the use case is: to allow root in the container to set the
> privilege itself, without endangering any resources not owned by
> that root.

OK, so you envisage the same filesystem being mounted in different user
namespaces and being able to see their own value for the xattr.  It
still seems a bit weird that they'd be able to change file contents and
have that seen by the other userns but not xattrs.

>   If you're going to have a root owned host-wide
> orchestration system setting up the rootfs, then you don't
> necessary need this at all.

I wasn't thinking it would be root owned, just that it would have a
predefined range of allowed uids and be able to map multiple containers
to subsets of these.

> As you say a @uid to say "any unprivileged userns" might be useful.
> The implication is that root on the host doesn't trust the image
> enough to write a real global file capability, but trusts it enough
> to 'endanger' all containers on the host.  If that's the case, I have
> no objection to adding this as a feature.

Yes, precisely.  The filesystem is certified as permitted to override
the xattr whatever unprivileged mapping for root is in place.

How would we effect the switch?  I suppose some global flag because I
can't see we'd be mixing use cases in a physical system.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23  0:13       ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23  0:13 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3404 bytes --]

On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> Quoting James Bottomley (James.Bottomley(a)HansenPartnership.com):
> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file 
> > > capabilities in user namespaces without affecting the file 
> > > capabilities that are effective on the host. This is to prevent 
> > > that any unprivileged user on the host maps his own uid to root 
> > > in a private namespace, writes the xattr, and executes the file
> > > with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a 
> > > different name when a user namespace is used. If for example the 
> > > root user in a user namespace writes the security.capability 
> > > xattr, the name of the xattr that is actually written is encoded 
> > > as security.capability(a)uid=1000 for root mapped to uid 1000 on 
> > > the host. When listing the xattrs on the host, the existing
> > > security.capability as well as the security.capability(a)uid=1000 
> > > will be shown. Inside the namespace only 'security.capability', 
> > > with the value of security.capability(a)uid=1000, is visible.
> > 
> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
> > this capability but am dynamically mapping the namespaces (i.e. I 
> > know I want unprivileged root, but I'm going to dynamically select 
> > the range to map based on what's currently available on the 
> > orchestration system).  If we stick with the @uid=X suffix, then 
> > dynamic mapping won't work because X is potentially different each 
> > time and there'll be a name mismatch in my xattrs.  Why not just 
> > make the suffix @uid, which means if root is mapped to any 
> > unprivileged uid then we pick this up otherwise we go with the
> > unsuffixed property?
> > 
> > As far as I can see there's no real advantage to discriminating 
> > userns specific xattrs based on where root is mapped to, unless 
> > there's a use case I'm missing?
> 
> Yes, the use case is: to allow root in the container to set the
> privilege itself, without endangering any resources not owned by
> that root.

OK, so you envisage the same filesystem being mounted in different user
namespaces and being able to see their own value for the xattr.  It
still seems a bit weird that they'd be able to change file contents and
have that seen by the other userns but not xattrs.

>   If you're going to have a root owned host-wide
> orchestration system setting up the rootfs, then you don't
> necessary need this at all.

I wasn't thinking it would be root owned, just that it would have a
predefined range of allowed uids and be able to map multiple containers
to subsets of these.

> As you say a @uid to say "any unprivileged userns" might be useful.
> The implication is that root on the host doesn't trust the image
> enough to write a real global file capability, but trusts it enough
> to 'endanger' all containers on the host.  If that's the case, I have
> no objection to adding this as a feature.

Yes, precisely.  The filesystem is certified as permitted to override
the xattr whatever unprivileged mapping for root is in place.

How would we effect the switch?  I suppose some global flag because I
can't see we'd be mixing use cases in a physical system.

James


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]       ` <1498176787.7636.11.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-06-23  1:19         ` Serge E. Hallyn
  2017-06-23 17:37         ` Eric W. Biederman
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23  1:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: lkp-JC7UmRfGjtg,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> > Yes, the use case is: to allow root in the container to set the
> > privilege itself, without endangering any resources not owned by
> > that root.
> 
> OK, so you envisage the same filesystem being mounted in different user
> namespaces

Well no - in lxd we have a separate filesystem for each container.
The filesystems are not shared.

>  and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

Not sure what you mean.  If they have privilege over the inode, they
can write a xattr targeted at their own root userid.

> >   If you're going to have a root owned host-wide
> > orchestration system setting up the rootfs, then you don't
> > necessary need this at all.
> 
> I wasn't thinking it would be root owned, just that it would have a
> predefined range of allowed uids and be able to map multiple containers
> to subsets of these.

Hm. In that case they should not be allowed to write your proposed
'security.capability@uid' capability, because that would also grant
capabilities over subuids which they were not delegated.

(but see below)

> > As you say a @uid to say "any unprivileged userns" might be useful.
> > The implication is that root on the host doesn't trust the image
> > enough to write a real global file capability, but trusts it enough
> > to 'endanger' all containers on the host.  If that's the case, I have
> > no objection to adding this as a feature.
> 
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
> 
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

I might be confused.  But thought CAP_SETFCAP against init_user_ns would
be required to set 'security.capability@uid'.  That, or you could create
a user namespace mapping [ 1 - 4294967295 ] to [ 0 = 4294967294 ], and
have CAP_SETFCAP against that namespace.  Which would allow you to run
without host root privilege.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  0:13       ` James Bottomley
@ 2017-06-23  1:19         ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23  1:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, zohar, containers, linux-kernel, xiaolong.ye,
	linux-security-module, ebiederm, lkp

Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> > Yes, the use case is: to allow root in the container to set the
> > privilege itself, without endangering any resources not owned by
> > that root.
> 
> OK, so you envisage the same filesystem being mounted in different user
> namespaces

Well no - in lxd we have a separate filesystem for each container.
The filesystems are not shared.

>  and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

Not sure what you mean.  If they have privilege over the inode, they
can write a xattr targeted at their own root userid.

> >   If you're going to have a root owned host-wide
> > orchestration system setting up the rootfs, then you don't
> > necessary need this at all.
> 
> I wasn't thinking it would be root owned, just that it would have a
> predefined range of allowed uids and be able to map multiple containers
> to subsets of these.

Hm. In that case they should not be allowed to write your proposed
'security.capability@uid' capability, because that would also grant
capabilities over subuids which they were not delegated.

(but see below)

> > As you say a @uid to say "any unprivileged userns" might be useful.
> > The implication is that root on the host doesn't trust the image
> > enough to write a real global file capability, but trusts it enough
> > to 'endanger' all containers on the host.  If that's the case, I have
> > no objection to adding this as a feature.
> 
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
> 
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

I might be confused.  But thought CAP_SETFCAP against init_user_ns would
be required to set 'security.capability@uid'.  That, or you could create
a user namespace mapping [ 1 - 4294967295 ] to [ 0 = 4294967294 ], and
have CAP_SETFCAP against that namespace.  Which would allow you to run
without host root privilege.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23  1:19         ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23  1:19 UTC (permalink / raw)
  To: linux-security-module

Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
> > Yes, the use case is: to allow root in the container to set the
> > privilege itself, without endangering any resources not owned by
> > that root.
> 
> OK, so you envisage the same filesystem being mounted in different user
> namespaces

Well no - in lxd we have a separate filesystem for each container.
The filesystems are not shared.

>  and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

Not sure what you mean.  If they have privilege over the inode, they
can write a xattr targeted at their own root userid.

> >   If you're going to have a root owned host-wide
> > orchestration system setting up the rootfs, then you don't
> > necessary need this at all.
> 
> I wasn't thinking it would be root owned, just that it would have a
> predefined range of allowed uids and be able to map multiple containers
> to subsets of these.

Hm. In that case they should not be allowed to write your proposed
'security.capability at uid' capability, because that would also grant
capabilities over subuids which they were not delegated.

(but see below)

> > As you say a @uid to say "any unprivileged userns" might be useful.
> > The implication is that root on the host doesn't trust the image
> > enough to write a real global file capability, but trusts it enough
> > to 'endanger' all containers on the host.  If that's the case, I have
> > no objection to adding this as a feature.
> 
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
> 
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

I might be confused.  But thought CAP_SETFCAP against init_user_ns would
be required to set 'security.capability at uid'.  That, or you could create
a user namespace mapping [ 1 - 4294967295 ] to [ 0 = 4294967294 ], and
have CAP_SETFCAP against that namespace.  Which would allow you to run
without host root privilege.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 18:59 ` Stefan Berger
  (?)
@ 2017-06-23  7:01     ` Amir Goldstein
  -1 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-23  7:01 UTC (permalink / raw)
  To: Stefan Berger
  Cc: lkp-JC7UmRfGjtg, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, Mimi Zohar

On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
<stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.
>

Am I the only one who thinks that suffix is perhaps not the best grammar
to use for this namespace?
xattrs are clearly namespaced by prefix, so it seems right to me to keep
it that way - define a new special xattr namespace "ns" and only if that
prefix exists, the @uid suffix will be parsed.
This could be either  ns.security.capability@uid=1000 or
ns@uid=1000.security.capability. The latter seems more correct to me,
because then we will be able to namespace any xattr without having to
protect from "unprivileged xattr injection", i.e.:
setfattr -n "user.whatever.foo@uid=0"

Amir.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23  7:01     ` Amir Goldstein
  0 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-23  7:01 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Serge E. Hallyn, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List

On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
<stefanb@linux.vnet.ibm.com> wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.
>

Am I the only one who thinks that suffix is perhaps not the best grammar
to use for this namespace?
xattrs are clearly namespaced by prefix, so it seems right to me to keep
it that way - define a new special xattr namespace "ns" and only if that
prefix exists, the @uid suffix will be parsed.
This could be either  ns.security.capability@uid=1000 or
ns@uid=1000.security.capability. The latter seems more correct to me,
because then we will be able to namespace any xattr without having to
protect from "unprivileged xattr injection", i.e.:
setfattr -n "user.whatever.foo@uid=0"

Amir.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23  7:01     ` Amir Goldstein
  0 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-23  7:01 UTC (permalink / raw)
  To: linux-security-module

On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
<stefanb@linux.vnet.ibm.com> wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
>
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability at uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability at uid=1000, is visible.
>

Am I the only one who thinks that suffix is perhaps not the best grammar
to use for this namespace?
xattrs are clearly namespaced by prefix, so it seems right to me to keep
it that way - define a new special xattr namespace "ns" and only if that
prefix exists, the @uid suffix will be parsed.
This could be either  ns.security.capability at uid=1000 or
ns at uid=1000.security.capability. The latter seems more correct to me,
because then we will be able to namespace any xattr without having to
protect from "unprivileged xattr injection", i.e.:
setfattr -n "user.whatever.foo at uid=0"

Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  7:01     ` Amir Goldstein
  (?)
  (?)
@ 2017-06-23 16:00     ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:00 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, lkp-JC7UmRfGjtg

Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?

You're the only one to have mentioned it so far.

> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability@uid=1000 or
> ns@uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo@uid=0"

I like it for simplifying the parser code.  One concern I have is that,
since ns.* is currently not gated, one could write ns.* on an older
kernel and then exploit it on a newer one.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  7:01     ` Amir Goldstein
@ 2017-06-23 16:00       ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:00 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Stefan Berger, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Serge E. Hallyn,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List

Quoting Amir Goldstein (amir73il@gmail.com):
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb@linux.vnet.ibm.com> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?

You're the only one to have mentioned it so far.

> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability@uid=1000 or
> ns@uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo@uid=0"

I like it for simplifying the parser code.  One concern I have is that,
since ns.* is currently not gated, one could write ns.* on an older
kernel and then exploit it on a newer one.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:00       ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:00 UTC (permalink / raw)
  To: linux-security-module

Quoting Amir Goldstein (amir73il at gmail.com):
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb@linux.vnet.ibm.com> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability at uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability at uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?

You're the only one to have mentioned it so far.

> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability at uid=1000 or
> ns at uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo at uid=0"

I like it for simplifying the parser code.  One concern I have is that,
since ns.* is currently not gated, one could write ns.* on an older
kernel and then exploit it on a newer one.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:00       ` Serge E. Hallyn
  (?)
  (?)
@ 2017-06-23 16:16           ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:16 UTC (permalink / raw)
  To: Serge E. Hallyn, Amir Goldstein
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, lkp-JC7UmRfGjtg

On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability@uid=1000, is visible.
>>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
> You're the only one to have mentioned it so far.
>
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability@uid=1000 or
>> ns@uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo@uid=0"
> I like it for simplifying the parser code.  One concern I have is that,
> since ns.* is currently not gated, one could write ns.* on an older
> kernel and then exploit it on a newer one.

security.ns.capability@uid=1000, then?

Or maybe just security.ns.capability, taking James' comment into account.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:16           ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:16 UTC (permalink / raw)
  To: Serge E. Hallyn, Amir Goldstein
  Cc: Stefan Berger, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List

On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> Quoting Amir Goldstein (amir73il@gmail.com):
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb@linux.vnet.ibm.com> wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability@uid=1000, is visible.
>>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
> You're the only one to have mentioned it so far.
>
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability@uid=1000 or
>> ns@uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo@uid=0"
> I like it for simplifying the parser code.  One concern I have is that,
> since ns.* is currently not gated, one could write ns.* on an older
> kernel and then exploit it on a newer one.

security.ns.capability@uid=1000, then?

Or maybe just security.ns.capability, taking James' comment into account.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:16           ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:16 UTC (permalink / raw)
  To: linux-security-module

On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> Quoting Amir Goldstein (amir73il at gmail.com):
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb@linux.vnet.ibm.com> wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability at uid=1000, is visible.
>>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
> You're the only one to have mentioned it so far.
>
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability at uid=1000 or
>> ns at uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo at uid=0"
> I like it for simplifying the parser code.  One concern I have is that,
> since ns.* is currently not gated, one could write ns.* on an older
> kernel and then exploit it on a newer one.

security.ns.capability at uid=1000, then?

Or maybe just security.ns.capability, taking James' comment into account.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:16           ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:16 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2386 bytes --]

On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> Quoting Amir Goldstein (amir73il(a)gmail.com):
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb@linux.vnet.ibm.com> wrote:
>>> This series of patches primary goal is to enable file capabilities
>>> in user namespaces without affecting the file capabilities that are
>>> effective on the host. This is to prevent that any unprivileged user
>>> on the host maps his own uid to root in a private namespace, writes
>>> the xattr, and executes the file with privilege on the host.
>>>
>>> We achieve this goal by writing extended attributes with a different
>>> name when a user namespace is used. If for example the root user
>>> in a user namespace writes the security.capability xattr, the name
>>> of the xattr that is actually written is encoded as
>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>> When listing the xattrs on the host, the existing security.capability
>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>> namespace only 'security.capability', with the value of
>>> security.capability(a)uid=1000, is visible.
>>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
> You're the only one to have mentioned it so far.
>
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability(a)uid=1000 or
>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo(a)uid=0"
> I like it for simplifying the parser code.  One concern I have is that,
> since ns.* is currently not gated, one could write ns.* on an older
> kernel and then exploit it on a newer one.

security.ns.capability(a)uid=1000, then?

Or maybe just security.ns.capability, taking James' comment into account.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]           ` <aa62373e-7cd6-39dd-2e38-2b6d6dbe18a8-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
@ 2017-06-23 16:30             ` Serge E. Hallyn
  2017-06-23 18:08             ` Stefan Berger
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:30 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	lkp-JC7UmRfGjtg

Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> > Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> >> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>> When listing the xattrs on the host, the existing security.capability
> >>> as well as the security.capability@uid=1000 will be shown. Inside the
> >>> namespace only 'security.capability', with the value of
> >>> security.capability@uid=1000, is visible.
> >>>
> >> Am I the only one who thinks that suffix is perhaps not the best grammar
> >> to use for this namespace?
> > You're the only one to have mentioned it so far.
> >
> >> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >> it that way - define a new special xattr namespace "ns" and only if that
> >> prefix exists, the @uid suffix will be parsed.
> >> This could be either  ns.security.capability@uid=1000 or
> >> ns@uid=1000.security.capability. The latter seems more correct to me,
> >> because then we will be able to namespace any xattr without having to
> >> protect from "unprivileged xattr injection", i.e.:
> >> setfattr -n "user.whatever.foo@uid=0"
> > I like it for simplifying the parser code.  One concern I have is that,
> > since ns.* is currently not gated, one could write ns.* on an older
> > kernel and then exploit it on a newer one.
> 
> security.ns.capability@uid=1000, then?

That loses the advantage of simpler parsing though.  (Really it's not much
of a simplification anyway.)  So I'm not sure what advantage remains.

> Or maybe just security.ns.capability, taking James' comment into account.

That last one may be suitable as an option, useful for his particular
(somewhat barbaric :) use case, but it's not ok for the general solution.

If uid 1000 was delegated the subuids 100000-199999, it should be able
to write a file capability for use by his subuids, but that file capability
must not apply to other subuids.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:16           ` Casey Schaufler
@ 2017-06-23 16:30             ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:30 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Serge E. Hallyn, Amir Goldstein, Stefan Berger,
	Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, James Bottomley,
	christian.brauner, Vivek Goyal, LSM List

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> > Quoting Amir Goldstein (amir73il@gmail.com):
> >> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >> <stefanb@linux.vnet.ibm.com> wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>> When listing the xattrs on the host, the existing security.capability
> >>> as well as the security.capability@uid=1000 will be shown. Inside the
> >>> namespace only 'security.capability', with the value of
> >>> security.capability@uid=1000, is visible.
> >>>
> >> Am I the only one who thinks that suffix is perhaps not the best grammar
> >> to use for this namespace?
> > You're the only one to have mentioned it so far.
> >
> >> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >> it that way - define a new special xattr namespace "ns" and only if that
> >> prefix exists, the @uid suffix will be parsed.
> >> This could be either  ns.security.capability@uid=1000 or
> >> ns@uid=1000.security.capability. The latter seems more correct to me,
> >> because then we will be able to namespace any xattr without having to
> >> protect from "unprivileged xattr injection", i.e.:
> >> setfattr -n "user.whatever.foo@uid=0"
> > I like it for simplifying the parser code.  One concern I have is that,
> > since ns.* is currently not gated, one could write ns.* on an older
> > kernel and then exploit it on a newer one.
> 
> security.ns.capability@uid=1000, then?

That loses the advantage of simpler parsing though.  (Really it's not much
of a simplification anyway.)  So I'm not sure what advantage remains.

> Or maybe just security.ns.capability, taking James' comment into account.

That last one may be suitable as an option, useful for his particular
(somewhat barbaric :) use case, but it's not ok for the general solution.

If uid 1000 was delegated the subuids 100000-199999, it should be able
to write a file capability for use by his subuids, but that file capability
must not apply to other subuids.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:30             ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 16:30 UTC (permalink / raw)
  To: linux-security-module

Quoting Casey Schaufler (casey at schaufler-ca.com):
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> > Quoting Amir Goldstein (amir73il at gmail.com):
> >> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >> <stefanb@linux.vnet.ibm.com> wrote:
> >>> This series of patches primary goal is to enable file capabilities
> >>> in user namespaces without affecting the file capabilities that are
> >>> effective on the host. This is to prevent that any unprivileged user
> >>> on the host maps his own uid to root in a private namespace, writes
> >>> the xattr, and executes the file with privilege on the host.
> >>>
> >>> We achieve this goal by writing extended attributes with a different
> >>> name when a user namespace is used. If for example the root user
> >>> in a user namespace writes the security.capability xattr, the name
> >>> of the xattr that is actually written is encoded as
> >>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> >>> When listing the xattrs on the host, the existing security.capability
> >>> as well as the security.capability at uid=1000 will be shown. Inside the
> >>> namespace only 'security.capability', with the value of
> >>> security.capability at uid=1000, is visible.
> >>>
> >> Am I the only one who thinks that suffix is perhaps not the best grammar
> >> to use for this namespace?
> > You're the only one to have mentioned it so far.
> >
> >> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >> it that way - define a new special xattr namespace "ns" and only if that
> >> prefix exists, the @uid suffix will be parsed.
> >> This could be either  ns.security.capability at uid=1000 or
> >> ns at uid=1000.security.capability. The latter seems more correct to me,
> >> because then we will be able to namespace any xattr without having to
> >> protect from "unprivileged xattr injection", i.e.:
> >> setfattr -n "user.whatever.foo at uid=0"
> > I like it for simplifying the parser code.  One concern I have is that,
> > since ns.* is currently not gated, one could write ns.* on an older
> > kernel and then exploit it on a newer one.
> 
> security.ns.capability at uid=1000, then?

That loses the advantage of simpler parsing though.  (Really it's not much
of a simplification anyway.)  So I'm not sure what advantage remains.

> Or maybe just security.ns.capability, taking James' comment into account.

That last one may be suitable as an option, useful for his particular
(somewhat barbaric :) use case, but it's not ok for the general solution.

If uid 1000 was delegated the subuids 100000-199999, it should be able
to write a file capability for use by his subuids, but that file capability
must not apply to other subuids.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]             ` <20170623163030.GA18820-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-23 16:53               ` Casey Schaufler
  2017-06-23 17:07               ` James Bottomley
  1 sibling, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:53 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	lkp-JC7UmRfGjtg

On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>> When listing the xattrs on the host, the existing security.capability
>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>> namespace only 'security.capability', with the value of
>>>>> security.capability@uid=1000, is visible.
>>>>>
>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>> to use for this namespace?
>>> You're the only one to have mentioned it so far.
>>>
>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>> prefix exists, the @uid suffix will be parsed.
>>>> This could be either  ns.security.capability@uid=1000 or
>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>> because then we will be able to namespace any xattr without having to
>>>> protect from "unprivileged xattr injection", i.e.:
>>>> setfattr -n "user.whatever.foo@uid=0"
>>> I like it for simplifying the parser code.  One concern I have is that,
>>> since ns.* is currently not gated, one could write ns.* on an older
>>> kernel and then exploit it on a newer one.
>> security.ns.capability@uid=1000, then?
> That loses the advantage of simpler parsing though.  (Really it's not much
> of a simplification anyway.)  So I'm not sure what advantage remains.
>
>> Or maybe just security.ns.capability, taking James' comment into account.
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general solution.

security.ns@uid=100.capability

It makes the namespace part explicit and separate from
the rest of the attribute name. It also generalizes for
other attributes.

security.ns@uid=1000@smack=WestOfOne.SMACK64

> If uid 1000 was delegated the subuids 100000-199999, it should be able
> to write a file capability for use by his subuids, but that file capability
> must not apply to other subuids.
>
> -serge
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:30             ` Serge E. Hallyn
  (?)
@ 2017-06-23 16:53               ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:53 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Amir Goldstein, Stefan Berger, Eric W. Biederman,
	Linux Containers, lkp, xiaolong.ye, linux-kernel, Mimi Zohar,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List

On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey@schaufler-ca.com):
>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>> Quoting Amir Goldstein (amir73il@gmail.com):
>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>> When listing the xattrs on the host, the existing security.capability
>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>> namespace only 'security.capability', with the value of
>>>>> security.capability@uid=1000, is visible.
>>>>>
>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>> to use for this namespace?
>>> You're the only one to have mentioned it so far.
>>>
>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>> prefix exists, the @uid suffix will be parsed.
>>>> This could be either  ns.security.capability@uid=1000 or
>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>> because then we will be able to namespace any xattr without having to
>>>> protect from "unprivileged xattr injection", i.e.:
>>>> setfattr -n "user.whatever.foo@uid=0"
>>> I like it for simplifying the parser code.  One concern I have is that,
>>> since ns.* is currently not gated, one could write ns.* on an older
>>> kernel and then exploit it on a newer one.
>> security.ns.capability@uid=1000, then?
> That loses the advantage of simpler parsing though.  (Really it's not much
> of a simplification anyway.)  So I'm not sure what advantage remains.
>
>> Or maybe just security.ns.capability, taking James' comment into account.
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general solution.

security.ns@uid=100.capability

It makes the namespace part explicit and separate from
the rest of the attribute name. It also generalizes for
other attributes.

security.ns@uid=1000@smack=WestOfOne.SMACK64

> If uid 1000 was delegated the subuids 100000-199999, it should be able
> to write a file capability for use by his subuids, but that file capability
> must not apply to other subuids.
>
> -serge
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:53               ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:53 UTC (permalink / raw)
  To: linux-security-module

On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey at schaufler-ca.com):
>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>> Quoting Amir Goldstein (amir73il at gmail.com):
>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>>> When listing the xattrs on the host, the existing security.capability
>>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>>> namespace only 'security.capability', with the value of
>>>>> security.capability at uid=1000, is visible.
>>>>>
>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>> to use for this namespace?
>>> You're the only one to have mentioned it so far.
>>>
>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>> prefix exists, the @uid suffix will be parsed.
>>>> This could be either  ns.security.capability at uid=1000 or
>>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>>> because then we will be able to namespace any xattr without having to
>>>> protect from "unprivileged xattr injection", i.e.:
>>>> setfattr -n "user.whatever.foo at uid=0"
>>> I like it for simplifying the parser code.  One concern I have is that,
>>> since ns.* is currently not gated, one could write ns.* on an older
>>> kernel and then exploit it on a newer one.
>> security.ns.capability at uid=1000, then?
> That loses the advantage of simpler parsing though.  (Really it's not much
> of a simplification anyway.)  So I'm not sure what advantage remains.
>
>> Or maybe just security.ns.capability, taking James' comment into account.
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general solution.

security.ns at uid=100.capability

It makes the namespace part explicit and separate from
the rest of the attribute name. It also generalizes for
other attributes.

security.ns at uid=1000 at smack=WestOfOne.SMACK64

> If uid 1000 was delegated the subuids 100000-199999, it should be able
> to write a file capability for use by his subuids, but that file capability
> must not apply to other subuids.
>
> -serge
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 16:53               ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 16:53 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3069 bytes --]

On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey(a)schaufler-ca.com):
>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>> Quoting Amir Goldstein (amir73il(a)gmail.com):
>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>> This series of patches primary goal is to enable file capabilities
>>>>> in user namespaces without affecting the file capabilities that are
>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>> the xattr, and executes the file with privilege on the host.
>>>>>
>>>>> We achieve this goal by writing extended attributes with a different
>>>>> name when a user namespace is used. If for example the root user
>>>>> in a user namespace writes the security.capability xattr, the name
>>>>> of the xattr that is actually written is encoded as
>>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>>> When listing the xattrs on the host, the existing security.capability
>>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>>> namespace only 'security.capability', with the value of
>>>>> security.capability(a)uid=1000, is visible.
>>>>>
>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>> to use for this namespace?
>>> You're the only one to have mentioned it so far.
>>>
>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>> prefix exists, the @uid suffix will be parsed.
>>>> This could be either  ns.security.capability(a)uid=1000 or
>>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>>> because then we will be able to namespace any xattr without having to
>>>> protect from "unprivileged xattr injection", i.e.:
>>>> setfattr -n "user.whatever.foo(a)uid=0"
>>> I like it for simplifying the parser code.  One concern I have is that,
>>> since ns.* is currently not gated, one could write ns.* on an older
>>> kernel and then exploit it on a newer one.
>> security.ns.capability(a)uid=1000, then?
> That loses the advantage of simpler parsing though.  (Really it's not much
> of a simplification anyway.)  So I'm not sure what advantage remains.
>
>> Or maybe just security.ns.capability, taking James' comment into account.
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general solution.

security.ns(a)uid=100.capability

It makes the namespace part explicit and separate from
the rest of the attribute name. It also generalizes for
other attributes.

security.ns(a)uid=1000(a)smack=WestOfOne.SMACK64

> If uid 1000 was delegated the subuids 100000-199999, it should be able
> to write a file capability for use by his subuids, but that file capability
> must not apply to other subuids.
>
> -serge
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]               ` <ef37880d-6baa-12a6-eab1-bcd0a4e94d53-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
@ 2017-06-23 17:01                 ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:01 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	lkp-JC7UmRfGjtg

Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> >> Or maybe just security.ns.capability, taking James' comment into account.
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> 
> security.ns@uid=100.capability

I'm ok with this.  It gives protection from older kernels, and puts
the 'ns@uid=' at predictable locations for security and trusted.

> It makes the namespace part explicit and separate from
> the rest of the attribute name. It also generalizes for
> other attributes.
> 
> security.ns@uid=1000@smack=WestOfOne.SMACK64

Looks good to me.

Do we want to say that '.' ends the attribute list?  That of
course means '.' cannot be in the attributes.  Perhaps end
with '@@' instead?  Just a thought.

What do others think?

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:53               ` Casey Schaufler
@ 2017-06-23 17:01                 ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:01 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Serge E. Hallyn, Amir Goldstein, Stefan Berger,
	Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, James Bottomley,
	christian.brauner, Vivek Goyal, LSM List

Quoting Casey Schaufler (casey@schaufler-ca.com):
> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> >> Or maybe just security.ns.capability, taking James' comment into account.
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> 
> security.ns@uid=100.capability

I'm ok with this.  It gives protection from older kernels, and puts
the 'ns@uid=' at predictable locations for security and trusted.

> It makes the namespace part explicit and separate from
> the rest of the attribute name. It also generalizes for
> other attributes.
> 
> security.ns@uid=1000@smack=WestOfOne.SMACK64

Looks good to me.

Do we want to say that '.' ends the attribute list?  That of
course means '.' cannot be in the attributes.  Perhaps end
with '@@' instead?  Just a thought.

What do others think?

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:01                 ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:01 UTC (permalink / raw)
  To: linux-security-module

Quoting Casey Schaufler (casey at schaufler-ca.com):
> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey at schaufler-ca.com):
> >> Or maybe just security.ns.capability, taking James' comment into account.
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> 
> security.ns at uid=100.capability

I'm ok with this.  It gives protection from older kernels, and puts
the 'ns at uid=' at predictable locations for security and trusted.

> It makes the namespace part explicit and separate from
> the rest of the attribute name. It also generalizes for
> other attributes.
> 
> security.ns at uid=1000 at smack=WestOfOne.SMACK64

Looks good to me.

Do we want to say that '.' ends the attribute list?  That of
course means '.' cannot be in the attributes.  Perhaps end
with '@@' instead?  Just a thought.

What do others think?

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]             ` <20170623163030.GA18820-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
  2017-06-23 16:53               ` Casey Schaufler
@ 2017-06-23 17:07               ` James Bottomley
  1 sibling, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23 17:07 UTC (permalink / raw)
  To: Serge E. Hallyn, Casey Schaufler
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, LSM List, Eric W. Biederman, lkp-JC7UmRfGjtg

On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> > Or maybe just security.ns.capability, taking James' comment into
> > account.
> 
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general
> solution.
> 
> If uid 1000 was delegated the subuids 100000-199999, it should be 
> able to write a file capability for use by his subuids, but that file
> capability must not apply to other subuids.

I don't think it's barbaric, I think it's the common use case.  Let me
give a more comprehensible answer in terms of docker and IMA.  Lets
suppose I'm running docker locally and in a test cloud both with userns
enabled.

I build an image locally, mapping my uid (1000) to root.  If I begin
with a standard base, each of the files has a security.ima signature. 
 Now I add my layer, which involves updating a file, so I need to write
a new signature to security.ima.  Because I'm running user namespaced,
the update gets written at security.ima@uid=1000 when I do a docker
save. 

Now supposing I deploy that image to a cloud.  As a tenant, the cloud
gives me real uid 4531 and maps that to root.  Execution of the binary
fails because it tries to use the underlying signature (in
security.ima) as there is no xattr named security.ima@uid=4531

So my essential point is that building the real kuid into the permanent
record of the xattr damages image portability, which is touted as one
of the real advantages of container images.

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:30             ` Serge E. Hallyn
  (?)
@ 2017-06-23 17:07               ` James Bottomley
  -1 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23 17:07 UTC (permalink / raw)
  To: Serge E. Hallyn, Casey Schaufler
  Cc: Amir Goldstein, Stefan Berger, Eric W. Biederman,
	Linux Containers, lkp, xiaolong.ye, linux-kernel, Mimi Zohar,
	Tycho Andersen, christian.brauner, Vivek Goyal, LSM List

On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey@schaufler-ca.com):
> > Or maybe just security.ns.capability, taking James' comment into
> > account.
> 
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general
> solution.
> 
> If uid 1000 was delegated the subuids 100000-199999, it should be 
> able to write a file capability for use by his subuids, but that file
> capability must not apply to other subuids.

I don't think it's barbaric, I think it's the common use case.  Let me
give a more comprehensible answer in terms of docker and IMA.  Lets
suppose I'm running docker locally and in a test cloud both with userns
enabled.

I build an image locally, mapping my uid (1000) to root.  If I begin
with a standard base, each of the files has a security.ima signature. 
 Now I add my layer, which involves updating a file, so I need to write
a new signature to security.ima.  Because I'm running user namespaced,
the update gets written at security.ima@uid=1000 when I do a docker
save. 

Now supposing I deploy that image to a cloud.  As a tenant, the cloud
gives me real uid 4531 and maps that to root.  Execution of the binary
fails because it tries to use the underlying signature (in
security.ima) as there is no xattr named security.ima@uid=4531

So my essential point is that building the real kuid into the permanent
record of the xattr damages image portability, which is touted as one
of the real advantages of container images.

James

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:07               ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23 17:07 UTC (permalink / raw)
  To: linux-security-module

On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey at schaufler-ca.com):
> > Or maybe just security.ns.capability, taking James' comment into
> > account.
> 
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general
> solution.
> 
> If uid 1000 was delegated the subuids 100000-199999, it should be 
> able to write a file capability for use by his subuids, but that file
> capability must not apply to other subuids.

I don't think it's barbaric, I think it's the common use case.  Let me
give a more comprehensible answer in terms of docker and IMA.  Lets
suppose I'm running docker locally and in a test cloud both with userns
enabled.

I build an image locally, mapping my uid (1000) to root.  If I begin
with a standard base, each of the files has a security.ima signature. 
 Now I add my layer, which involves updating a file, so I need to write
a new signature to security.ima.  Because I'm running user namespaced,
the update gets written at security.ima at uid=1000 when I do a docker
save. 

Now supposing I deploy that image to a cloud.  As a tenant, the cloud
gives me real uid 4531 and maps that to root.  Execution of the binary
fails because it tries to use the underlying signature (in
security.ima) as there is no xattr named security.ima at uid=4531

So my essential point is that building the real kuid into the permanent
record of the xattr damages image portability, which is touted as one
of the real advantages of container images.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:07               ` James Bottomley
  0 siblings, 0 replies; 180+ messages in thread
From: James Bottomley @ 2017-06-23 17:07 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 1616 bytes --]

On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> Quoting Casey Schaufler (casey(a)schaufler-ca.com):
> > Or maybe just security.ns.capability, taking James' comment into
> > account.
> 
> That last one may be suitable as an option, useful for his particular
> (somewhat barbaric :) use case, but it's not ok for the general
> solution.
> 
> If uid 1000 was delegated the subuids 100000-199999, it should be 
> able to write a file capability for use by his subuids, but that file
> capability must not apply to other subuids.

I don't think it's barbaric, I think it's the common use case.  Let me
give a more comprehensible answer in terms of docker and IMA.  Lets
suppose I'm running docker locally and in a test cloud both with userns
enabled.

I build an image locally, mapping my uid (1000) to root.  If I begin
with a standard base, each of the files has a security.ima signature. 
 Now I add my layer, which involves updating a file, so I need to write
a new signature to security.ima.  Because I'm running user namespaced,
the update gets written at security.ima(a)uid=1000 when I do a docker
save. 

Now supposing I deploy that image to a cloud.  As a tenant, the cloud
gives me real uid 4531 and maps that to root.  Execution of the binary
fails because it tries to use the underlying signature (in
security.ima) as there is no xattr named security.ima(a)uid=4531

So my essential point is that building the real kuid into the permanent
record of the xattr damages image portability, which is touted as one
of the real advantages of container images.

James


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]               ` <1498237641.3641.15.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-06-23 17:18                 ` Aleksa Sarai
       [not found]                   ` <b57803da-0e8b-594d-901b-12eb509f04b5-l3A5Bk7waGM@public.gmane.org>
  2017-06-23 17:20                   ` Serge E. Hallyn
  2017-06-23 17:38                 ` Stefan Berger
  2 siblings, 1 reply; 180+ messages in thread
From: Aleksa Sarai @ 2017-06-23 17:18 UTC (permalink / raw)
  To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

>>> Or maybe just security.ns.capability, taking James' comment into
>>> account.
>>
>> That last one may be suitable as an option, useful for his particular
>> (somewhat barbaric :) use case, but it's not ok for the general
>> solution.
>>
>> If uid 1000 was delegated the subuids 100000-199999, it should be
>> able to write a file capability for use by his subuids, but that file
>> capability must not apply to other subuids.

[As an aside, I think the delegation model of sub[ug]id is not the right 
way of thinking about these features. The properties of a file 
capability should be considered based on the user namespace it is being 
evaluated in, not who set the capability or delegated the [ug]ids. I 
might be misunderstanding the problem though.]

> I don't think it's barbaric, I think it's the common use case.  Let me
> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
> 
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature.
>   Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima@uid=1000 when I do a docker
> save.
> 
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima@uid=4531
> 
> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.

Definitely. This is somewhat of a similar issue to shiftfs (that the 
{uid,gid} being hardcoded in images makes it very difficult to make user 
namespaced containers work sanely).

Since OCI images are all basically a set of tar archives with JSON 
manifests, to make things like this work nicely for container images it 
just has to avoid hardcoding kuid (or even userspace uids) into 
attributes. It is possible for us to write special-case translation code 
for images, but that will cause the same issues that shiftfs is trying 
to resolve.

-- 
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:07               ` James Bottomley
  (?)
@ 2017-06-23 17:20                   ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, LSM List, Eric W. Biederman, Casey Schaufler,
	lkp-JC7UmRfGjtg

Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> > > Or maybe just security.ns.capability, taking James' comment into
> > > account.
> > 
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general
> > solution.
> > 
> > If uid 1000 was delegated the subuids 100000-199999, it should be 
> > able to write a file capability for use by his subuids, but that file
> > capability must not apply to other subuids.
> 
> I don't think it's barbaric, I think it's the common use case.  Let me

:)  sorry.  Yes, it is the common case, and even lxd does it that way.
But lxc itself does not, and while there are shortcomings (including
this one, file capabilities) which require 'barbaric' use of privilege
to set things up in some cases, I prefer we not get complacent and accept
it as proper.

> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
> 
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature. 
>  Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima@uid=1000 when I do a docker
> save. 
> 
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima@uid=4531

In this example, how do you, if you do, shift the owner of the file
into the mapped user namespace?  Or are you happy to have the file owned
by an invalid user nobody?  (There certainly are cases where that would
be ok, but I suspect you're shifting the file)

> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.

'container images' aren't portable in that sense now - for at least
many cases - because you have to shift the uid.  However you're doing
that, you may be able to shift the xattr the same way.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:20                   ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, Casey Schaufler, Amir Goldstein, Stefan Berger,
	Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, christian.brauner,
	Vivek Goyal, LSM List

Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> > > Or maybe just security.ns.capability, taking James' comment into
> > > account.
> > 
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general
> > solution.
> > 
> > If uid 1000 was delegated the subuids 100000-199999, it should be 
> > able to write a file capability for use by his subuids, but that file
> > capability must not apply to other subuids.
> 
> I don't think it's barbaric, I think it's the common use case.  Let me

:)  sorry.  Yes, it is the common case, and even lxd does it that way.
But lxc itself does not, and while there are shortcomings (including
this one, file capabilities) which require 'barbaric' use of privilege
to set things up in some cases, I prefer we not get complacent and accept
it as proper.

> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
> 
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature. 
>  Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima@uid=1000 when I do a docker
> save. 
> 
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima@uid=4531

In this example, how do you, if you do, shift the owner of the file
into the mapped user namespace?  Or are you happy to have the file owned
by an invalid user nobody?  (There certainly are cases where that would
be ok, but I suspect you're shifting the file)

> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.

'container images' aren't portable in that sense now - for at least
many cases - because you have to shift the uid.  However you're doing
that, you may be able to shift the xattr the same way.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:20                   ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 17:20 UTC (permalink / raw)
  To: linux-security-module

Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> > Quoting Casey Schaufler (casey at schaufler-ca.com):
> > > Or maybe just security.ns.capability, taking James' comment into
> > > account.
> > 
> > That last one may be suitable as an option, useful for his particular
> > (somewhat barbaric :) use case, but it's not ok for the general
> > solution.
> > 
> > If uid 1000 was delegated the subuids 100000-199999, it should be 
> > able to write a file capability for use by his subuids, but that file
> > capability must not apply to other subuids.
> 
> I don't think it's barbaric, I think it's the common use case.  Let me

:)  sorry.  Yes, it is the common case, and even lxd does it that way.
But lxc itself does not, and while there are shortcomings (including
this one, file capabilities) which require 'barbaric' use of privilege
to set things up in some cases, I prefer we not get complacent and accept
it as proper.

> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
> 
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature. 
>  Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima at uid=1000 when I do a docker
> save. 
> 
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima at uid=4531

In this example, how do you, if you do, shift the owner of the file
into the mapped user namespace?  Or are you happy to have the file owned
by an invalid user nobody?  (There certainly are cases where that would
be ok, but I suspect you're shifting the file)

> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.

'container images' aren't portable in that sense now - for at least
many cases - because you have to shift the uid.  However you're doing
that, you may be able to shift the xattr the same way.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                   ` <20170623172016.GA19551-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-23 17:28                     ` Aleksa Sarai
       [not found]                       ` <553a72c4-eda9-52d6-2ae2-f8687c0c7c70-l3A5Bk7waGM@public.gmane.org>
  0 siblings, 1 reply; 180+ messages in thread
From: Aleksa Sarai @ 2017-06-23 17:28 UTC (permalink / raw)
  To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

>> So my essential point is that building the real kuid into the permanent
>> record of the xattr damages image portability, which is touted as one
>> of the real advantages of container images.
> 
> 'container images' aren't portable in that sense now - for at least
> many cases - because you have to shift the uid.  However you're doing
> that, you may be able to shift the xattr the same way.

Piling more things on top of that issue isn't going to make the issue 
easier to solve IMO. Would shiftfs or shift-bindmounts also have to do 
translation of arbitrary xattrs? Plus I would think that handling xattrs 
would be harder than {u,g}ids because the image unpacker now has to be 
aware of all xattrs that require remapping (Which might be an 
ever-growing list).

The user namespace incompatibility with the VFS's hard-coding of 
k{u,g}id values in inodes is an issue that we really shouldn't be 
encouraging IMO [especially given how hard it's been so far to solve 
that problem.]

Just my $.02.

-- 
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]       ` <1498176787.7636.11.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-06-23  1:19         ` Serge E. Hallyn
@ 2017-06-23 17:37         ` Eric W. Biederman
  1 sibling, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:37 UTC (permalink / raw)
  To: James Bottomley
  Cc: lkp-JC7UmRfGjtg,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org> writes:

> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
>> Quoting James Bottomley (James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org):
>> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> > > This series of patches primary goal is to enable file 
>> > > capabilities in user namespaces without affecting the file 
>> > > capabilities that are effective on the host. This is to prevent 
>> > > that any unprivileged user on the host maps his own uid to root 
>> > > in a private namespace, writes the xattr, and executes the file
>> > > with privilege on the host.
>> > > 
>> > > We achieve this goal by writing extended attributes with a 
>> > > different name when a user namespace is used. If for example the 
>> > > root user in a user namespace writes the security.capability 
>> > > xattr, the name of the xattr that is actually written is encoded 
>> > > as security.capability@uid=1000 for root mapped to uid 1000 on 
>> > > the host. When listing the xattrs on the host, the existing
>> > > security.capability as well as the security.capability@uid=1000 
>> > > will be shown. Inside the namespace only 'security.capability', 
>> > > with the value of security.capability@uid=1000, is visible.
>> > 
>> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
>> > this capability but am dynamically mapping the namespaces (i.e. I 
>> > know I want unprivileged root, but I'm going to dynamically select 
>> > the range to map based on what's currently available on the 
>> > orchestration system).  If we stick with the @uid=X suffix, then 
>> > dynamic mapping won't work because X is potentially different each 
>> > time and there'll be a name mismatch in my xattrs.  Why not just 
>> > make the suffix @uid, which means if root is mapped to any 
>> > unprivileged uid then we pick this up otherwise we go with the
>> > unsuffixed property?
>> > 
>> > As far as I can see there's no real advantage to discriminating 
>> > userns specific xattrs based on where root is mapped to, unless 
>> > there's a use case I'm missing?
>> 
>> Yes, the use case is: to allow root in the container to set the
>> privilege itself, without endangering any resources not owned by
>> that root.
>
> OK, so you envisage the same filesystem being mounted in different user
> namespaces and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

When you dynamically talk about selecting a range based what is
currently available in an orchestration system I don't know exactly what
you mean.  If it is something like what adduser does, assigning a
container a persistent association with uids and gids, that makes sense
to me.  If it is picking an association just for the lifetime of the
conainer processes it makes me nervous.

Fundamentally storage is persistent and writing data into it is
persistent.

Which means that when dealing with storage we need to make things safe
by default and not depend upon an assumption that the container tools
carefully keeps files separate from each other.

From previous conversations I am happy with and generally expect only a
capability xattr per file.

Even with one xattr of any type there is something appealing about
putting the logic that limits that xattr to a namespace in the name.  As
that is trivially backwards compatible.  As that does not require reving
the on disk file format based upon containers.

>> As you say a @uid to say "any unprivileged userns" might be useful.
>> The implication is that root on the host doesn't trust the image
>> enough to write a real global file capability, but trusts it enough
>> to 'endanger' all containers on the host.  If that's the case, I have
>> no objection to adding this as a feature.
>
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
>
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

Mixing use cases in a filesystem almost always happens.  At least if we
are talking an ordinary multi-user system.  Multi-user systems are rarer
than they once were because machines are cheap, and security is hard,
but that should be what we are designing for.  Anything else is just
asking for trouble.

James when you talk about a global flag and mixing use cases in a
physical system it sounds a lot like you are talking about a base
filesystem for shiftfs.

My gut feel is that if this gets down to something like the shiftfs use
case.  I would assume either everything is shifted slightly so that all
uids are say shifted by 100,000 even the capability names of the
capability xattrs.  So that shiftfs or some part of the vfs would need
to shift the names of the xattrs as well.

Certainly I expect filesystems that are mounted with s_user_ns !=
&init_user_ns to be shifting the names of the security xattrs when
queried from &init_user_ns if we go with general design.

Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  0:13       ` James Bottomley
  (?)
@ 2017-06-23 17:37         ` Eric W. Biederman
  -1 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:37 UTC (permalink / raw)
  To: James Bottomley
  Cc: Serge E. Hallyn, zohar, containers, linux-kernel, xiaolong.ye,
	linux-security-module, lkp

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
>> Quoting James Bottomley (James.Bottomley@HansenPartnership.com):
>> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> > > This series of patches primary goal is to enable file 
>> > > capabilities in user namespaces without affecting the file 
>> > > capabilities that are effective on the host. This is to prevent 
>> > > that any unprivileged user on the host maps his own uid to root 
>> > > in a private namespace, writes the xattr, and executes the file
>> > > with privilege on the host.
>> > > 
>> > > We achieve this goal by writing extended attributes with a 
>> > > different name when a user namespace is used. If for example the 
>> > > root user in a user namespace writes the security.capability 
>> > > xattr, the name of the xattr that is actually written is encoded 
>> > > as security.capability@uid=1000 for root mapped to uid 1000 on 
>> > > the host. When listing the xattrs on the host, the existing
>> > > security.capability as well as the security.capability@uid=1000 
>> > > will be shown. Inside the namespace only 'security.capability', 
>> > > with the value of security.capability@uid=1000, is visible.
>> > 
>> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
>> > this capability but am dynamically mapping the namespaces (i.e. I 
>> > know I want unprivileged root, but I'm going to dynamically select 
>> > the range to map based on what's currently available on the 
>> > orchestration system).  If we stick with the @uid=X suffix, then 
>> > dynamic mapping won't work because X is potentially different each 
>> > time and there'll be a name mismatch in my xattrs.  Why not just 
>> > make the suffix @uid, which means if root is mapped to any 
>> > unprivileged uid then we pick this up otherwise we go with the
>> > unsuffixed property?
>> > 
>> > As far as I can see there's no real advantage to discriminating 
>> > userns specific xattrs based on where root is mapped to, unless 
>> > there's a use case I'm missing?
>> 
>> Yes, the use case is: to allow root in the container to set the
>> privilege itself, without endangering any resources not owned by
>> that root.
>
> OK, so you envisage the same filesystem being mounted in different user
> namespaces and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

When you dynamically talk about selecting a range based what is
currently available in an orchestration system I don't know exactly what
you mean.  If it is something like what adduser does, assigning a
container a persistent association with uids and gids, that makes sense
to me.  If it is picking an association just for the lifetime of the
conainer processes it makes me nervous.

Fundamentally storage is persistent and writing data into it is
persistent.

Which means that when dealing with storage we need to make things safe
by default and not depend upon an assumption that the container tools
carefully keeps files separate from each other.

>From previous conversations I am happy with and generally expect only a
capability xattr per file.

Even with one xattr of any type there is something appealing about
putting the logic that limits that xattr to a namespace in the name.  As
that is trivially backwards compatible.  As that does not require reving
the on disk file format based upon containers.

>> As you say a @uid to say "any unprivileged userns" might be useful.
>> The implication is that root on the host doesn't trust the image
>> enough to write a real global file capability, but trusts it enough
>> to 'endanger' all containers on the host.  If that's the case, I have
>> no objection to adding this as a feature.
>
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
>
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

Mixing use cases in a filesystem almost always happens.  At least if we
are talking an ordinary multi-user system.  Multi-user systems are rarer
than they once were because machines are cheap, and security is hard,
but that should be what we are designing for.  Anything else is just
asking for trouble.

James when you talk about a global flag and mixing use cases in a
physical system it sounds a lot like you are talking about a base
filesystem for shiftfs.

My gut feel is that if this gets down to something like the shiftfs use
case.  I would assume either everything is shifted slightly so that all
uids are say shifted by 100,000 even the capability names of the
capability xattrs.  So that shiftfs or some part of the vfs would need
to shift the names of the xattrs as well.

Certainly I expect filesystems that are mounted with s_user_ns !=
&init_user_ns to be shifting the names of the security xattrs when
queried from &init_user_ns if we go with general design.

Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:37         ` Eric W. Biederman
  0 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:37 UTC (permalink / raw)
  To: linux-security-module

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
>> Quoting James Bottomley (James.Bottomley at HansenPartnership.com):
>> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> > > This series of patches primary goal is to enable file 
>> > > capabilities in user namespaces without affecting the file 
>> > > capabilities that are effective on the host. This is to prevent 
>> > > that any unprivileged user on the host maps his own uid to root 
>> > > in a private namespace, writes the xattr, and executes the file
>> > > with privilege on the host.
>> > > 
>> > > We achieve this goal by writing extended attributes with a 
>> > > different name when a user namespace is used. If for example the 
>> > > root user in a user namespace writes the security.capability 
>> > > xattr, the name of the xattr that is actually written is encoded 
>> > > as security.capability at uid=1000 for root mapped to uid 1000 on 
>> > > the host. When listing the xattrs on the host, the existing
>> > > security.capability as well as the security.capability at uid=1000 
>> > > will be shown. Inside the namespace only 'security.capability', 
>> > > with the value of security.capability at uid=1000, is visible.
>> > 
>> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
>> > this capability but am dynamically mapping the namespaces (i.e. I 
>> > know I want unprivileged root, but I'm going to dynamically select 
>> > the range to map based on what's currently available on the 
>> > orchestration system).  If we stick with the @uid=X suffix, then 
>> > dynamic mapping won't work because X is potentially different each 
>> > time and there'll be a name mismatch in my xattrs.  Why not just 
>> > make the suffix @uid, which means if root is mapped to any 
>> > unprivileged uid then we pick this up otherwise we go with the
>> > unsuffixed property?
>> > 
>> > As far as I can see there's no real advantage to discriminating 
>> > userns specific xattrs based on where root is mapped to, unless 
>> > there's a use case I'm missing?
>> 
>> Yes, the use case is: to allow root in the container to set the
>> privilege itself, without endangering any resources not owned by
>> that root.
>
> OK, so you envisage the same filesystem being mounted in different user
> namespaces and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

When you dynamically talk about selecting a range based what is
currently available in an orchestration system I don't know exactly what
you mean.  If it is something like what adduser does, assigning a
container a persistent association with uids and gids, that makes sense
to me.  If it is picking an association just for the lifetime of the
conainer processes it makes me nervous.

Fundamentally storage is persistent and writing data into it is
persistent.

Which means that when dealing with storage we need to make things safe
by default and not depend upon an assumption that the container tools
carefully keeps files separate from each other.

>From previous conversations I am happy with and generally expect only a
capability xattr per file.

Even with one xattr of any type there is something appealing about
putting the logic that limits that xattr to a namespace in the name.  As
that is trivially backwards compatible.  As that does not require reving
the on disk file format based upon containers.

>> As you say a @uid to say "any unprivileged userns" might be useful.
>> The implication is that root on the host doesn't trust the image
>> enough to write a real global file capability, but trusts it enough
>> to 'endanger' all containers on the host.  If that's the case, I have
>> no objection to adding this as a feature.
>
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
>
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

Mixing use cases in a filesystem almost always happens.  At least if we
are talking an ordinary multi-user system.  Multi-user systems are rarer
than they once were because machines are cheap, and security is hard,
but that should be what we are designing for.  Anything else is just
asking for trouble.

James when you talk about a global flag and mixing use cases in a
physical system it sounds a lot like you are talking about a base
filesystem for shiftfs.

My gut feel is that if this gets down to something like the shiftfs use
case.  I would assume either everything is shifted slightly so that all
uids are say shifted by 100,000 even the capability names of the
capability xattrs.  So that shiftfs or some part of the vfs would need
to shift the names of the xattrs as well.

Certainly I expect filesystems that are mounted with s_user_ns !=
&init_user_ns to be shifting the names of the security xattrs when
queried from &init_user_ns if we go with general design.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:37         ` Eric W. Biederman
  0 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:37 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 5252 bytes --]

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> On Thu, 2017-06-22 at 18:36 -0500, Serge E. Hallyn wrote:
>> Quoting James Bottomley (James.Bottomley(a)HansenPartnership.com):
>> > On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> > > This series of patches primary goal is to enable file 
>> > > capabilities in user namespaces without affecting the file 
>> > > capabilities that are effective on the host. This is to prevent 
>> > > that any unprivileged user on the host maps his own uid to root 
>> > > in a private namespace, writes the xattr, and executes the file
>> > > with privilege on the host.
>> > > 
>> > > We achieve this goal by writing extended attributes with a 
>> > > different name when a user namespace is used. If for example the 
>> > > root user in a user namespace writes the security.capability 
>> > > xattr, the name of the xattr that is actually written is encoded 
>> > > as security.capability(a)uid=1000 for root mapped to uid 1000 on 
>> > > the host. When listing the xattrs on the host, the existing
>> > > security.capability as well as the security.capability(a)uid=1000 
>> > > will be shown. Inside the namespace only 'security.capability', 
>> > > with the value of security.capability(a)uid=1000, is visible.
>> > 
>> > I'm a bit bothered by the @uid=1000 suffix.  What if I want to use 
>> > this capability but am dynamically mapping the namespaces (i.e. I 
>> > know I want unprivileged root, but I'm going to dynamically select 
>> > the range to map based on what's currently available on the 
>> > orchestration system).  If we stick with the @uid=X suffix, then 
>> > dynamic mapping won't work because X is potentially different each 
>> > time and there'll be a name mismatch in my xattrs.  Why not just 
>> > make the suffix @uid, which means if root is mapped to any 
>> > unprivileged uid then we pick this up otherwise we go with the
>> > unsuffixed property?
>> > 
>> > As far as I can see there's no real advantage to discriminating 
>> > userns specific xattrs based on where root is mapped to, unless 
>> > there's a use case I'm missing?
>> 
>> Yes, the use case is: to allow root in the container to set the
>> privilege itself, without endangering any resources not owned by
>> that root.
>
> OK, so you envisage the same filesystem being mounted in different user
> namespaces and being able to see their own value for the xattr.  It
> still seems a bit weird that they'd be able to change file contents and
> have that seen by the other userns but not xattrs.

When you dynamically talk about selecting a range based what is
currently available in an orchestration system I don't know exactly what
you mean.  If it is something like what adduser does, assigning a
container a persistent association with uids and gids, that makes sense
to me.  If it is picking an association just for the lifetime of the
conainer processes it makes me nervous.

Fundamentally storage is persistent and writing data into it is
persistent.

Which means that when dealing with storage we need to make things safe
by default and not depend upon an assumption that the container tools
carefully keeps files separate from each other.

>From previous conversations I am happy with and generally expect only a
capability xattr per file.

Even with one xattr of any type there is something appealing about
putting the logic that limits that xattr to a namespace in the name.  As
that is trivially backwards compatible.  As that does not require reving
the on disk file format based upon containers.

>> As you say a @uid to say "any unprivileged userns" might be useful.
>> The implication is that root on the host doesn't trust the image
>> enough to write a real global file capability, but trusts it enough
>> to 'endanger' all containers on the host.  If that's the case, I have
>> no objection to adding this as a feature.
>
> Yes, precisely.  The filesystem is certified as permitted to override
> the xattr whatever unprivileged mapping for root is in place.
>
> How would we effect the switch?  I suppose some global flag because I
> can't see we'd be mixing use cases in a physical system.

Mixing use cases in a filesystem almost always happens.  At least if we
are talking an ordinary multi-user system.  Multi-user systems are rarer
than they once were because machines are cheap, and security is hard,
but that should be what we are designing for.  Anything else is just
asking for trouble.

James when you talk about a global flag and mixing use cases in a
physical system it sounds a lot like you are talking about a base
filesystem for shiftfs.

My gut feel is that if this gets down to something like the shiftfs use
case.  I would assume either everything is shifted slightly so that all
uids are say shifted by 100,000 even the capability names of the
capability xattrs.  So that shiftfs or some part of the vfs would need
to shift the names of the xattrs as well.

Certainly I expect filesystems that are mounted with s_user_ns !=
&init_user_ns to be shifting the names of the security xattrs when
queried from &init_user_ns if we go with general design.

Eric


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]               ` <1498237641.3641.15.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-06-23 17:18                 ` Aleksa Sarai
  2017-06-23 17:20                   ` Serge E. Hallyn
@ 2017-06-23 17:38                 ` Stefan Berger
  2 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 17:38 UTC (permalink / raw)
  To: James Bottomley, Serge E. Hallyn, Casey Schaufler
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, LSM List, Eric W. Biederman,
	lkp-JC7UmRfGjtg

On 06/23/2017 01:07 PM, James Bottomley wrote:
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
>> Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
>>> Or maybe just security.ns.capability, taking James' comment into
>>> account.
>> That last one may be suitable as an option, useful for his particular
>> (somewhat barbaric :) use case, but it's not ok for the general
>> solution.
>>
>> If uid 1000 was delegated the subuids 100000-199999, it should be
>> able to write a file capability for use by his subuids, but that file
>> capability must not apply to other subuids.
> I don't think it's barbaric, I think it's the common use case.  Let me
> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
>
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature.
>   Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima@uid=1000 when I do a docker
> save.
>
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima@uid=4531

Yes. An answer would be to have Docker rewrite these on the fly. It 
knows what uid the container was running as and specifically looks for 
security.ima@uid=1000 or security.ima, takes the former if it finds, 
otherwise the latter or nothing.

    Stefan

>
> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.
>
> James
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:07               ` James Bottomley
  (?)
@ 2017-06-23 17:38                 ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 17:38 UTC (permalink / raw)
  To: James Bottomley, Serge E. Hallyn, Casey Schaufler
  Cc: Amir Goldstein, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	christian.brauner, Vivek Goyal, LSM List

On 06/23/2017 01:07 PM, James Bottomley wrote:
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
>> Quoting Casey Schaufler (casey@schaufler-ca.com):
>>> Or maybe just security.ns.capability, taking James' comment into
>>> account.
>> That last one may be suitable as an option, useful for his particular
>> (somewhat barbaric :) use case, but it's not ok for the general
>> solution.
>>
>> If uid 1000 was delegated the subuids 100000-199999, it should be
>> able to write a file capability for use by his subuids, but that file
>> capability must not apply to other subuids.
> I don't think it's barbaric, I think it's the common use case.  Let me
> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
>
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature.
>   Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima@uid=1000 when I do a docker
> save.
>
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima@uid=4531

Yes. An answer would be to have Docker rewrite these on the fly. It 
knows what uid the container was running as and specifically looks for 
security.ima@uid=1000 or security.ima, takes the former if it finds, 
otherwise the latter or nothing.

    Stefan

>
> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.
>
> James
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:38                 ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 17:38 UTC (permalink / raw)
  To: linux-security-module

On 06/23/2017 01:07 PM, James Bottomley wrote:
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
>> Quoting Casey Schaufler (casey at schaufler-ca.com):
>>> Or maybe just security.ns.capability, taking James' comment into
>>> account.
>> That last one may be suitable as an option, useful for his particular
>> (somewhat barbaric :) use case, but it's not ok for the general
>> solution.
>>
>> If uid 1000 was delegated the subuids 100000-199999, it should be
>> able to write a file capability for use by his subuids, but that file
>> capability must not apply to other subuids.
> I don't think it's barbaric, I think it's the common use case.  Let me
> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
>
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature.
>   Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima at uid=1000 when I do a docker
> save.
>
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima at uid=4531

Yes. An answer would be to have Docker rewrite these on the fly. It 
knows what uid the container was running as and specifically looks for 
security.ima at uid=1000 or security.ima, takes the former if it finds, 
otherwise the latter or nothing.

    Stefan

>
> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.
>
> James
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:38                 ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 17:38 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 1976 bytes --]

On 06/23/2017 01:07 PM, James Bottomley wrote:
> On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
>> Quoting Casey Schaufler (casey(a)schaufler-ca.com):
>>> Or maybe just security.ns.capability, taking James' comment into
>>> account.
>> That last one may be suitable as an option, useful for his particular
>> (somewhat barbaric :) use case, but it's not ok for the general
>> solution.
>>
>> If uid 1000 was delegated the subuids 100000-199999, it should be
>> able to write a file capability for use by his subuids, but that file
>> capability must not apply to other subuids.
> I don't think it's barbaric, I think it's the common use case.  Let me
> give a more comprehensible answer in terms of docker and IMA.  Lets
> suppose I'm running docker locally and in a test cloud both with userns
> enabled.
>
> I build an image locally, mapping my uid (1000) to root.  If I begin
> with a standard base, each of the files has a security.ima signature.
>   Now I add my layer, which involves updating a file, so I need to write
> a new signature to security.ima.  Because I'm running user namespaced,
> the update gets written at security.ima(a)uid=1000 when I do a docker
> save.
>
> Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> gives me real uid 4531 and maps that to root.  Execution of the binary
> fails because it tries to use the underlying signature (in
> security.ima) as there is no xattr named security.ima(a)uid=4531

Yes. An answer would be to have Docker rewrite these on the fly. It 
knows what uid the container was running as and specifically looks for 
security.ima(a)uid=1000 or security.ima, takes the former if it finds, 
otherwise the latter or nothing.

    Stefan

>
> So my essential point is that building the real kuid into the permanent
> record of the xattr damages image portability, which is touted as one
> of the real advantages of container images.
>
> James
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                 ` <20170623170108.GA19354-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-23 17:49                   ` Eric W. Biederman
  0 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:49 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Casey Schaufler,
	lkp-JC7UmRfGjtg

"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
>> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
>> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
>> >> Or maybe just security.ns.capability, taking James' comment into account.
>> > That last one may be suitable as an option, useful for his particular
>> > (somewhat barbaric :) use case, but it's not ok for the general solution.
>> 
>> security.ns@uid=100.capability
>
> I'm ok with this.  It gives protection from older kernels, and puts
> the 'ns@uid=' at predictable locations for security and trusted.
>
>> It makes the namespace part explicit and separate from
>> the rest of the attribute name. It also generalizes for
>> other attributes.
>> 
>> security.ns@uid=1000@smack=WestOfOne.SMACK64
>
> Looks good to me.
>
> Do we want to say that '.' ends the attribute list?  That of
> course means '.' cannot be in the attributes.  Perhaps end
> with '@@' instead?  Just a thought.
>
> What do others think?

I think we have two things that will limit the allowed attributes
severely.

1) We need to the names of all of the xattrs when mounting a filesystem
   with s_user_ns != &init_user_ns.  I haven't yet checked the patches
   to see if they do this properly.

2) Names of xattrs are not fully general and filesystems perform various
   tricks to encode them more densely.  We should see what the games
   with xattr names do to how densely xattrs can be stored on disk.
   That matters.

Putting the uid of the root user in the name sounds fundamental to doing
things this way.  I am not at all certain about putting smack labels and
generally treating this as something we can add two arbitrarily.

If nothing else this reminds me of the frequent problem in
certifications with ouids.  Arbitrary attributes tend to defeat parsers
in a security context on a regular basis.  Even the kernel command line
parser has seen problems in this area, and it isn't security sensitive
most of the time.

Extensibility is good in the abstract long term sense.  Extensibility in
the here and now where we don't even know which attributes we are
talking about scares me.  I don't see how we can possibily know with
multiple attributes which xattrs will be the one to use.  As we won't
even know which properties of the kernel to look at to match attributes.

So while I don't mind reorganizing the order we put the information into
the attribute.  Let's keep what we place in there very specific.

Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:01                 ` Serge E. Hallyn
  (?)
@ 2017-06-23 17:49                   ` Eric W. Biederman
  -1 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:49 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Casey Schaufler, Amir Goldstein, Stefan Berger, Linux Containers,
	lkp, xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List

"Serge E. Hallyn" <serge@hallyn.com> writes:

> Quoting Casey Schaufler (casey@schaufler-ca.com):
>> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
>> > Quoting Casey Schaufler (casey@schaufler-ca.com):
>> >> Or maybe just security.ns.capability, taking James' comment into account.
>> > That last one may be suitable as an option, useful for his particular
>> > (somewhat barbaric :) use case, but it's not ok for the general solution.
>> 
>> security.ns@uid=100.capability
>
> I'm ok with this.  It gives protection from older kernels, and puts
> the 'ns@uid=' at predictable locations for security and trusted.
>
>> It makes the namespace part explicit and separate from
>> the rest of the attribute name. It also generalizes for
>> other attributes.
>> 
>> security.ns@uid=1000@smack=WestOfOne.SMACK64
>
> Looks good to me.
>
> Do we want to say that '.' ends the attribute list?  That of
> course means '.' cannot be in the attributes.  Perhaps end
> with '@@' instead?  Just a thought.
>
> What do others think?

I think we have two things that will limit the allowed attributes
severely.

1) We need to the names of all of the xattrs when mounting a filesystem
   with s_user_ns != &init_user_ns.  I haven't yet checked the patches
   to see if they do this properly.

2) Names of xattrs are not fully general and filesystems perform various
   tricks to encode them more densely.  We should see what the games
   with xattr names do to how densely xattrs can be stored on disk.
   That matters.

Putting the uid of the root user in the name sounds fundamental to doing
things this way.  I am not at all certain about putting smack labels and
generally treating this as something we can add two arbitrarily.

If nothing else this reminds me of the frequent problem in
certifications with ouids.  Arbitrary attributes tend to defeat parsers
in a security context on a regular basis.  Even the kernel command line
parser has seen problems in this area, and it isn't security sensitive
most of the time.

Extensibility is good in the abstract long term sense.  Extensibility in
the here and now where we don't even know which attributes we are
talking about scares me.  I don't see how we can possibily know with
multiple attributes which xattrs will be the one to use.  As we won't
even know which properties of the kernel to look at to match attributes.

So while I don't mind reorganizing the order we put the information into
the attribute.  Let's keep what we place in there very specific.

Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:49                   ` Eric W. Biederman
  0 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:49 UTC (permalink / raw)
  To: linux-security-module

"Serge E. Hallyn" <serge@hallyn.com> writes:

> Quoting Casey Schaufler (casey at schaufler-ca.com):
>> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
>> > Quoting Casey Schaufler (casey at schaufler-ca.com):
>> >> Or maybe just security.ns.capability, taking James' comment into account.
>> > That last one may be suitable as an option, useful for his particular
>> > (somewhat barbaric :) use case, but it's not ok for the general solution.
>> 
>> security.ns at uid=100.capability
>
> I'm ok with this.  It gives protection from older kernels, and puts
> the 'ns at uid=' at predictable locations for security and trusted.
>
>> It makes the namespace part explicit and separate from
>> the rest of the attribute name. It also generalizes for
>> other attributes.
>> 
>> security.ns at uid=1000 at smack=WestOfOne.SMACK64
>
> Looks good to me.
>
> Do we want to say that '.' ends the attribute list?  That of
> course means '.' cannot be in the attributes.  Perhaps end
> with '@@' instead?  Just a thought.
>
> What do others think?

I think we have two things that will limit the allowed attributes
severely.

1) We need to the names of all of the xattrs when mounting a filesystem
   with s_user_ns != &init_user_ns.  I haven't yet checked the patches
   to see if they do this properly.

2) Names of xattrs are not fully general and filesystems perform various
   tricks to encode them more densely.  We should see what the games
   with xattr names do to how densely xattrs can be stored on disk.
   That matters.

Putting the uid of the root user in the name sounds fundamental to doing
things this way.  I am not at all certain about putting smack labels and
generally treating this as something we can add two arbitrarily.

If nothing else this reminds me of the frequent problem in
certifications with ouids.  Arbitrary attributes tend to defeat parsers
in a security context on a regular basis.  Even the kernel command line
parser has seen problems in this area, and it isn't security sensitive
most of the time.

Extensibility is good in the abstract long term sense.  Extensibility in
the here and now where we don't even know which attributes we are
talking about scares me.  I don't see how we can possibily know with
multiple attributes which xattrs will be the one to use.  As we won't
even know which properties of the kernel to look at to match attributes.

So while I don't mind reorganizing the order we put the information into
the attribute.  Let's keep what we place in there very specific.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 17:49                   ` Eric W. Biederman
  0 siblings, 0 replies; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-23 17:49 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2577 bytes --]

"Serge E. Hallyn" <serge@hallyn.com> writes:

> Quoting Casey Schaufler (casey(a)schaufler-ca.com):
>> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
>> > Quoting Casey Schaufler (casey(a)schaufler-ca.com):
>> >> Or maybe just security.ns.capability, taking James' comment into account.
>> > That last one may be suitable as an option, useful for his particular
>> > (somewhat barbaric :) use case, but it's not ok for the general solution.
>> 
>> security.ns(a)uid=100.capability
>
> I'm ok with this.  It gives protection from older kernels, and puts
> the 'ns(a)uid=' at predictable locations for security and trusted.
>
>> It makes the namespace part explicit and separate from
>> the rest of the attribute name. It also generalizes for
>> other attributes.
>> 
>> security.ns(a)uid=1000(a)smack=WestOfOne.SMACK64
>
> Looks good to me.
>
> Do we want to say that '.' ends the attribute list?  That of
> course means '.' cannot be in the attributes.  Perhaps end
> with '@@' instead?  Just a thought.
>
> What do others think?

I think we have two things that will limit the allowed attributes
severely.

1) We need to the names of all of the xattrs when mounting a filesystem
   with s_user_ns != &init_user_ns.  I haven't yet checked the patches
   to see if they do this properly.

2) Names of xattrs are not fully general and filesystems perform various
   tricks to encode them more densely.  We should see what the games
   with xattr names do to how densely xattrs can be stored on disk.
   That matters.

Putting the uid of the root user in the name sounds fundamental to doing
things this way.  I am not at all certain about putting smack labels and
generally treating this as something we can add two arbitrarily.

If nothing else this reminds me of the frequent problem in
certifications with ouids.  Arbitrary attributes tend to defeat parsers
in a security context on a regular basis.  Even the kernel command line
parser has seen problems in this area, and it isn't security sensitive
most of the time.

Extensibility is good in the abstract long term sense.  Extensibility in
the here and now where we don't even know which attributes we are
talking about scares me.  I don't see how we can possibily know with
multiple attributes which xattrs will be the one to use.  As we won't
even know which properties of the kernel to look at to match attributes.

So while I don't mind reorganizing the order we put the information into
the attribute.  Let's keep what we place in there very specific.

Eric


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]           ` <aa62373e-7cd6-39dd-2e38-2b6d6dbe18a8-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
  2017-06-23 16:30             ` Serge E. Hallyn
@ 2017-06-23 18:08             ` Stefan Berger
  1 sibling, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 18:08 UTC (permalink / raw)
  To: Casey Schaufler, Serge E. Hallyn, Amir Goldstein
  Cc: lkp-JC7UmRfGjtg, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, Mimi Zohar

On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>> You're the only one to have mentioned it so far.
>>
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability@uid=1000 or
>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo@uid=0"
>> I like it for simplifying the parser code.  One concern I have is that,
>> since ns.* is currently not gated, one could write ns.* on an older
>> kernel and then exploit it on a newer one.
> security.ns.capability@uid=1000, then?

Imo, '.ns' is redundant and 'encoded' in the '@'.

    Stefan

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 16:16           ` Casey Schaufler
  (?)
@ 2017-06-23 18:08             ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 18:08 UTC (permalink / raw)
  To: Casey Schaufler, Serge E. Hallyn, Amir Goldstein
  Cc: Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, James Bottomley,
	christian.brauner, Vivek Goyal, LSM List

On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>> Quoting Amir Goldstein (amir73il@gmail.com):
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>> You're the only one to have mentioned it so far.
>>
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability@uid=1000 or
>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo@uid=0"
>> I like it for simplifying the parser code.  One concern I have is that,
>> since ns.* is currently not gated, one could write ns.* on an older
>> kernel and then exploit it on a newer one.
> security.ns.capability@uid=1000, then?

Imo, '.ns' is redundant and 'encoded' in the '@'.

    Stefan

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:08             ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 18:08 UTC (permalink / raw)
  To: linux-security-module

On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>> Quoting Amir Goldstein (amir73il at gmail.com):
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability at uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>> You're the only one to have mentioned it so far.
>>
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability at uid=1000 or
>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo at uid=0"
>> I like it for simplifying the parser code.  One concern I have is that,
>> since ns.* is currently not gated, one could write ns.* on an older
>> kernel and then exploit it on a newer one.
> security.ns.capability at uid=1000, then?

Imo, '.ns' is redundant and 'encoded' in the '@'.

    Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:08             ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 18:08 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2236 bytes --]

On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>> Quoting Amir Goldstein (amir73il(a)gmail.com):
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability(a)uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>> You're the only one to have mentioned it so far.
>>
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability(a)uid=1000 or
>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo(a)uid=0"
>> I like it for simplifying the parser code.  One concern I have is that,
>> since ns.* is currently not gated, one could write ns.* on an older
>> kernel and then exploit it on a newer one.
> security.ns.capability(a)uid=1000, then?

Imo, '.ns' is redundant and 'encoded' in the '@'.

    Stefan


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                   ` <b57803da-0e8b-594d-901b-12eb509f04b5-l3A5Bk7waGM@public.gmane.org>
@ 2017-06-23 18:22                     ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:22 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Quoting Aleksa Sarai (asarai-l3A5Bk7waGM@public.gmane.org):
> >>>Or maybe just security.ns.capability, taking James' comment into
> >>>account.
> >>
> >>That last one may be suitable as an option, useful for his particular
> >>(somewhat barbaric :) use case, but it's not ok for the general
> >>solution.
> >>
> >>If uid 1000 was delegated the subuids 100000-199999, it should be
> >>able to write a file capability for use by his subuids, but that file
> >>capability must not apply to other subuids.
> 
> [As an aside, I think the delegation model of sub[ug]id is not the
> right way of thinking about these features. The properties of a file
> capability should be considered based on the user namespace it is
> being evaluated in, not who set the capability or delegated the
> [ug]ids. I might be misunderstanding the problem though.]
> 
> >I don't think it's barbaric, I think it's the common use case.  Let me
> >give a more comprehensible answer in terms of docker and IMA.  Lets
> >suppose I'm running docker locally and in a test cloud both with userns
> >enabled.
> >
> >I build an image locally, mapping my uid (1000) to root.  If I begin
> >with a standard base, each of the files has a security.ima signature.
> >  Now I add my layer, which involves updating a file, so I need to write
> >a new signature to security.ima.  Because I'm running user namespaced,
> >the update gets written at security.ima@uid=1000 when I do a docker
> >save.
> >
> >Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> >gives me real uid 4531 and maps that to root.  Execution of the binary
> >fails because it tries to use the underlying signature (in
> >security.ima) as there is no xattr named security.ima@uid=4531
> >
> >So my essential point is that building the real kuid into the permanent
> >record of the xattr damages image portability, which is touted as one
> >of the real advantages of container images.

I didn't address this in my reply to James, but maybe I should have.

'container images' are a higher level marketing slogan^W^W concept.
Not a kernel concept.

The kernel does not have a concept of fluid uids, or fluid kuids, so
demanding that xattrs be fluid in this way seems wrong.

As it is, 'shiftfs' is not upstream, so the more basic problems of
container images or not solved in the way you'd like either.

Certainly we want to make this as useful as possible to container
images, but that only goes to a point.

I'm open to good ideas for solving this issue we're discussing here,
but that will not include allowing an unprivileged user to write
an xattr which can give privilege to containers over which he does
not have privilege.

The core problem is "how do we allow the delegation of this privilege
on persistent disk without that privilege leaking".  As Casey had said
we would ideally be able to refer to a 'container' itself, instead of
the root kuid.  That *would* be better.  But that does not exist.

It also doesn't stand in the way of my use case.  My use case is fully
unprivileged containers.  Where I want to install/upgrade software
using file capabilities.  Currently I cannot do that.  With either this
patch or my own namespaced filecaps patch, I can, safely.  If I want to shift
the container's backing uids, then I have to create transient namespaces in
which do the uid shifting, and I can just as well do the xattr shifting there
as well, just as I have to re-set setuid/setgid etc.

> Definitely. This is somewhat of a similar issue to shiftfs (that the
> {uid,gid} being hardcoded in images makes it very difficult to make
> user namespaced containers work sanely).
> 
> Since OCI images are all basically a set of tar archives with JSON
> manifests, to make things like this work nicely for container images
> it just has to avoid hardcoding kuid (or even userspace uids) into
> attributes. It is possible for us to write special-case translation
> code for images, but that will cause the same issues that shiftfs is
> trying to resolve.

It would add another issue analogous to the ones shiftfs is trying to
solve, so the question is can shifts solve it?  

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                       ` <553a72c4-eda9-52d6-2ae2-f8687c0c7c70-l3A5Bk7waGM@public.gmane.org>
@ 2017-06-23 18:30                         ` Serge E. Hallyn
  2017-06-25 12:35                         ` Eric W. Biederman
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:30 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Quoting Aleksa Sarai (asarai-l3A5Bk7waGM@public.gmane.org):
> >>So my essential point is that building the real kuid into the permanent
> >>record of the xattr damages image portability, which is touted as one
> >>of the real advantages of container images.
> >
> >'container images' aren't portable in that sense now - for at least
> >many cases - because you have to shift the uid.  However you're doing
> >that, you may be able to shift the xattr the same way.
> 
> Piling more things on top of that issue isn't going to make the
> issue easier to solve IMO. Would shiftfs or shift-bindmounts also
> have to do translation of arbitrary xattrs? Plus I would think that
> handling xattrs would be harder than {u,g}ids because the image
> unpacker now has to be aware of all xattrs that require remapping
> (Which might be an ever-growing list).

No, that's actually the point of Stefan's patch (versus mine).  It
should be able to generically map anything that is 'security.foo@uid=100'
or 'trusted.foo@uid=100'.  It doesn't need more intelligence than that.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                   ` <8760fmh9vc.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-06-23 18:32                     ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Casey Schaufler,
	lkp-JC7UmRfGjtg

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> 
> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> >> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> >> > Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> >> >> Or maybe just security.ns.capability, taking James' comment into account.
> >> > That last one may be suitable as an option, useful for his particular
> >> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> >> 
> >> security.ns@uid=100.capability
> >
> > I'm ok with this.  It gives protection from older kernels, and puts
> > the 'ns@uid=' at predictable locations for security and trusted.
> >
> >> It makes the namespace part explicit and separate from
> >> the rest of the attribute name. It also generalizes for
> >> other attributes.
> >> 
> >> security.ns@uid=1000@smack=WestOfOne.SMACK64
> >
> > Looks good to me.
> >
> > Do we want to say that '.' ends the attribute list?  That of
> > course means '.' cannot be in the attributes.  Perhaps end
> > with '@@' instead?  Just a thought.
> >
> > What do others think?
> 
> I think we have two things that will limit the allowed attributes
> severely.
> 
> 1) We need to the names of all of the xattrs when mounting a filesystem
>    with s_user_ns != &init_user_ns.  I haven't yet checked the patches
>    to see if they do this properly.
> 
> 2) Names of xattrs are not fully general and filesystems perform various
>    tricks to encode them more densely.  We should see what the games
>    with xattr names do to how densely xattrs can be stored on disk.
>    That matters.
> 
> Putting the uid of the root user in the name sounds fundamental to doing
> things this way.  I am not at all certain about putting smack labels and
> generally treating this as something we can add two arbitrarily.
> 
> If nothing else this reminds me of the frequent problem in
> certifications with ouids.  Arbitrary attributes tend to defeat parsers
> in a security context on a regular basis.  Even the kernel command line
> parser has seen problems in this area, and it isn't security sensitive
> most of the time.
> 
> Extensibility is good in the abstract long term sense.  Extensibility in
> the here and now where we don't even know which attributes we are
> talking about scares me.  I don't see how we can possibily know with
> multiple attributes which xattrs will be the one to use.  As we won't
> even know which properties of the kernel to look at to match attributes.
> 
> So while I don't mind reorganizing the order we put the information into
> the attribute.  Let's keep what we place in there very specific.

Right.  I'm in favor of making the syntax so that it is, in the future,
if we want it to be, extensible, but we would not be accepting generic
attributes now.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:49                   ` Eric W. Biederman
@ 2017-06-23 18:32                     ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Casey Schaufler, Amir Goldstein, Stefan Berger,
	Linux Containers, lkp, xiaolong.ye, linux-kernel, Mimi Zohar,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List

Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serge@hallyn.com> writes:
> 
> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> >> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> >> > Quoting Casey Schaufler (casey@schaufler-ca.com):
> >> >> Or maybe just security.ns.capability, taking James' comment into account.
> >> > That last one may be suitable as an option, useful for his particular
> >> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> >> 
> >> security.ns@uid=100.capability
> >
> > I'm ok with this.  It gives protection from older kernels, and puts
> > the 'ns@uid=' at predictable locations for security and trusted.
> >
> >> It makes the namespace part explicit and separate from
> >> the rest of the attribute name. It also generalizes for
> >> other attributes.
> >> 
> >> security.ns@uid=1000@smack=WestOfOne.SMACK64
> >
> > Looks good to me.
> >
> > Do we want to say that '.' ends the attribute list?  That of
> > course means '.' cannot be in the attributes.  Perhaps end
> > with '@@' instead?  Just a thought.
> >
> > What do others think?
> 
> I think we have two things that will limit the allowed attributes
> severely.
> 
> 1) We need to the names of all of the xattrs when mounting a filesystem
>    with s_user_ns != &init_user_ns.  I haven't yet checked the patches
>    to see if they do this properly.
> 
> 2) Names of xattrs are not fully general and filesystems perform various
>    tricks to encode them more densely.  We should see what the games
>    with xattr names do to how densely xattrs can be stored on disk.
>    That matters.
> 
> Putting the uid of the root user in the name sounds fundamental to doing
> things this way.  I am not at all certain about putting smack labels and
> generally treating this as something we can add two arbitrarily.
> 
> If nothing else this reminds me of the frequent problem in
> certifications with ouids.  Arbitrary attributes tend to defeat parsers
> in a security context on a regular basis.  Even the kernel command line
> parser has seen problems in this area, and it isn't security sensitive
> most of the time.
> 
> Extensibility is good in the abstract long term sense.  Extensibility in
> the here and now where we don't even know which attributes we are
> talking about scares me.  I don't see how we can possibily know with
> multiple attributes which xattrs will be the one to use.  As we won't
> even know which properties of the kernel to look at to match attributes.
> 
> So while I don't mind reorganizing the order we put the information into
> the attribute.  Let's keep what we place in there very specific.

Right.  I'm in favor of making the syntax so that it is, in the future,
if we want it to be, extensible, but we would not be accepting generic
attributes now.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:32                     ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:32 UTC (permalink / raw)
  To: linux-security-module

Quoting Eric W. Biederman (ebiederm at xmission.com):
> "Serge E. Hallyn" <serge@hallyn.com> writes:
> 
> > Quoting Casey Schaufler (casey at schaufler-ca.com):
> >> On 6/23/2017 9:30 AM, Serge E. Hallyn wrote:
> >> > Quoting Casey Schaufler (casey at schaufler-ca.com):
> >> >> Or maybe just security.ns.capability, taking James' comment into account.
> >> > That last one may be suitable as an option, useful for his particular
> >> > (somewhat barbaric :) use case, but it's not ok for the general solution.
> >> 
> >> security.ns at uid=100.capability
> >
> > I'm ok with this.  It gives protection from older kernels, and puts
> > the 'ns at uid=' at predictable locations for security and trusted.
> >
> >> It makes the namespace part explicit and separate from
> >> the rest of the attribute name. It also generalizes for
> >> other attributes.
> >> 
> >> security.ns at uid=1000 at smack=WestOfOne.SMACK64
> >
> > Looks good to me.
> >
> > Do we want to say that '.' ends the attribute list?  That of
> > course means '.' cannot be in the attributes.  Perhaps end
> > with '@@' instead?  Just a thought.
> >
> > What do others think?
> 
> I think we have two things that will limit the allowed attributes
> severely.
> 
> 1) We need to the names of all of the xattrs when mounting a filesystem
>    with s_user_ns != &init_user_ns.  I haven't yet checked the patches
>    to see if they do this properly.
> 
> 2) Names of xattrs are not fully general and filesystems perform various
>    tricks to encode them more densely.  We should see what the games
>    with xattr names do to how densely xattrs can be stored on disk.
>    That matters.
> 
> Putting the uid of the root user in the name sounds fundamental to doing
> things this way.  I am not at all certain about putting smack labels and
> generally treating this as something we can add two arbitrarily.
> 
> If nothing else this reminds me of the frequent problem in
> certifications with ouids.  Arbitrary attributes tend to defeat parsers
> in a security context on a regular basis.  Even the kernel command line
> parser has seen problems in this area, and it isn't security sensitive
> most of the time.
> 
> Extensibility is good in the abstract long term sense.  Extensibility in
> the here and now where we don't even know which attributes we are
> talking about scares me.  I don't see how we can possibily know with
> multiple attributes which xattrs will be the one to use.  As we won't
> even know which properties of the kernel to look at to match attributes.
> 
> So while I don't mind reorganizing the order we put the information into
> the attribute.  Let's keep what we place in there very specific.

Right.  I'm in favor of making the syntax so that it is, in the future,
if we want it to be, extensible, but we would not be accepting generic
attributes now.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                 ` <d288ea69-adec-f257-30cb-b1d9c57c996b-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2017-06-23 18:34                   ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:34 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	Casey Schaufler, lkp-JC7UmRfGjtg

Quoting Stefan Berger (stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
> On 06/23/2017 01:07 PM, James Bottomley wrote:
> >On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> >>Quoting Casey Schaufler (casey-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org):
> >>>Or maybe just security.ns.capability, taking James' comment into
> >>>account.
> >>That last one may be suitable as an option, useful for his particular
> >>(somewhat barbaric :) use case, but it's not ok for the general
> >>solution.
> >>
> >>If uid 1000 was delegated the subuids 100000-199999, it should be
> >>able to write a file capability for use by his subuids, but that file
> >>capability must not apply to other subuids.
> >I don't think it's barbaric, I think it's the common use case.  Let me
> >give a more comprehensible answer in terms of docker and IMA.  Lets
> >suppose I'm running docker locally and in a test cloud both with userns
> >enabled.
> >
> >I build an image locally, mapping my uid (1000) to root.  If I begin
> >with a standard base, each of the files has a security.ima signature.
> >  Now I add my layer, which involves updating a file, so I need to write
> >a new signature to security.ima.  Because I'm running user namespaced,
> >the update gets written at security.ima@uid=1000 when I do a docker
> >save.
> >
> >Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> >gives me real uid 4531 and maps that to root.  Execution of the binary
> >fails because it tries to use the underlying signature (in
> >security.ima) as there is no xattr named security.ima@uid=4531
> 
> Yes. An answer would be to have Docker rewrite these on the fly. It
> knows what uid the container was running as and specifically looks
> for security.ima@uid=1000 or security.ima, takes the former if it
> finds, otherwise the latter or nothing.

I know many people hate this answer, but I just want to point out that
on my little laptop, while untarring a 500M images takes 9.5 seconds,
remapping all uids and gids and restoring setuid+setgid on that image
takes .01s.

It's high cpu utilization, and it's not zero time, but it's very fast,
and it's 100% safe (when done the right way, not "sudo domychown").

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:38                 ` Stefan Berger
@ 2017-06-23 18:34                   ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:34 UTC (permalink / raw)
  To: Stefan Berger
  Cc: James Bottomley, Serge E. Hallyn, Casey Schaufler,
	Amir Goldstein, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	christian.brauner, Vivek Goyal, LSM List

Quoting Stefan Berger (stefanb@linux.vnet.ibm.com):
> On 06/23/2017 01:07 PM, James Bottomley wrote:
> >On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> >>Quoting Casey Schaufler (casey@schaufler-ca.com):
> >>>Or maybe just security.ns.capability, taking James' comment into
> >>>account.
> >>That last one may be suitable as an option, useful for his particular
> >>(somewhat barbaric :) use case, but it's not ok for the general
> >>solution.
> >>
> >>If uid 1000 was delegated the subuids 100000-199999, it should be
> >>able to write a file capability for use by his subuids, but that file
> >>capability must not apply to other subuids.
> >I don't think it's barbaric, I think it's the common use case.  Let me
> >give a more comprehensible answer in terms of docker and IMA.  Lets
> >suppose I'm running docker locally and in a test cloud both with userns
> >enabled.
> >
> >I build an image locally, mapping my uid (1000) to root.  If I begin
> >with a standard base, each of the files has a security.ima signature.
> >  Now I add my layer, which involves updating a file, so I need to write
> >a new signature to security.ima.  Because I'm running user namespaced,
> >the update gets written at security.ima@uid=1000 when I do a docker
> >save.
> >
> >Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> >gives me real uid 4531 and maps that to root.  Execution of the binary
> >fails because it tries to use the underlying signature (in
> >security.ima) as there is no xattr named security.ima@uid=4531
> 
> Yes. An answer would be to have Docker rewrite these on the fly. It
> knows what uid the container was running as and specifically looks
> for security.ima@uid=1000 or security.ima, takes the former if it
> finds, otherwise the latter or nothing.

I know many people hate this answer, but I just want to point out that
on my little laptop, while untarring a 500M images takes 9.5 seconds,
remapping all uids and gids and restoring setuid+setgid on that image
takes .01s.

It's high cpu utilization, and it's not zero time, but it's very fast,
and it's 100% safe (when done the right way, not "sudo domychown").

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:34                   ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:34 UTC (permalink / raw)
  To: linux-security-module

Quoting Stefan Berger (stefanb at linux.vnet.ibm.com):
> On 06/23/2017 01:07 PM, James Bottomley wrote:
> >On Fri, 2017-06-23 at 11:30 -0500, Serge E. Hallyn wrote:
> >>Quoting Casey Schaufler (casey at schaufler-ca.com):
> >>>Or maybe just security.ns.capability, taking James' comment into
> >>>account.
> >>That last one may be suitable as an option, useful for his particular
> >>(somewhat barbaric :) use case, but it's not ok for the general
> >>solution.
> >>
> >>If uid 1000 was delegated the subuids 100000-199999, it should be
> >>able to write a file capability for use by his subuids, but that file
> >>capability must not apply to other subuids.
> >I don't think it's barbaric, I think it's the common use case.  Let me
> >give a more comprehensible answer in terms of docker and IMA.  Lets
> >suppose I'm running docker locally and in a test cloud both with userns
> >enabled.
> >
> >I build an image locally, mapping my uid (1000) to root.  If I begin
> >with a standard base, each of the files has a security.ima signature.
> >  Now I add my layer, which involves updating a file, so I need to write
> >a new signature to security.ima.  Because I'm running user namespaced,
> >the update gets written at security.ima at uid=1000 when I do a docker
> >save.
> >
> >Now supposing I deploy that image to a cloud.  As a tenant, the cloud
> >gives me real uid 4531 and maps that to root.  Execution of the binary
> >fails because it tries to use the underlying signature (in
> >security.ima) as there is no xattr named security.ima at uid=4531
> 
> Yes. An answer would be to have Docker rewrite these on the fly. It
> knows what uid the container was running as and specifically looks
> for security.ima at uid=1000 or security.ima, takes the former if it
> finds, otherwise the latter or nothing.

I know many people hate this answer, but I just want to point out that
on my little laptop, while untarring a 500M images takes 9.5 seconds,
remapping all uids and gids and restoring setuid+setgid on that image
takes .01s.

It's high cpu utilization, and it's not zero time, but it's very fast,
and it's 100% safe (when done the right way, not "sudo domychown").

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]             ` <3404c486-c848-3283-50f7-2283cb631e8e-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2017-06-23 18:35               ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:35 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	Casey Schaufler, lkp-JC7UmRfGjtg

Quoting Stefan Berger (stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> >On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> >>Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> >>>On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >>><stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> >>>>This series of patches primary goal is to enable file capabilities
> >>>>in user namespaces without affecting the file capabilities that are
> >>>>effective on the host. This is to prevent that any unprivileged user
> >>>>on the host maps his own uid to root in a private namespace, writes
> >>>>the xattr, and executes the file with privilege on the host.
> >>>>
> >>>>We achieve this goal by writing extended attributes with a different
> >>>>name when a user namespace is used. If for example the root user
> >>>>in a user namespace writes the security.capability xattr, the name
> >>>>of the xattr that is actually written is encoded as
> >>>>security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>>>When listing the xattrs on the host, the existing security.capability
> >>>>as well as the security.capability@uid=1000 will be shown. Inside the
> >>>>namespace only 'security.capability', with the value of
> >>>>security.capability@uid=1000, is visible.
> >>>>
> >>>Am I the only one who thinks that suffix is perhaps not the best grammar
> >>>to use for this namespace?
> >>You're the only one to have mentioned it so far.
> >>
> >>>xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >>>it that way - define a new special xattr namespace "ns" and only if that
> >>>prefix exists, the @uid suffix will be parsed.
> >>>This could be either  ns.security.capability@uid=1000 or
> >>>ns@uid=1000.security.capability. The latter seems more correct to me,
> >>>because then we will be able to namespace any xattr without having to
> >>>protect from "unprivileged xattr injection", i.e.:
> >>>setfattr -n "user.whatever.foo@uid=0"
> >>I like it for simplifying the parser code.  One concern I have is that,
> >>since ns.* is currently not gated, one could write ns.* on an older
> >>kernel and then exploit it on a newer one.
> >security.ns.capability@uid=1000, then?
> 
> Imo, '.ns' is redundant and 'encoded' in the '@'.

So how about
	security.@uid=1000@@capability ?

Maybe it's not worth it.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 18:08             ` Stefan Berger
@ 2017-06-23 18:35               ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:35 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Casey Schaufler, Serge E. Hallyn, Amir Goldstein,
	Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, James Bottomley,
	christian.brauner, Vivek Goyal, LSM List

Quoting Stefan Berger (stefanb@linux.vnet.ibm.com):
> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> >On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> >>Quoting Amir Goldstein (amir73il@gmail.com):
> >>>On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >>><stefanb@linux.vnet.ibm.com> wrote:
> >>>>This series of patches primary goal is to enable file capabilities
> >>>>in user namespaces without affecting the file capabilities that are
> >>>>effective on the host. This is to prevent that any unprivileged user
> >>>>on the host maps his own uid to root in a private namespace, writes
> >>>>the xattr, and executes the file with privilege on the host.
> >>>>
> >>>>We achieve this goal by writing extended attributes with a different
> >>>>name when a user namespace is used. If for example the root user
> >>>>in a user namespace writes the security.capability xattr, the name
> >>>>of the xattr that is actually written is encoded as
> >>>>security.capability@uid=1000 for root mapped to uid 1000 on the host.
> >>>>When listing the xattrs on the host, the existing security.capability
> >>>>as well as the security.capability@uid=1000 will be shown. Inside the
> >>>>namespace only 'security.capability', with the value of
> >>>>security.capability@uid=1000, is visible.
> >>>>
> >>>Am I the only one who thinks that suffix is perhaps not the best grammar
> >>>to use for this namespace?
> >>You're the only one to have mentioned it so far.
> >>
> >>>xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >>>it that way - define a new special xattr namespace "ns" and only if that
> >>>prefix exists, the @uid suffix will be parsed.
> >>>This could be either  ns.security.capability@uid=1000 or
> >>>ns@uid=1000.security.capability. The latter seems more correct to me,
> >>>because then we will be able to namespace any xattr without having to
> >>>protect from "unprivileged xattr injection", i.e.:
> >>>setfattr -n "user.whatever.foo@uid=0"
> >>I like it for simplifying the parser code.  One concern I have is that,
> >>since ns.* is currently not gated, one could write ns.* on an older
> >>kernel and then exploit it on a newer one.
> >security.ns.capability@uid=1000, then?
> 
> Imo, '.ns' is redundant and 'encoded' in the '@'.

So how about
	security.@uid=1000@@capability ?

Maybe it's not worth it.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:35               ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:35 UTC (permalink / raw)
  To: linux-security-module

Quoting Stefan Berger (stefanb at linux.vnet.ibm.com):
> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
> >On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
> >>Quoting Amir Goldstein (amir73il at gmail.com):
> >>>On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> >>><stefanb@linux.vnet.ibm.com> wrote:
> >>>>This series of patches primary goal is to enable file capabilities
> >>>>in user namespaces without affecting the file capabilities that are
> >>>>effective on the host. This is to prevent that any unprivileged user
> >>>>on the host maps his own uid to root in a private namespace, writes
> >>>>the xattr, and executes the file with privilege on the host.
> >>>>
> >>>>We achieve this goal by writing extended attributes with a different
> >>>>name when a user namespace is used. If for example the root user
> >>>>in a user namespace writes the security.capability xattr, the name
> >>>>of the xattr that is actually written is encoded as
> >>>>security.capability at uid=1000 for root mapped to uid 1000 on the host.
> >>>>When listing the xattrs on the host, the existing security.capability
> >>>>as well as the security.capability at uid=1000 will be shown. Inside the
> >>>>namespace only 'security.capability', with the value of
> >>>>security.capability at uid=1000, is visible.
> >>>>
> >>>Am I the only one who thinks that suffix is perhaps not the best grammar
> >>>to use for this namespace?
> >>You're the only one to have mentioned it so far.
> >>
> >>>xattrs are clearly namespaced by prefix, so it seems right to me to keep
> >>>it that way - define a new special xattr namespace "ns" and only if that
> >>>prefix exists, the @uid suffix will be parsed.
> >>>This could be either  ns.security.capability at uid=1000 or
> >>>ns at uid=1000.security.capability. The latter seems more correct to me,
> >>>because then we will be able to namespace any xattr without having to
> >>>protect from "unprivileged xattr injection", i.e.:
> >>>setfattr -n "user.whatever.foo at uid=0"
> >>I like it for simplifying the parser code.  One concern I have is that,
> >>since ns.* is currently not gated, one could write ns.* on an older
> >>kernel and then exploit it on a newer one.
> >security.ns.capability at uid=1000, then?
> 
> Imo, '.ns' is redundant and 'encoded' in the '@'.

So how about
	security. at uid=1000@@capability ?

Maybe it's not worth it.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]         ` <87efuaip08.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-06-23 18:39           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: lkp-JC7UmRfGjtg,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Even with one xattr of any type there is something appealing about
> putting the logic that limits that xattr to a namespace in the name.  As

Exactly.  That's the idea - from Stefan - that I thought was a worthwhile
improvement over my own previous patch, which puts the logic in the value.
Most of the complaints raised so far about this patchset are just as valid (or
invalid) against my previous patch, but I was particularly interested in
thoughts on this approach versus mine.

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 17:37         ` Eric W. Biederman
@ 2017-06-23 18:39           ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Bottomley, Serge E. Hallyn, zohar, containers,
	linux-kernel, xiaolong.ye, linux-security-module, lkp

Quoting Eric W. Biederman (ebiederm@xmission.com):
> Even with one xattr of any type there is something appealing about
> putting the logic that limits that xattr to a namespace in the name.  As

Exactly.  That's the idea - from Stefan - that I thought was a worthwhile
improvement over my own previous patch, which puts the logic in the value.
Most of the complaints raised so far about this patchset are just as valid (or
invalid) against my previous patch, but I was particularly interested in
thoughts on this approach versus mine.

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 18:39           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 18:39 UTC (permalink / raw)
  To: linux-security-module

Quoting Eric W. Biederman (ebiederm at xmission.com):
> Even with one xattr of any type there is something appealing about
> putting the logic that limits that xattr to a namespace in the name.  As

Exactly.  That's the idea - from Stefan - that I thought was a worthwhile
improvement over my own previous patch, which puts the logic in the value.
Most of the complaints raised so far about this patchset are just as valid (or
invalid) against my previous patch, but I was particularly interested in
thoughts on this approach versus mine.

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-22 18:59 ` Stefan Berger
  (?)
  (?)
@ 2017-06-23 20:09     ` Vivek Goyal
  -1 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:09 UTC (permalink / raw)
  To: Stefan Berger
  Cc: lkp-JC7UmRfGjtg,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.

Hi Stefan,

Got a question. If child usernamespace sets a
security.capability@uid=1000, can any of the parent namespace remove it?

IOW, I set capability from usernamespace and tried to remove it from
host and that failed. Is that expected.

# Inside usernamespce
$setcap cat_net_raw+ep foo.txt

# outside user namespace
$listxattr foo.txt
 xattr: security.capability@uid=1000
 xattr: security.selinux

# outside user namespace
setfattr -x security.capability@uid foo.txt
setfattr: foo.txt: Invalid argument

Doing a strace shows removexattr() failed. May this will need fixing?

removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
(Invalid argument)

Vivek

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:09     ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:09 UTC (permalink / raw)
  To: Stefan Berger
  Cc: ebiederm, containers, lkp, xiaolong.ye, linux-kernel, zohar,
	serge, tycho, James.Bottomley, christian.brauner, amir73il,
	linux-security-module

On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability@uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability@uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability@uid=1000, is visible.

Hi Stefan,

Got a question. If child usernamespace sets a
security.capability@uid=1000, can any of the parent namespace remove it?

IOW, I set capability from usernamespace and tried to remove it from
host and that failed. Is that expected.

# Inside usernamespce
$setcap cat_net_raw+ep foo.txt

# outside user namespace
$listxattr foo.txt
 xattr: security.capability@uid=1000
 xattr: security.selinux

# outside user namespace
setfattr -x security.capability@uid foo.txt
setfattr: foo.txt: Invalid argument

Doing a strace shows removexattr() failed. May this will need fixing?

removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
(Invalid argument)

Vivek

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:09     ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:09 UTC (permalink / raw)
  To: linux-security-module

On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability at uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability at uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability at uid=1000, is visible.

Hi Stefan,

Got a question. If child usernamespace sets a
security.capability at uid=1000, can any of the parent namespace remove it?

IOW, I set capability from usernamespace and tried to remove it from
host and that failed. Is that expected.

# Inside usernamespce
$setcap cat_net_raw+ep foo.txt

# outside user namespace
$listxattr foo.txt
 xattr: security.capability at uid=1000
 xattr: security.selinux

# outside user namespace
setfattr -x security.capability at uid foo.txt
setfattr: foo.txt: Invalid argument

Doing a strace shows removexattr() failed. May this will need fixing?

removexattr("testfile.txt", "security.capability at uid") = -1 EINVAL
(Invalid argument)

Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:09     ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:09 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 1719 bytes --]

On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> This series of patches primary goal is to enable file capabilities
> in user namespaces without affecting the file capabilities that are
> effective on the host. This is to prevent that any unprivileged user
> on the host maps his own uid to root in a private namespace, writes
> the xattr, and executes the file with privilege on the host.
> 
> We achieve this goal by writing extended attributes with a different
> name when a user namespace is used. If for example the root user
> in a user namespace writes the security.capability xattr, the name
> of the xattr that is actually written is encoded as
> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
> When listing the xattrs on the host, the existing security.capability
> as well as the security.capability(a)uid=1000 will be shown. Inside the
> namespace only 'security.capability', with the value of
> security.capability(a)uid=1000, is visible.

Hi Stefan,

Got a question. If child usernamespace sets a
security.capability(a)uid=1000, can any of the parent namespace remove it?

IOW, I set capability from usernamespace and tried to remove it from
host and that failed. Is that expected.

# Inside usernamespce
$setcap cat_net_raw+ep foo.txt

# outside user namespace
$listxattr foo.txt
 xattr: security.capability(a)uid=1000
 xattr: security.selinux

# outside user namespace
setfattr -x security.capability(a)uid foo.txt
setfattr: foo.txt: Invalid argument

Doing a strace shows removexattr() failed. May this will need fixing?

removexattr("testfile.txt", "security.capability(a)uid") = -1 EINVAL
(Invalid argument)

Vivek

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]     ` <20170623200956.GB24779-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2017-06-23 20:17       ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting Vivek Goyal (vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> Hi Stefan,
> 
> Got a question. If child usernamespace sets a
> security.capability@uid=1000, can any of the parent namespace remove it?
> 
> IOW, I set capability from usernamespace and tried to remove it from
> host and that failed. Is that expected.
> 
> # Inside usernamespce
> $setcap cat_net_raw+ep foo.txt
> 
> # outside user namespace
> $listxattr foo.txt
>  xattr: security.capability@uid=1000
>  xattr: security.selinux
> 
> # outside user namespace
> setfattr -x security.capability@uid foo.txt
> setfattr: foo.txt: Invalid argument
> 
> Doing a strace shows removexattr() failed. May this will need fixing?
> 
> removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> (Invalid argument)

That's not the right xattr, though, does

	setfattr -x security.capability@uid=1000 foo.txt

work?

If you are in fact uid=1000 then that should work.  If you are uid 1001,
and 1000 was delegated to you, then you'll need to create a transient
userns with uid 1000 mapped into it in order to delete it (so that you
have privilege over the uid).

If that doesn't work, then it's a bug.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 20:09     ` Vivek Goyal
@ 2017-06-23 20:17       ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:17 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, serge, tycho, James.Bottomley,
	christian.brauner, amir73il, linux-security-module

Quoting Vivek Goyal (vgoyal@redhat.com):
> On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> 
> Hi Stefan,
> 
> Got a question. If child usernamespace sets a
> security.capability@uid=1000, can any of the parent namespace remove it?
> 
> IOW, I set capability from usernamespace and tried to remove it from
> host and that failed. Is that expected.
> 
> # Inside usernamespce
> $setcap cat_net_raw+ep foo.txt
> 
> # outside user namespace
> $listxattr foo.txt
>  xattr: security.capability@uid=1000
>  xattr: security.selinux
> 
> # outside user namespace
> setfattr -x security.capability@uid foo.txt
> setfattr: foo.txt: Invalid argument
> 
> Doing a strace shows removexattr() failed. May this will need fixing?
> 
> removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> (Invalid argument)

That's not the right xattr, though, does

	setfattr -x security.capability@uid=1000 foo.txt

work?

If you are in fact uid=1000 then that should work.  If you are uid 1001,
and 1000 was delegated to you, then you'll need to create a transient
userns with uid 1000 mapped into it in order to delete it (so that you
have privilege over the uid).

If that doesn't work, then it's a bug.

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:17       ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:17 UTC (permalink / raw)
  To: linux-security-module

Quoting Vivek Goyal (vgoyal at redhat.com):
> On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> > 
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability at uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability at uid=1000, is visible.
> 
> Hi Stefan,
> 
> Got a question. If child usernamespace sets a
> security.capability at uid=1000, can any of the parent namespace remove it?
> 
> IOW, I set capability from usernamespace and tried to remove it from
> host and that failed. Is that expected.
> 
> # Inside usernamespce
> $setcap cat_net_raw+ep foo.txt
> 
> # outside user namespace
> $listxattr foo.txt
>  xattr: security.capability at uid=1000
>  xattr: security.selinux
> 
> # outside user namespace
> setfattr -x security.capability at uid foo.txt
> setfattr: foo.txt: Invalid argument
> 
> Doing a strace shows removexattr() failed. May this will need fixing?
> 
> removexattr("testfile.txt", "security.capability at uid") = -1 EINVAL
> (Invalid argument)

That's not the right xattr, though, does

	setfattr -x security.capability at uid=1000 foo.txt

work?

If you are in fact uid=1000 then that should work.  If you are uid 1001,
and 1000 was delegated to you, then you'll need to create a transient
userns with uid 1000 mapped into it in order to delete it (so that you
have privilege over the uid).

If that doesn't work, then it's a bug.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
  2017-06-22 18:59   ` Stefan Berger
  (?)
  (?)
@ 2017-06-23 20:30       ` Stephen Smalley
  -1 siblings, 0 replies; 180+ messages in thread
From: Stephen Smalley @ 2017-06-23 20:30 UTC (permalink / raw)
  To: Stefan Berger, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Paul Moore, lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	selinux-+05T5uksL2qpZYMLLGbcSA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> Before the current modifications, SELinux extended attributes were
> visible inside the user namespace but changes in patch 1 hid them.
> This patch enables security.selinux in user namespaces and allows
> them to be written to in the same way as security.capability.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
> ---
>  fs/xattr.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 045be85..37686ee 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
> *name, int mask)
>   */
>  static const char *const userns_xattrs[] = {
>  	XATTR_NAME_CAPS,
> +	XATTR_NAME_SELINUX,
>  	NULL
>  };
>  

(cc SELinux maintainers, curiously omitted from these patches)

I don't think this works for SELinux. You don't deal with actually
supporting multiple security.selinux attributes within SELinux itself
(and I'm not asking you to do so), and without such support, this can't
operate as intended. With these patches applied, IIUC, a setxattr() of
security.selinux within a userns will end up setting only security.seli
nux@uid=1000 on disk, but will then tell SELinux to update its in-core
security label to the new value (via security_inode_post_setxattr).
Meanwhile, on a subsequent getxattr(), you'll call
security_inode_getsecurity() with the security.selinux@uid=1000 name,
which will always fail because SELinux doesn't know anything about your
new scheme, and then you'll call the filesystem handler and returns its
value, which is no longer connected in any way to the actual label
being used by SELinux.  Also, SELinux itself makes calls to
__vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
name remapping is correct in those cases.

You also can't hide security.selinux within user namespaces.  Today
userspace can get and set security.selinux attributes within user
namespaces (if allowed by policy), and further can specify the label to
use for new files via /proc/self/attr/fscreate, which unsurprisingly
isn't addressed by your changes.  Changing that would be a userspace
break.

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 20:30       ` Stephen Smalley
  0 siblings, 0 replies; 180+ messages in thread
From: Stephen Smalley @ 2017-06-23 20:30 UTC (permalink / raw)
  To: Stefan Berger, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module, Paul Moore, selinux

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> Before the current modifications, SELinux extended attributes were
> visible inside the user namespace but changes in patch 1 hid them.
> This patch enables security.selinux in user namespaces and allows
> them to be written to in the same way as security.capability.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
> ---
>  fs/xattr.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 045be85..37686ee 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
> *name, int mask)
>   */
>  static const char *const userns_xattrs[] = {
>  	XATTR_NAME_CAPS,
> +	XATTR_NAME_SELINUX,
>  	NULL
>  };
>  

(cc SELinux maintainers, curiously omitted from these patches)

I don't think this works for SELinux. You don't deal with actually
supporting multiple security.selinux attributes within SELinux itself
(and I'm not asking you to do so), and without such support, this can't
operate as intended. With these patches applied, IIUC, a setxattr() of
security.selinux within a userns will end up setting only security.seli
nux@uid=1000 on disk, but will then tell SELinux to update its in-core
security label to the new value (via security_inode_post_setxattr).
Meanwhile, on a subsequent getxattr(), you'll call
security_inode_getsecurity() with the security.selinux@uid=1000 name,
which will always fail because SELinux doesn't know anything about your
new scheme, and then you'll call the filesystem handler and returns its
value, which is no longer connected in any way to the actual label
being used by SELinux.  Also, SELinux itself makes calls to
__vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
name remapping is correct in those cases.

You also can't hide security.selinux within user namespaces.  Today
userspace can get and set security.selinux attributes within user
namespaces (if allowed by policy), and further can specify the label to
use for new files via /proc/self/attr/fscreate, which unsurprisingly
isn't addressed by your changes.  Changing that would be a userspace
break.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 20:30       ` Stephen Smalley
  0 siblings, 0 replies; 180+ messages in thread
From: Stephen Smalley @ 2017-06-23 20:30 UTC (permalink / raw)
  To: linux-security-module

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> Before the current modifications, SELinux extended attributes were
> visible inside the user namespace but changes in patch 1 hid them.
> This patch enables security.selinux in user namespaces and allows
> them to be written to in the same way as security.capability.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
> ---
> ?fs/xattr.c | 1 +
> ?1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 045be85..37686ee 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
> *name, int mask)
> ? */
> ?static const char *const userns_xattrs[] = {
> ?	XATTR_NAME_CAPS,
> +	XATTR_NAME_SELINUX,
> ?	NULL
> ?};
> ?

(cc SELinux maintainers, curiously omitted from these patches)

I don't think this works for SELinux. You don't deal with actually
supporting multiple security.selinux attributes within SELinux itself
(and I'm not asking you to do so), and without such support, this can't
operate as intended. With these patches applied, IIUC, a setxattr() of
security.selinux within a userns will end up setting only security.seli
nux at uid=1000 on disk, but will then tell SELinux to update its in-core
security label to the new value (via security_inode_post_setxattr).
Meanwhile, on a subsequent getxattr(), you'll call
security_inode_getsecurity() with the security.selinux at uid=1000 name,
which will always fail because SELinux doesn't know anything about your
new scheme, and then you'll call the filesystem handler and returns its
value, which is no longer connected in any way to the actual label
being used by SELinux.  Also, SELinux itself makes calls to
__vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
name remapping is correct in those cases.

You also can't hide security.selinux within user namespaces.  Today
userspace can get and set security.selinux attributes within user
namespaces (if allowed by policy), and further can specify the label to
use for new files via /proc/self/attr/fscreate, which unsurprisingly
isn't addressed by your changes.  Changing that would be a userspace
break.

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 20:30       ` Stephen Smalley
  0 siblings, 0 replies; 180+ messages in thread
From: Stephen Smalley @ 2017-06-23 20:30 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
> Before the current modifications, SELinux extended attributes were
> visible inside the user namespace but changes in patch 1 hid them.
> This patch enables security.selinux in user namespaces and allows
> them to be written to in the same way as security.capability.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
> ---
>  fs/xattr.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 045be85..37686ee 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
> *name, int mask)
>   */
>  static const char *const userns_xattrs[] = {
>  	XATTR_NAME_CAPS,
> +	XATTR_NAME_SELINUX,
>  	NULL
>  };
>  

(cc SELinux maintainers, curiously omitted from these patches)

I don't think this works for SELinux. You don't deal with actually
supporting multiple security.selinux attributes within SELinux itself
(and I'm not asking you to do so), and without such support, this can't
operate as intended. With these patches applied, IIUC, a setxattr() of
security.selinux within a userns will end up setting only security.seli
nux(a)uid=1000 on disk, but will then tell SELinux to update its in-core
security label to the new value (via security_inode_post_setxattr).
Meanwhile, on a subsequent getxattr(), you'll call
security_inode_getsecurity() with the security.selinux(a)uid=1000 name,
which will always fail because SELinux doesn't know anything about your
new scheme, and then you'll call the filesystem handler and returns its
value, which is no longer connected in any way to the actual label
being used by SELinux.  Also, SELinux itself makes calls to
__vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
name remapping is correct in those cases.

You also can't hide security.selinux within user namespaces.  Today
userspace can get and set security.selinux attributes within user
namespaces (if allowed by policy), and further can specify the label to
use for new files via /proc/self/attr/fscreate, which unsurprisingly
isn't addressed by your changes.  Changing that would be a userspace
break.


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 18:35               ` Serge E. Hallyn
  (?)
  (?)
@ 2017-06-23 20:30                   ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 20:30 UTC (permalink / raw)
  To: Serge E. Hallyn, Stefan Berger
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, lkp-JC7UmRfGjtg

On 6/23/2017 11:35 AM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability@uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability@uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.@uid=1000@@capability ?

You're back to messing up the final component of the
attribute name. If you want a namespace component, keep
it separate. I disagree with the ".ns" being redundant.
It's descriptive.

	security.ns@uid=1000@@.capability.

looks right to me.

>
> Maybe it's not worth it.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:30                   ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 20:30 UTC (permalink / raw)
  To: Serge E. Hallyn, Stefan Berger
  Cc: Amir Goldstein, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List

On 6/23/2017 11:35 AM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb@linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il@gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability@uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability@uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.@uid=1000@@capability ?

You're back to messing up the final component of the
attribute name. If you want a namespace component, keep
it separate. I disagree with the ".ns" being redundant.
It's descriptive.

	security.ns@uid=1000@@.capability.

looks right to me.

>
> Maybe it's not worth it.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:30                   ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 20:30 UTC (permalink / raw)
  To: linux-security-module

On 6/23/2017 11:35 AM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb at linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il at gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability at uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability at uid=1000 or
>>>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo at uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability at uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security. at uid=1000@@capability ?

You're back to messing up the final component of the
attribute name. If you want a namespace component, keep
it separate. I disagree with the ".ns" being redundant.
It's descriptive.

	security.ns at uid=1000@@.capability.

looks right to me.

>
> Maybe it's not worth it.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:30                   ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 20:30 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2964 bytes --]

On 6/23/2017 11:35 AM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb(a)linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il(a)gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability(a)uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability(a)uid=1000 or
>>>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo(a)uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability(a)uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.(a)uid=1000@@capability ?

You're back to messing up the final component of the
attribute name. If you want a namespace component, keep
it separate. I disagree with the ".ns" being redundant.
It's descriptive.

	security.ns(a)uid=1000@@.capability.

looks right to me.

>
> Maybe it's not worth it.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]       ` <20170623201723.GA22857-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-23 20:36         ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:36 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> Quoting Vivek Goyal (vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file capabilities
> > > in user namespaces without affecting the file capabilities that are
> > > effective on the host. This is to prevent that any unprivileged user
> > > on the host maps his own uid to root in a private namespace, writes
> > > the xattr, and executes the file with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a different
> > > name when a user namespace is used. If for example the root user
> > > in a user namespace writes the security.capability xattr, the name
> > > of the xattr that is actually written is encoded as
> > > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > > When listing the xattrs on the host, the existing security.capability
> > > as well as the security.capability@uid=1000 will be shown. Inside the
> > > namespace only 'security.capability', with the value of
> > > security.capability@uid=1000, is visible.
> > 
> > Hi Stefan,
> > 
> > Got a question. If child usernamespace sets a
> > security.capability@uid=1000, can any of the parent namespace remove it?
> > 
> > IOW, I set capability from usernamespace and tried to remove it from
> > host and that failed. Is that expected.
> > 
> > # Inside usernamespce
> > $setcap cat_net_raw+ep foo.txt
> > 
> > # outside user namespace
> > $listxattr foo.txt
> >  xattr: security.capability@uid=1000
> >  xattr: security.selinux
> > 
> > # outside user namespace
> > setfattr -x security.capability@uid foo.txt
> > setfattr: foo.txt: Invalid argument
> > 
> > Doing a strace shows removexattr() failed. May this will need fixing?
> > 
> > removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> > (Invalid argument)
> 
> That's not the right xattr, though, does
> 
> 	setfattr -x security.capability@uid=1000 foo.txt
> 
> work?

Yep, that works (as root on host). My bad.

> 
> If you are in fact uid=1000 then that should work.

Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
issues.

$ ll testfile.txt 
-rw-r--r--. 1 vivek vivek 0 Jun 23 15:44 testfile.txt

$listxattr testfile.txt
xattr: security.capability@uid=1000
xattr: security.selinux

$id
uid=1000(vivek) gid=1000(vivek) groups=1000(vivek)
context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

$setfattr -x security.capability@uid=1000 testfile.txt 
setfattr: testfile.txt: Operation not permitted

I had to launch a user namespace with 1000 mapped to 0 inside user
namespace and then "setfattr -x security.capability testfile.txt" worked.

> If you are uid 1001,
> and 1000 was delegated to you, then you'll need to create a transient
> userns with uid 1000 mapped into it in order to delete it (so that you
> have privilege over the uid).

Will give this a try.

Vivek
> 
> If that doesn't work, then it's a bug.
> 
> -serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 20:17       ` Serge E. Hallyn
  (?)
@ 2017-06-23 20:36         ` Vivek Goyal
  -1 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:36 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Stefan Berger, ebiederm, containers, lkp, xiaolong.ye,
	linux-kernel, zohar, tycho, James.Bottomley, christian.brauner,
	amir73il, linux-security-module

On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> Quoting Vivek Goyal (vgoyal@redhat.com):
> > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file capabilities
> > > in user namespaces without affecting the file capabilities that are
> > > effective on the host. This is to prevent that any unprivileged user
> > > on the host maps his own uid to root in a private namespace, writes
> > > the xattr, and executes the file with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a different
> > > name when a user namespace is used. If for example the root user
> > > in a user namespace writes the security.capability xattr, the name
> > > of the xattr that is actually written is encoded as
> > > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > > When listing the xattrs on the host, the existing security.capability
> > > as well as the security.capability@uid=1000 will be shown. Inside the
> > > namespace only 'security.capability', with the value of
> > > security.capability@uid=1000, is visible.
> > 
> > Hi Stefan,
> > 
> > Got a question. If child usernamespace sets a
> > security.capability@uid=1000, can any of the parent namespace remove it?
> > 
> > IOW, I set capability from usernamespace and tried to remove it from
> > host and that failed. Is that expected.
> > 
> > # Inside usernamespce
> > $setcap cat_net_raw+ep foo.txt
> > 
> > # outside user namespace
> > $listxattr foo.txt
> >  xattr: security.capability@uid=1000
> >  xattr: security.selinux
> > 
> > # outside user namespace
> > setfattr -x security.capability@uid foo.txt
> > setfattr: foo.txt: Invalid argument
> > 
> > Doing a strace shows removexattr() failed. May this will need fixing?
> > 
> > removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> > (Invalid argument)
> 
> That's not the right xattr, though, does
> 
> 	setfattr -x security.capability@uid=1000 foo.txt
> 
> work?

Yep, that works (as root on host). My bad.

> 
> If you are in fact uid=1000 then that should work.

Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
issues.

$ ll testfile.txt 
-rw-r--r--. 1 vivek vivek 0 Jun 23 15:44 testfile.txt

$listxattr testfile.txt
xattr: security.capability@uid=1000
xattr: security.selinux

$id
uid=1000(vivek) gid=1000(vivek) groups=1000(vivek)
context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

$setfattr -x security.capability@uid=1000 testfile.txt 
setfattr: testfile.txt: Operation not permitted

I had to launch a user namespace with 1000 mapped to 0 inside user
namespace and then "setfattr -x security.capability testfile.txt" worked.

> If you are uid 1001,
> and 1000 was delegated to you, then you'll need to create a transient
> userns with uid 1000 mapped into it in order to delete it (so that you
> have privilege over the uid).

Will give this a try.

Vivek
> 
> If that doesn't work, then it's a bug.
> 
> -serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:36         ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:36 UTC (permalink / raw)
  To: linux-security-module

On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> Quoting Vivek Goyal (vgoyal at redhat.com):
> > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file capabilities
> > > in user namespaces without affecting the file capabilities that are
> > > effective on the host. This is to prevent that any unprivileged user
> > > on the host maps his own uid to root in a private namespace, writes
> > > the xattr, and executes the file with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a different
> > > name when a user namespace is used. If for example the root user
> > > in a user namespace writes the security.capability xattr, the name
> > > of the xattr that is actually written is encoded as
> > > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > > When listing the xattrs on the host, the existing security.capability
> > > as well as the security.capability at uid=1000 will be shown. Inside the
> > > namespace only 'security.capability', with the value of
> > > security.capability at uid=1000, is visible.
> > 
> > Hi Stefan,
> > 
> > Got a question. If child usernamespace sets a
> > security.capability at uid=1000, can any of the parent namespace remove it?
> > 
> > IOW, I set capability from usernamespace and tried to remove it from
> > host and that failed. Is that expected.
> > 
> > # Inside usernamespce
> > $setcap cat_net_raw+ep foo.txt
> > 
> > # outside user namespace
> > $listxattr foo.txt
> >  xattr: security.capability at uid=1000
> >  xattr: security.selinux
> > 
> > # outside user namespace
> > setfattr -x security.capability at uid foo.txt
> > setfattr: foo.txt: Invalid argument
> > 
> > Doing a strace shows removexattr() failed. May this will need fixing?
> > 
> > removexattr("testfile.txt", "security.capability at uid") = -1 EINVAL
> > (Invalid argument)
> 
> That's not the right xattr, though, does
> 
> 	setfattr -x security.capability at uid=1000 foo.txt
> 
> work?

Yep, that works (as root on host). My bad.

> 
> If you are in fact uid=1000 then that should work.

Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
issues.

$ ll testfile.txt 
-rw-r--r--. 1 vivek vivek 0 Jun 23 15:44 testfile.txt

$listxattr testfile.txt
xattr: security.capability at uid=1000
xattr: security.selinux

$id
uid=1000(vivek) gid=1000(vivek) groups=1000(vivek)
context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

$setfattr -x security.capability at uid=1000 testfile.txt 
setfattr: testfile.txt: Operation not permitted

I had to launch a user namespace with 1000 mapped to 0 inside user
namespace and then "setfattr -x security.capability testfile.txt" worked.

> If you are uid 1001,
> and 1000 was delegated to you, then you'll need to create a transient
> userns with uid 1000 mapped into it in order to delete it (so that you
> have privilege over the uid).

Will give this a try.

Vivek
> 
> If that doesn't work, then it's a bug.
> 
> -serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:36         ` Vivek Goyal
  0 siblings, 0 replies; 180+ messages in thread
From: Vivek Goyal @ 2017-06-23 20:36 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3138 bytes --]

On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> Quoting Vivek Goyal (vgoyal(a)redhat.com):
> > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > This series of patches primary goal is to enable file capabilities
> > > in user namespaces without affecting the file capabilities that are
> > > effective on the host. This is to prevent that any unprivileged user
> > > on the host maps his own uid to root in a private namespace, writes
> > > the xattr, and executes the file with privilege on the host.
> > > 
> > > We achieve this goal by writing extended attributes with a different
> > > name when a user namespace is used. If for example the root user
> > > in a user namespace writes the security.capability xattr, the name
> > > of the xattr that is actually written is encoded as
> > > security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
> > > When listing the xattrs on the host, the existing security.capability
> > > as well as the security.capability(a)uid=1000 will be shown. Inside the
> > > namespace only 'security.capability', with the value of
> > > security.capability(a)uid=1000, is visible.
> > 
> > Hi Stefan,
> > 
> > Got a question. If child usernamespace sets a
> > security.capability(a)uid=1000, can any of the parent namespace remove it?
> > 
> > IOW, I set capability from usernamespace and tried to remove it from
> > host and that failed. Is that expected.
> > 
> > # Inside usernamespce
> > $setcap cat_net_raw+ep foo.txt
> > 
> > # outside user namespace
> > $listxattr foo.txt
> >  xattr: security.capability(a)uid=1000
> >  xattr: security.selinux
> > 
> > # outside user namespace
> > setfattr -x security.capability(a)uid foo.txt
> > setfattr: foo.txt: Invalid argument
> > 
> > Doing a strace shows removexattr() failed. May this will need fixing?
> > 
> > removexattr("testfile.txt", "security.capability(a)uid") = -1 EINVAL
> > (Invalid argument)
> 
> That's not the right xattr, though, does
> 
> 	setfattr -x security.capability(a)uid=1000 foo.txt
> 
> work?

Yep, that works (as root on host). My bad.

> 
> If you are in fact uid=1000 then that should work.

Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
issues.

$ ll testfile.txt 
-rw-r--r--. 1 vivek vivek 0 Jun 23 15:44 testfile.txt

$listxattr testfile.txt
xattr: security.capability(a)uid=1000
xattr: security.selinux

$id
uid=1000(vivek) gid=1000(vivek) groups=1000(vivek)
context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

$setfattr -x security.capability(a)uid=1000 testfile.txt 
setfattr: testfile.txt: Operation not permitted

I had to launch a user namespace with 1000 mapped to 0 inside user
namespace and then "setfattr -x security.capability testfile.txt" worked.

> If you are uid 1001,
> and 1000 was delegated to you, then you'll need to create a transient
> userns with uid 1000 mapped into it in order to delete it (so that you
> have privilege over the uid).

Will give this a try.

Vivek
> 
> If that doesn't work, then it's a bug.
> 
> -serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]         ` <20170623203643.GC24779-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2017-06-23 20:51           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:51 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, lkp-JC7UmRfGjtg

Quoting Vivek Goyal (vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> > Quoting Vivek Goyal (vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > > This series of patches primary goal is to enable file capabilities
> > > > in user namespaces without affecting the file capabilities that are
> > > > effective on the host. This is to prevent that any unprivileged user
> > > > on the host maps his own uid to root in a private namespace, writes
> > > > the xattr, and executes the file with privilege on the host.
> > > > 
> > > > We achieve this goal by writing extended attributes with a different
> > > > name when a user namespace is used. If for example the root user
> > > > in a user namespace writes the security.capability xattr, the name
> > > > of the xattr that is actually written is encoded as
> > > > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > > > When listing the xattrs on the host, the existing security.capability
> > > > as well as the security.capability@uid=1000 will be shown. Inside the
> > > > namespace only 'security.capability', with the value of
> > > > security.capability@uid=1000, is visible.
> > > 
> > > Hi Stefan,
> > > 
> > > Got a question. If child usernamespace sets a
> > > security.capability@uid=1000, can any of the parent namespace remove it?
> > > 
> > > IOW, I set capability from usernamespace and tried to remove it from
> > > host and that failed. Is that expected.
> > > 
> > > # Inside usernamespce
> > > $setcap cat_net_raw+ep foo.txt
> > > 
> > > # outside user namespace
> > > $listxattr foo.txt
> > >  xattr: security.capability@uid=1000
> > >  xattr: security.selinux
> > > 
> > > # outside user namespace
> > > setfattr -x security.capability@uid foo.txt
> > > setfattr: foo.txt: Invalid argument
> > > 
> > > Doing a strace shows removexattr() failed. May this will need fixing?
> > > 
> > > removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> > > (Invalid argument)
> > 
> > That's not the right xattr, though, does
> > 
> > 	setfattr -x security.capability@uid=1000 foo.txt
> > 
> > work?
> 
> Yep, that works (as root on host). My bad.
> 
> > 
> > If you are in fact uid=1000 then that should work.
> 
> Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
> issues.

D'oh, yes, I was thinking wrongly.  You need *privilege* over the uid, meaning
CAP_SETFACL against your user_ns and uid 1000 mapped into the user_ns.  So yeah
just uid 1000 won't suffice.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 20:36         ` Vivek Goyal
@ 2017-06-23 20:51           ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:51 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Serge E. Hallyn, Stefan Berger, ebiederm, containers, lkp,
	xiaolong.ye, linux-kernel, zohar, tycho, James.Bottomley,
	christian.brauner, amir73il, linux-security-module

Quoting Vivek Goyal (vgoyal@redhat.com):
> On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> > Quoting Vivek Goyal (vgoyal@redhat.com):
> > > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > > This series of patches primary goal is to enable file capabilities
> > > > in user namespaces without affecting the file capabilities that are
> > > > effective on the host. This is to prevent that any unprivileged user
> > > > on the host maps his own uid to root in a private namespace, writes
> > > > the xattr, and executes the file with privilege on the host.
> > > > 
> > > > We achieve this goal by writing extended attributes with a different
> > > > name when a user namespace is used. If for example the root user
> > > > in a user namespace writes the security.capability xattr, the name
> > > > of the xattr that is actually written is encoded as
> > > > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > > > When listing the xattrs on the host, the existing security.capability
> > > > as well as the security.capability@uid=1000 will be shown. Inside the
> > > > namespace only 'security.capability', with the value of
> > > > security.capability@uid=1000, is visible.
> > > 
> > > Hi Stefan,
> > > 
> > > Got a question. If child usernamespace sets a
> > > security.capability@uid=1000, can any of the parent namespace remove it?
> > > 
> > > IOW, I set capability from usernamespace and tried to remove it from
> > > host and that failed. Is that expected.
> > > 
> > > # Inside usernamespce
> > > $setcap cat_net_raw+ep foo.txt
> > > 
> > > # outside user namespace
> > > $listxattr foo.txt
> > >  xattr: security.capability@uid=1000
> > >  xattr: security.selinux
> > > 
> > > # outside user namespace
> > > setfattr -x security.capability@uid foo.txt
> > > setfattr: foo.txt: Invalid argument
> > > 
> > > Doing a strace shows removexattr() failed. May this will need fixing?
> > > 
> > > removexattr("testfile.txt", "security.capability@uid") = -1 EINVAL
> > > (Invalid argument)
> > 
> > That's not the right xattr, though, does
> > 
> > 	setfattr -x security.capability@uid=1000 foo.txt
> > 
> > work?
> 
> Yep, that works (as root on host). My bad.
> 
> > 
> > If you are in fact uid=1000 then that should work.
> 
> Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
> issues.

D'oh, yes, I was thinking wrongly.  You need *privilege* over the uid, meaning
CAP_SETFACL against your user_ns and uid 1000 mapped into the user_ns.  So yeah
just uid 1000 won't suffice.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 20:51           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-23 20:51 UTC (permalink / raw)
  To: linux-security-module

Quoting Vivek Goyal (vgoyal at redhat.com):
> On Fri, Jun 23, 2017 at 03:17:23PM -0500, Serge E. Hallyn wrote:
> > Quoting Vivek Goyal (vgoyal at redhat.com):
> > > On Thu, Jun 22, 2017 at 02:59:46PM -0400, Stefan Berger wrote:
> > > > This series of patches primary goal is to enable file capabilities
> > > > in user namespaces without affecting the file capabilities that are
> > > > effective on the host. This is to prevent that any unprivileged user
> > > > on the host maps his own uid to root in a private namespace, writes
> > > > the xattr, and executes the file with privilege on the host.
> > > > 
> > > > We achieve this goal by writing extended attributes with a different
> > > > name when a user namespace is used. If for example the root user
> > > > in a user namespace writes the security.capability xattr, the name
> > > > of the xattr that is actually written is encoded as
> > > > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > > > When listing the xattrs on the host, the existing security.capability
> > > > as well as the security.capability at uid=1000 will be shown. Inside the
> > > > namespace only 'security.capability', with the value of
> > > > security.capability at uid=1000, is visible.
> > > 
> > > Hi Stefan,
> > > 
> > > Got a question. If child usernamespace sets a
> > > security.capability at uid=1000, can any of the parent namespace remove it?
> > > 
> > > IOW, I set capability from usernamespace and tried to remove it from
> > > host and that failed. Is that expected.
> > > 
> > > # Inside usernamespce
> > > $setcap cat_net_raw+ep foo.txt
> > > 
> > > # outside user namespace
> > > $listxattr foo.txt
> > >  xattr: security.capability at uid=1000
> > >  xattr: security.selinux
> > > 
> > > # outside user namespace
> > > setfattr -x security.capability at uid foo.txt
> > > setfattr: foo.txt: Invalid argument
> > > 
> > > Doing a strace shows removexattr() failed. May this will need fixing?
> > > 
> > > removexattr("testfile.txt", "security.capability at uid") = -1 EINVAL
> > > (Invalid argument)
> > 
> > That's not the right xattr, though, does
> > 
> > 	setfattr -x security.capability at uid=1000 foo.txt
> > 
> > work?
> 
> Yep, that works (as root on host). My bad.
> 
> > 
> > If you are in fact uid=1000 then that should work.
> 
> Tried setfattr -x as uid 1000 in init_user_ns and that seems to have
> issues.

D'oh, yes, I was thinking wrongly.  You need *privilege* over the uid, meaning
CAP_SETFACL against your user_ns and uid 1000 mapped into the user_ns.  So yeah
just uid 1000 won't suffice.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 18:35               ` Serge E. Hallyn
  (?)
  (?)
@ 2017-06-23 23:09                   ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:09 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	Casey Schaufler, lkp-JC7UmRfGjtg

On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability@uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability@uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.@uid=1000@@capability ?
Ouch.
> Maybe it's not worth it.

So the .ns is there to be able to possibly extend it in another 
dimension in the future, like have '.foo' there at some point?


>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:09                   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:09 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Casey Schaufler, Amir Goldstein, Eric W. Biederman,
	Linux Containers, lkp, xiaolong.ye, linux-kernel, Mimi Zohar,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List

On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb@linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il@gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability@uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability@uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.@uid=1000@@capability ?
Ouch.
> Maybe it's not worth it.

So the .ns is there to be able to possibly extend it in another 
dimension in the future, like have '.foo' there at some point?


>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:09                   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:09 UTC (permalink / raw)
  To: linux-security-module

On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb at linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il at gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability at uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability at uid=1000 or
>>>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo at uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability at uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security. at uid=1000@@capability ?
Ouch.
> Maybe it's not worth it.

So the .ns is there to be able to possibly extend it in another 
dimension in the future, like have '.foo' there at some point?


>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:09                   ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:09 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2632 bytes --]

On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
> Quoting Stefan Berger (stefanb(a)linux.vnet.ibm.com):
>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>> Quoting Amir Goldstein (amir73il(a)gmail.com):
>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>
>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>> name when a user namespace is used. If for example the root user
>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>> of the xattr that is actually written is encoded as
>>>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>>>> namespace only 'security.capability', with the value of
>>>>>> security.capability(a)uid=1000, is visible.
>>>>>>
>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>> to use for this namespace?
>>>> You're the only one to have mentioned it so far.
>>>>
>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>> prefix exists, the @uid suffix will be parsed.
>>>>> This could be either  ns.security.capability(a)uid=1000 or
>>>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>>>> because then we will be able to namespace any xattr without having to
>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>> setfattr -n "user.whatever.foo(a)uid=0"
>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>> kernel and then exploit it on a newer one.
>>> security.ns.capability(a)uid=1000, then?
>> Imo, '.ns' is redundant and 'encoded' in the '@'.
> So how about
> 	security.(a)uid=1000@@capability ?
Ouch.
> Maybe it's not worth it.

So the .ns is there to be able to possibly extend it in another 
dimension in the future, like have '.foo' there at some point?


>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
  2017-06-23 20:30       ` Stephen Smalley
  (?)
  (?)
@ 2017-06-23 23:41           ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:41 UTC (permalink / raw)
  To: Stephen Smalley, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Paul Moore, lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	selinux-+05T5uksL2qpZYMLLGbcSA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 06/23/2017 04:30 PM, Stephen Smalley wrote:
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> Before the current modifications, SELinux extended attributes were
>> visible inside the user namespace but changes in patch 1 hid them.
>> This patch enables security.selinux in user namespaces and allows
>> them to be written to in the same way as security.capability.
>>
>> Signed-off-by: Stefan Berger <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>> ---
>>   fs/xattr.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/xattr.c b/fs/xattr.c
>> index 045be85..37686ee 100644
>> --- a/fs/xattr.c
>> +++ b/fs/xattr.c
>> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
>> *name, int mask)
>>    */
>>   static const char *const userns_xattrs[] = {
>>   	XATTR_NAME_CAPS,
>> +	XATTR_NAME_SELINUX,
>>   	NULL
>>   };
>>   
> (cc SELinux maintainers, curiously omitted from these patches)
>
> I don't think this works for SELinux. You don't deal with actually
> supporting multiple security.selinux attributes within SELinux itself
> (and I'm not asking you to do so), and without such support, this can't
> operate as intended. With these patches applied, IIUC, a setxattr() of
> security.selinux within a userns will end up setting only security.seli
> nux@uid=1000 on disk, but will then tell SELinux to update its in-core
> security label to the new value (via security_inode_post_setxattr).
> Meanwhile, on a subsequent getxattr(), you'll call
> security_inode_getsecurity() with the security.selinux@uid=1000 name,
> which will always fail because SELinux doesn't know anything about your
> new scheme, and then you'll call the filesystem handler and returns its
> value, which is no longer connected in any way to the actual label
> being used by SELinux.  Also, SELinux itself makes calls to
> __vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
> name remapping is correct in those cases.
>
> You also can't hide security.selinux within user namespaces.  Today
> userspace can get and set security.selinux attributes within user
> namespaces (if allowed by policy), and further can specify the label to
> use for new files via /proc/self/attr/fscreate, which unsurprisingly
> isn't addressed by your changes.  Changing that would be a userspace
> break.

I modified the 1st patch now in such a way that only security.capability 
is rewritten, security.selinux and all other ones remain untouched.

https://github.com/stefanberger/linux/commits/xattr_for_userns.v2

    Stefan

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 23:41           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:41 UTC (permalink / raw)
  To: Stephen Smalley, ebiederm, containers
  Cc: lkp, xiaolong.ye, linux-kernel, zohar, serge, tycho,
	James.Bottomley, christian.brauner, vgoyal, amir73il,
	linux-security-module, Paul Moore, selinux

On 06/23/2017 04:30 PM, Stephen Smalley wrote:
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> Before the current modifications, SELinux extended attributes were
>> visible inside the user namespace but changes in patch 1 hid them.
>> This patch enables security.selinux in user namespaces and allows
>> them to be written to in the same way as security.capability.
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
>> ---
>>   fs/xattr.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/xattr.c b/fs/xattr.c
>> index 045be85..37686ee 100644
>> --- a/fs/xattr.c
>> +++ b/fs/xattr.c
>> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
>> *name, int mask)
>>    */
>>   static const char *const userns_xattrs[] = {
>>   	XATTR_NAME_CAPS,
>> +	XATTR_NAME_SELINUX,
>>   	NULL
>>   };
>>   
> (cc SELinux maintainers, curiously omitted from these patches)
>
> I don't think this works for SELinux. You don't deal with actually
> supporting multiple security.selinux attributes within SELinux itself
> (and I'm not asking you to do so), and without such support, this can't
> operate as intended. With these patches applied, IIUC, a setxattr() of
> security.selinux within a userns will end up setting only security.seli
> nux@uid=1000 on disk, but will then tell SELinux to update its in-core
> security label to the new value (via security_inode_post_setxattr).
> Meanwhile, on a subsequent getxattr(), you'll call
> security_inode_getsecurity() with the security.selinux@uid=1000 name,
> which will always fail because SELinux doesn't know anything about your
> new scheme, and then you'll call the filesystem handler and returns its
> value, which is no longer connected in any way to the actual label
> being used by SELinux.  Also, SELinux itself makes calls to
> __vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
> name remapping is correct in those cases.
>
> You also can't hide security.selinux within user namespaces.  Today
> userspace can get and set security.selinux attributes within user
> namespaces (if allowed by policy), and further can specify the label to
> use for new files via /proc/self/attr/fscreate, which unsurprisingly
> isn't addressed by your changes.  Changing that would be a userspace
> break.

I modified the 1st patch now in such a way that only security.capability 
is rewritten, security.selinux and all other ones remain untouched.

https://github.com/stefanberger/linux/commits/xattr_for_userns.v2

    Stefan

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 23:41           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:41 UTC (permalink / raw)
  To: linux-security-module

On 06/23/2017 04:30 PM, Stephen Smalley wrote:
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> Before the current modifications, SELinux extended attributes were
>> visible inside the user namespace but changes in patch 1 hid them.
>> This patch enables security.selinux in user namespaces and allows
>> them to be written to in the same way as security.capability.
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
>> ---
>>   fs/xattr.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/xattr.c b/fs/xattr.c
>> index 045be85..37686ee 100644
>> --- a/fs/xattr.c
>> +++ b/fs/xattr.c
>> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
>> *name, int mask)
>>    */
>>   static const char *const userns_xattrs[] = {
>>   	XATTR_NAME_CAPS,
>> +	XATTR_NAME_SELINUX,
>>   	NULL
>>   };
>>   
> (cc SELinux maintainers, curiously omitted from these patches)
>
> I don't think this works for SELinux. You don't deal with actually
> supporting multiple security.selinux attributes within SELinux itself
> (and I'm not asking you to do so), and without such support, this can't
> operate as intended. With these patches applied, IIUC, a setxattr() of
> security.selinux within a userns will end up setting only security.seli
> nux at uid=1000 on disk, but will then tell SELinux to update its in-core
> security label to the new value (via security_inode_post_setxattr).
> Meanwhile, on a subsequent getxattr(), you'll call
> security_inode_getsecurity() with the security.selinux at uid=1000 name,
> which will always fail because SELinux doesn't know anything about your
> new scheme, and then you'll call the filesystem handler and returns its
> value, which is no longer connected in any way to the actual label
> being used by SELinux.  Also, SELinux itself makes calls to
> __vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
> name remapping is correct in those cases.
>
> You also can't hide security.selinux within user namespaces.  Today
> userspace can get and set security.selinux attributes within user
> namespaces (if allowed by policy), and further can specify the label to
> use for new files via /proc/self/attr/fscreate, which unsurprisingly
> isn't addressed by your changes.  Changing that would be a userspace
> break.

I modified the 1st patch now in such a way that only security.capability 
is rewritten, security.selinux and all other ones remain untouched.

https://github.com/stefanberger/linux/commits/xattr_for_userns.v2

    Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 3/3] Enable security.selinux in user namespaces
@ 2017-06-23 23:41           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-23 23:41 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 2588 bytes --]

On 06/23/2017 04:30 PM, Stephen Smalley wrote:
> On Thu, 2017-06-22 at 14:59 -0400, Stefan Berger wrote:
>> Before the current modifications, SELinux extended attributes were
>> visible inside the user namespace but changes in patch 1 hid them.
>> This patch enables security.selinux in user namespaces and allows
>> them to be written to in the same way as security.capability.
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
>> ---
>>   fs/xattr.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/xattr.c b/fs/xattr.c
>> index 045be85..37686ee 100644
>> --- a/fs/xattr.c
>> +++ b/fs/xattr.c
>> @@ -138,6 +138,7 @@ xattr_permission(struct inode *inode, const char
>> *name, int mask)
>>    */
>>   static const char *const userns_xattrs[] = {
>>   	XATTR_NAME_CAPS,
>> +	XATTR_NAME_SELINUX,
>>   	NULL
>>   };
>>   
> (cc SELinux maintainers, curiously omitted from these patches)
>
> I don't think this works for SELinux. You don't deal with actually
> supporting multiple security.selinux attributes within SELinux itself
> (and I'm not asking you to do so), and without such support, this can't
> operate as intended. With these patches applied, IIUC, a setxattr() of
> security.selinux within a userns will end up setting only security.seli
> nux(a)uid=1000 on disk, but will then tell SELinux to update its in-core
> security label to the new value (via security_inode_post_setxattr).
> Meanwhile, on a subsequent getxattr(), you'll call
> security_inode_getsecurity() with the security.selinux(a)uid=1000 name,
> which will always fail because SELinux doesn't know anything about your
> new scheme, and then you'll call the filesystem handler and returns its
> value, which is no longer connected in any way to the actual label
> being used by SELinux.  Also, SELinux itself makes calls to
> __vfs_getxattr() and __vfs_setxattr_noperm(), and I don't think your
> name remapping is correct in those cases.
>
> You also can't hide security.selinux within user namespaces.  Today
> userspace can get and set security.selinux attributes within user
> namespaces (if allowed by policy), and further can specify the label to
> use for new files via /proc/self/attr/fscreate, which unsurprisingly
> isn't addressed by your changes.  Changing that would be a userspace
> break.

I modified the 1st patch now in such a way that only security.capability 
is rewritten, security.selinux and all other ones remain untouched.

https://github.com/stefanberger/linux/commits/xattr_for_userns.v2

    Stefan


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23 23:09                   ` Stefan Berger
  (?)
  (?)
@ 2017-06-23 23:51                       ` Casey Schaufler
  -1 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 23:51 UTC (permalink / raw)
  To: Stefan Berger, Serge E. Hallyn
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, lkp-JC7UmRfGjtg

On 6/23/2017 4:09 PM, Stefan Berger wrote:
> On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
>> Quoting Stefan Berger (stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
>>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>>> Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
>>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>>
>>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>>> name when a user namespace is used. If for example the root user
>>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>>> of the xattr that is actually written is encoded as
>>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>>> namespace only 'security.capability', with the value of
>>>>>>> security.capability@uid=1000, is visible.
>>>>>>>
>>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>>> to use for this namespace?
>>>>> You're the only one to have mentioned it so far.
>>>>>
>>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>>> prefix exists, the @uid suffix will be parsed.
>>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>>> because then we will be able to namespace any xattr without having to
>>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>>> kernel and then exploit it on a newer one.
>>>> security.ns.capability@uid=1000, then?
>>> Imo, '.ns' is redundant and 'encoded' in the '@'.
>> So how about
>>     security.@uid=1000@@capability ?
> Ouch.
>> Maybe it's not worth it.
>
> So the .ns is there to be able to possibly extend it in another dimension in the future, like have '.foo' there at some point?

Traditionally we have <kind-of-attribute>.<name-of-attribute>
If you want to preserve the kind and name you have to introduce
a third component if you want to have it treated differently
under curtain (e.g. namespaced) conditions. You want to maintain
the kind, because that's already treated specially. You want to
maintain the name because that's what the feature code keys on.
The kind is expected to be first, and the name last, so your new
data needs to be in the middle. You need to identify what you
expect the new bit to be used for because if you're clever enough
to create a reason to add to the attribute name, someone else is,
too. Thus

	security.ns@uid=1000@@.capability
	security.endian@end=big@@.capability

It's better to add fields than to change how a field is
formatted and interpreted.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:51                       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 23:51 UTC (permalink / raw)
  To: Stefan Berger, Serge E. Hallyn
  Cc: Amir Goldstein, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List

On 6/23/2017 4:09 PM, Stefan Berger wrote:
> On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
>> Quoting Stefan Berger (stefanb@linux.vnet.ibm.com):
>>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>>> Quoting Amir Goldstein (amir73il@gmail.com):
>>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>>
>>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>>> name when a user namespace is used. If for example the root user
>>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>>> of the xattr that is actually written is encoded as
>>>>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>>>>> namespace only 'security.capability', with the value of
>>>>>>> security.capability@uid=1000, is visible.
>>>>>>>
>>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>>> to use for this namespace?
>>>>> You're the only one to have mentioned it so far.
>>>>>
>>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>>> prefix exists, the @uid suffix will be parsed.
>>>>>> This could be either  ns.security.capability@uid=1000 or
>>>>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>>>>> because then we will be able to namespace any xattr without having to
>>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>>> setfattr -n "user.whatever.foo@uid=0"
>>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>>> kernel and then exploit it on a newer one.
>>>> security.ns.capability@uid=1000, then?
>>> Imo, '.ns' is redundant and 'encoded' in the '@'.
>> So how about
>>     security.@uid=1000@@capability ?
> Ouch.
>> Maybe it's not worth it.
>
> So the .ns is there to be able to possibly extend it in another dimension in the future, like have '.foo' there at some point?

Traditionally we have <kind-of-attribute>.<name-of-attribute>
If you want to preserve the kind and name you have to introduce
a third component if you want to have it treated differently
under curtain (e.g. namespaced) conditions. You want to maintain
the kind, because that's already treated specially. You want to
maintain the name because that's what the feature code keys on.
The kind is expected to be first, and the name last, so your new
data needs to be in the middle. You need to identify what you
expect the new bit to be used for because if you're clever enough
to create a reason to add to the attribute name, someone else is,
too. Thus

	security.ns@uid=1000@@.capability
	security.endian@end=big@@.capability

It's better to add fields than to change how a field is
formatted and interpreted.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:51                       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 23:51 UTC (permalink / raw)
  To: linux-security-module

On 6/23/2017 4:09 PM, Stefan Berger wrote:
> On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
>> Quoting Stefan Berger (stefanb at linux.vnet.ibm.com):
>>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>>> Quoting Amir Goldstein (amir73il at gmail.com):
>>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>>
>>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>>> name when a user namespace is used. If for example the root user
>>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>>> of the xattr that is actually written is encoded as
>>>>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>>>>> namespace only 'security.capability', with the value of
>>>>>>> security.capability at uid=1000, is visible.
>>>>>>>
>>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>>> to use for this namespace?
>>>>> You're the only one to have mentioned it so far.
>>>>>
>>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>>> prefix exists, the @uid suffix will be parsed.
>>>>>> This could be either  ns.security.capability at uid=1000 or
>>>>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>>>>> because then we will be able to namespace any xattr without having to
>>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>>> setfattr -n "user.whatever.foo at uid=0"
>>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>>> kernel and then exploit it on a newer one.
>>>> security.ns.capability at uid=1000, then?
>>> Imo, '.ns' is redundant and 'encoded' in the '@'.
>> So how about
>>     security. at uid=1000@@capability ?
> Ouch.
>> Maybe it's not worth it.
>
> So the .ns is there to be able to possibly extend it in another dimension in the future, like have '.foo' there at some point?

Traditionally we have <kind-of-attribute>.<name-of-attribute>
If you want to preserve the kind and name you have to introduce
a third component if you want to have it treated differently
under curtain (e.g. namespaced) conditions. You want to maintain
the kind, because that's already treated specially. You want to
maintain the name because that's what the feature code keys on.
The kind is expected to be first, and the name last, so your new
data needs to be in the middle. You need to identify what you
expect the new bit to be used for because if you're clever enough
to create a reason to add to the attribute name, someone else is,
too. Thus

	security.ns at uid=1000@@.capability
	security.endian at end=big@@.capability

It's better to add fields than to change how a field is
formatted and interpreted.


--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-23 23:51                       ` Casey Schaufler
  0 siblings, 0 replies; 180+ messages in thread
From: Casey Schaufler @ 2017-06-23 23:51 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 3551 bytes --]

On 6/23/2017 4:09 PM, Stefan Berger wrote:
> On 06/23/2017 02:35 PM, Serge E. Hallyn wrote:
>> Quoting Stefan Berger (stefanb(a)linux.vnet.ibm.com):
>>> On 06/23/2017 12:16 PM, Casey Schaufler wrote:
>>>> On 6/23/2017 9:00 AM, Serge E. Hallyn wrote:
>>>>> Quoting Amir Goldstein (amir73il(a)gmail.com):
>>>>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>>>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>>>>> This series of patches primary goal is to enable file capabilities
>>>>>>> in user namespaces without affecting the file capabilities that are
>>>>>>> effective on the host. This is to prevent that any unprivileged user
>>>>>>> on the host maps his own uid to root in a private namespace, writes
>>>>>>> the xattr, and executes the file with privilege on the host.
>>>>>>>
>>>>>>> We achieve this goal by writing extended attributes with a different
>>>>>>> name when a user namespace is used. If for example the root user
>>>>>>> in a user namespace writes the security.capability xattr, the name
>>>>>>> of the xattr that is actually written is encoded as
>>>>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>>>>> When listing the xattrs on the host, the existing security.capability
>>>>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>>>>> namespace only 'security.capability', with the value of
>>>>>>> security.capability(a)uid=1000, is visible.
>>>>>>>
>>>>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>>>>> to use for this namespace?
>>>>> You're the only one to have mentioned it so far.
>>>>>
>>>>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>>>>> it that way - define a new special xattr namespace "ns" and only if that
>>>>>> prefix exists, the @uid suffix will be parsed.
>>>>>> This could be either  ns.security.capability(a)uid=1000 or
>>>>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>>>>> because then we will be able to namespace any xattr without having to
>>>>>> protect from "unprivileged xattr injection", i.e.:
>>>>>> setfattr -n "user.whatever.foo(a)uid=0"
>>>>> I like it for simplifying the parser code.  One concern I have is that,
>>>>> since ns.* is currently not gated, one could write ns.* on an older
>>>>> kernel and then exploit it on a newer one.
>>>> security.ns.capability(a)uid=1000, then?
>>> Imo, '.ns' is redundant and 'encoded' in the '@'.
>> So how about
>>     security.(a)uid=1000@@capability ?
> Ouch.
>> Maybe it's not worth it.
>
> So the .ns is there to be able to possibly extend it in another dimension in the future, like have '.foo' there at some point?

Traditionally we have <kind-of-attribute>.<name-of-attribute>
If you want to preserve the kind and name you have to introduce
a third component if you want to have it treated differently
under curtain (e.g. namespaced) conditions. You want to maintain
the kind, because that's already treated specially. You want to
maintain the name because that's what the feature code keys on.
The kind is expected to be first, and the name last, so your new
data needs to be in the middle. You need to identify what you
expect the new bit to be used for because if you're clever enough
to create a reason to add to the attribute name, someone else is,
too. Thus

	security.ns(a)uid=1000@@.capability
	security.endian(a)end=big@@.capability

It's better to add fields than to change how a field is
formatted and interpreted.



^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 1/3] xattr: Enable security.capability in user namespaces
       [not found]   ` <1498157989-11814-2-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2017-06-24 21:02     ` [PATCH] xattr: fix kstrdup.cocci warnings kbuild test robot
@ 2017-06-24 21:02     ` kbuild test robot
  1 sibling, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: Stefan Berger
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	kbuild-all-JC7UmRfGjtg, lkp-JC7UmRfGjtg,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w

Hi Stefan,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Stefan-Berger/Enable-namespaced-file-capabilities/20170625-001722


coccinelle warnings: (new ones prefixed by >>)

>> fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH] xattr: fix kstrdup.cocci warnings
       [not found]   ` <1498157989-11814-2-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2017-06-24 21:02     ` kbuild test robot
  2017-06-24 21:02     ` [PATCH 1/3] xattr: Enable security.capability in user namespaces kbuild test robot
  1 sibling, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: Stefan Berger
  Cc: zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	kbuild-all-JC7UmRfGjtg, lkp-JC7UmRfGjtg,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w

fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

 Use kstrdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/kstrdup.cocci

CC: Stefan Berger <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Signed-off-by: Fengguang Wu <fengguang.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---

 xattr.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -513,10 +513,9 @@ xattr_userns_name(const char *fullname,
 	return buffer;
 
 out_copy:
-	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	buffer = kstrdup(suffix, GFP_KERNEL);
 	if (!buffer)
 		return ERR_PTR(-ENOMEM);
-	strcpy(buffer, suffix);
 
 	return buffer;

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 1/3] xattr: Enable security.capability in user namespaces
  2017-06-22 18:59   ` Stefan Berger
  (?)
@ 2017-06-24 21:02     ` kbuild test robot
  -1 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: Stefan Berger
  Cc: kbuild-all, ebiederm, containers, lkp, xiaolong.ye, linux-kernel,
	zohar, serge, tycho, James.Bottomley, christian.brauner, stefanb,
	vgoyal, amir73il, linux-security-module

Hi Stefan,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Stefan-Berger/Enable-namespaced-file-capabilities/20170625-001722


coccinelle warnings: (new ones prefixed by >>)

>> fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH] xattr: fix kstrdup.cocci warnings
  2017-06-22 18:59   ` Stefan Berger
  (?)
@ 2017-06-24 21:02     ` kbuild test robot
  -1 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: Stefan Berger
  Cc: kbuild-all, ebiederm, containers, lkp, xiaolong.ye, linux-kernel,
	zohar, serge, tycho, James.Bottomley, christian.brauner, stefanb,
	vgoyal, amir73il, linux-security-module

fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

 Use kstrdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/kstrdup.cocci

CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 xattr.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -513,10 +513,9 @@ xattr_userns_name(const char *fullname,
 	return buffer;
 
 out_copy:
-	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	buffer = kstrdup(suffix, GFP_KERNEL);
 	if (!buffer)
 		return ERR_PTR(-ENOMEM);
-	strcpy(buffer, suffix);
 
 	return buffer;
 

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 1/3] xattr: Enable security.capability in user namespaces
@ 2017-06-24 21:02     ` kbuild test robot
  0 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: linux-security-module

Hi Stefan,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Stefan-Berger/Enable-namespaced-file-capabilities/20170625-001722


coccinelle warnings: (new ones prefixed by >>)

>> fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH] xattr: fix kstrdup.cocci warnings
@ 2017-06-24 21:02     ` kbuild test robot
  0 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: linux-security-module

fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

 Use kstrdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/kstrdup.cocci

CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 xattr.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -513,10 +513,9 @@ xattr_userns_name(const char *fullname,
 	return buffer;
 
 out_copy:
-	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	buffer = kstrdup(suffix, GFP_KERNEL);
 	if (!buffer)
 		return ERR_PTR(-ENOMEM);
-	strcpy(buffer, suffix);
 
 	return buffer;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 1/3] xattr: Enable security.capability in user namespaces
@ 2017-06-24 21:02     ` kbuild test robot
  0 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 682 bytes --]

Hi Stefan,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Stefan-Berger/Enable-namespaced-file-capabilities/20170625-001722


coccinelle warnings: (new ones prefixed by >>)

>> fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH] xattr: fix kstrdup.cocci warnings
@ 2017-06-24 21:02     ` kbuild test robot
  0 siblings, 0 replies; 180+ messages in thread
From: kbuild test robot @ 2017-06-24 21:02 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 693 bytes --]

fs/xattr.c:516:10-17: WARNING opportunity for kstrdep (strcpy on line 519)

 Use kstrdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/kstrdup.cocci

CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 xattr.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -513,10 +513,9 @@ xattr_userns_name(const char *fullname,
 	return buffer;
 
 out_copy:
-	buffer = kmalloc(strlen(suffix) + 1, GFP_KERNEL);
+	buffer = kstrdup(suffix, GFP_KERNEL);
 	if (!buffer)
 		return ERR_PTR(-ENOMEM);
-	strcpy(buffer, suffix);
 
 	return buffer;
 

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                       ` <553a72c4-eda9-52d6-2ae2-f8687c0c7c70-l3A5Bk7waGM@public.gmane.org>
  2017-06-23 18:30                         ` Serge E. Hallyn
@ 2017-06-25 12:35                         ` Eric W. Biederman
       [not found]                           ` <87lgogdz2t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-25 12:35 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Aleksa Sarai <asarai-l3A5Bk7waGM@public.gmane.org> writes:

>>> So my essential point is that building the real kuid into the permanent
>>> record of the xattr damages image portability, which is touted as one
>>> of the real advantages of container images.
>>
>> 'container images' aren't portable in that sense now - for at least
>> many cases - because you have to shift the uid.  However you're doing
>> that, you may be able to shift the xattr the same way.
>
> Piling more things on top of that issue isn't going to make the issue easier to
> solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
> arbitrary xattrs? Plus I would think that handling xattrs would be harder than
> {u,g}ids because the image unpacker now has to be aware of all xattrs that
> require remapping (Which might be an ever-growing list).
>
> The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
> in inodes is an issue that we really shouldn't be encouraging IMO [especially
> given how hard it's been so far to solve that problem.]

There is one very simple solution to the problem.

Perform the unpacking in your user namespace.

The reason Docker doesn't do that is they want to share files and images
between different containers.  That sharing when we are talking about
different privilege domains and persistent storage is a challenge.
Hopefully shiftfs can solve that challenge.

When you stop trying to share between privilege domain this is all quite
trivial.  This is not an incompatibility of the VFS.


Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                           ` <87lgogdz2t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-06-25 13:06                             ` Aleksa Sarai
       [not found]                               ` <f1716e8c-dba8-a051-6bc7-285f13ffcaf0-l3A5Bk7waGM@public.gmane.org>
  0 siblings, 1 reply; 180+ messages in thread
From: Aleksa Sarai @ 2017-06-25 13:06 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

>>>> So my essential point is that building the real kuid into the permanent
>>>> record of the xattr damages image portability, which is touted as one
>>>> of the real advantages of container images.
>>>
>>> 'container images' aren't portable in that sense now - for at least
>>> many cases - because you have to shift the uid.  However you're doing
>>> that, you may be able to shift the xattr the same way.
>>
>> Piling more things on top of that issue isn't going to make the issue easier to
>> solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
>> arbitrary xattrs? Plus I would think that handling xattrs would be harder than
>> {u,g}ids because the image unpacker now has to be aware of all xattrs that
>> require remapping (Which might be an ever-growing list).
>>
>> The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
>> in inodes is an issue that we really shouldn't be encouraging IMO [especially
>> given how hard it's been so far to solve that problem.]
> 
> There is one very simple solution to the problem.
> 
> Perform the unpacking in your user namespace.

I'm not aware of any major container runtime that couples image 
unpacking to the runtime components. Docker hasn't done it for years 
(it's split between runc and Docker/containerd). rkt hasn't ever done it 
(runtime stages are totally separate to image unpacking). cri-o doesn't 
do it either. I believe that only singularity does something like that 
(though singularity is also not actually a "container runtime" in the 
modern meaning of the term).

Not to mention that the OCI standards explicitly separate the two 
concepts, and there exist tools to manipulate images that don't 
explicitly use containers (or namespaces for that matter) either[1].

> The reason Docker doesn't do that is they want to share files and images
> between different containers.  That sharing when we are talking about
> different privilege domains and persistent storage is a challenge.
> Hopefully shiftfs can solve that challenge.

Yes, I'm aware of that -- though claiming it's purely a Docker problem 
isn't really fair (it's a problem of any container runtime that wants to 
effectively use overlay filesystems). If shiftfs is going to solve the 
sharing problem for xattrs as well, then I don't have any complaints 
other than "it sucks that we have to add more magical translation to a 
still-not-merged shiftfs".

But if you say there's not a nicer way to handle this problem, then 
that's good enough for me. :D

[1]: https://github.com/openSUSE/umoci

-- 
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                               ` <f1716e8c-dba8-a051-6bc7-285f13ffcaf0-l3A5Bk7waGM@public.gmane.org>
@ 2017-06-25 13:28                                 ` Eric W. Biederman
       [not found]                                   ` <87zicwb3hu.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 180+ messages in thread
From: Eric W. Biederman @ 2017-06-25 13:28 UTC (permalink / raw)
  To: Aleksa Sarai; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Aleksa Sarai <asarai-l3A5Bk7waGM@public.gmane.org> writes:

>>>>> So my essential point is that building the real kuid into the permanent
>>>>> record of the xattr damages image portability, which is touted as one
>>>>> of the real advantages of container images.
>>>>
>>>> 'container images' aren't portable in that sense now - for at least
>>>> many cases - because you have to shift the uid.  However you're doing
>>>> that, you may be able to shift the xattr the same way.
>>>
>>> Piling more things on top of that issue isn't going to make the issue easier to
>>> solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
>>> arbitrary xattrs? Plus I would think that handling xattrs would be harder than
>>> {u,g}ids because the image unpacker now has to be aware of all xattrs that
>>> require remapping (Which might be an ever-growing list).
>>>
>>> The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
>>> in inodes is an issue that we really shouldn't be encouraging IMO [especially
>>> given how hard it's been so far to solve that problem.]
>>
>> There is one very simple solution to the problem.
>>
>> Perform the unpacking in your user namespace.
>
> I'm not aware of any major container runtime that couples image
> unpacking to the runtime components. Docker hasn't done it for years
> (it's split between runc and Docker/containerd). rkt hasn't ever done
> it (runtime stages are totally separate to image unpacking). cri-o
> doesn't do it either. I believe that only singularity does something
> like that (though singularity is also not actually a "container
> runtime" in the modern meaning of the term).
>
> Not to mention that the OCI standards explicitly separate the two
> concepts, and there exist tools to manipulate images that don't
> explicitly use containers (or namespaces for that matter) either[1].

It doesn't require coupling it just requires knowing which uids and
gids (from the filesystem perspective) your images are going to use
when you unpack them.

Fundamentally these must be persistent assignments as the uids and gids
persist on disk.  Anything else is not safe.

Knowing those persistent assignments you can unpack an image.  And you
can setup an appropriate user namespace for the unpacking.

And all of that can work as an unprivileged user if done carefully.
Which should reduce the possibility of danger.


>> The reason Docker doesn't do that is they want to share files and images
>> between different containers.  That sharing when we are talking about
>> different privilege domains and persistent storage is a challenge.
>> Hopefully shiftfs can solve that challenge.
>
> Yes, I'm aware of that -- though claiming it's purely a Docker problem
> isn't really fair (it's a problem of any container runtime that wants
> to effectively use overlay filesystems). If shiftfs is going to solve
> the sharing problem for xattrs as well, then I don't have any
> complaints other than "it sucks that we have to add more magical
> translation to a still-not-merged shiftfs".

I have only had discussions with the Docker developers on this.

If you choose not to share a filesystem image between containers or
find it ok to have uids and gids that are in a range that they can be
shared between containers (something that makes a container look
different than a bare metal machine) there are solutions without
shiftfs.

That is if you can allow some arbitrary user to own all of the shared
files in an image.  You can mount the image read-only, and share it
with multiple running containers with different uid and gid mappings.

> But if you say there's not a nicer way to handle this problem, then
> that's good enough for me. :D

It really depends on your design assumptions.  My point is if your
assumption is that if the filesystem can't look ``containerized'' in the
ownership of files you run into trouble.

Last time we had the discussion the Docker folks prioritized sharing
files between different containers with different administrative domains
over running containers more in a more separated fashion.

If you choose to use more storage when running multiple containers that
limitation is not an issue.

Eric

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                                   ` <87zicwb3hu.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2017-06-25 13:51                                     ` Aleksa Sarai
       [not found]                                       ` <5bef361a-bc31-f3bc-f513-e728a48f0524-l3A5Bk7waGM@public.gmane.org>
  0 siblings, 1 reply; 180+ messages in thread
From: Aleksa Sarai @ 2017-06-25 13:51 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

>>>>>> So my essential point is that building the real kuid into the permanent
>>>>>> record of the xattr damages image portability, which is touted as one
>>>>>> of the real advantages of container images.
>>>>>
>>>>> 'container images' aren't portable in that sense now - for at least
>>>>> many cases - because you have to shift the uid.  However you're doing
>>>>> that, you may be able to shift the xattr the same way.
>>>>
>>>> Piling more things on top of that issue isn't going to make the issue easier to
>>>> solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
>>>> arbitrary xattrs? Plus I would think that handling xattrs would be harder than
>>>> {u,g}ids because the image unpacker now has to be aware of all xattrs that
>>>> require remapping (Which might be an ever-growing list).
>>>>
>>>> The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
>>>> in inodes is an issue that we really shouldn't be encouraging IMO [especially
>>>> given how hard it's been so far to solve that problem.]
>>>
>>> There is one very simple solution to the problem.
>>>
>>> Perform the unpacking in your user namespace.
>>
>> I'm not aware of any major container runtime that couples image
>> unpacking to the runtime components. Docker hasn't done it for years
>> (it's split between runc and Docker/containerd). rkt hasn't ever done
>> it (runtime stages are totally separate to image unpacking). cri-o
>> doesn't do it either. I believe that only singularity does something
>> like that (though singularity is also not actually a "container
>> runtime" in the modern meaning of the term).
>>
>> Not to mention that the OCI standards explicitly separate the two
>> concepts, and there exist tools to manipulate images that don't
>> explicitly use containers (or namespaces for that matter) either[1].
> 
> It doesn't require coupling it just requires knowing which uids and
> gids (from the filesystem perspective) your images are going to use
> when you unpack them.

Yeah, I assumed that would also work. I was just responding to "perform 
the unpacking in your user namespace" and was just clarifying that 
currently no container runtime would want to do that.

> And all of that can work as an unprivileged user if done carefully.
> Which should reduce the possibility of danger.

:+1: I'm definitely going to be using this within umoci's rootless 
containers support as well as anywhere else it's appropriate.

Cheers.

-- 
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                                       ` <5bef361a-bc31-f3bc-f513-e728a48f0524-l3A5Bk7waGM@public.gmane.org>
@ 2017-06-25 16:45                                         ` Serge E. Hallyn
       [not found]                                           ` <20170625164558.GA24471-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
  0 siblings, 1 reply; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-25 16:45 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eric W. Biederman

Quoting Aleksa Sarai (asarai-l3A5Bk7waGM@public.gmane.org):
> >>>>>>So my essential point is that building the real kuid into the permanent
> >>>>>>record of the xattr damages image portability, which is touted as one
> >>>>>>of the real advantages of container images.
> >>>>>
> >>>>>'container images' aren't portable in that sense now - for at least
> >>>>>many cases - because you have to shift the uid.  However you're doing
> >>>>>that, you may be able to shift the xattr the same way.
> >>>>
> >>>>Piling more things on top of that issue isn't going to make the issue easier to
> >>>>solve IMO. Would shiftfs or shift-bindmounts also have to do translation of
> >>>>arbitrary xattrs? Plus I would think that handling xattrs would be harder than
> >>>>{u,g}ids because the image unpacker now has to be aware of all xattrs that
> >>>>require remapping (Which might be an ever-growing list).
> >>>>
> >>>>The user namespace incompatibility with the VFS's hard-coding of k{u,g}id values
> >>>>in inodes is an issue that we really shouldn't be encouraging IMO [especially
> >>>>given how hard it's been so far to solve that problem.]
> >>>
> >>>There is one very simple solution to the problem.
> >>>
> >>>Perform the unpacking in your user namespace.
> >>
> >>I'm not aware of any major container runtime that couples image
> >>unpacking to the runtime components. Docker hasn't done it for years
> >>(it's split between runc and Docker/containerd). rkt hasn't ever done
> >>it (runtime stages are totally separate to image unpacking). cri-o
> >>doesn't do it either. I believe that only singularity does something
> >>like that (though singularity is also not actually a "container
> >>runtime" in the modern meaning of the term).
> >>
> >>Not to mention that the OCI standards explicitly separate the two
> >>concepts, and there exist tools to manipulate images that don't
> >>explicitly use containers (or namespaces for that matter) either[1].
> >
> >It doesn't require coupling it just requires knowing which uids and
> >gids (from the filesystem perspective) your images are going to use
> >when you unpack them.
> 
> Yeah, I assumed that would also work. I was just responding to
> "perform the unpacking in your user namespace" and was just
> clarifying that currently no container runtime would want to do
> that.

That's exactly what lxc does.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]                                           ` <20170625164558.GA24471-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-26  6:14                                             ` Aleksa Sarai
  0 siblings, 0 replies; 180+ messages in thread
From: Aleksa Sarai @ 2017-06-26  6:14 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eric W. Biederman

>>>>> There is one very simple solution to the problem.
>>>>>
>>>>> Perform the unpacking in your user namespace.
>>>>
>>>> I'm not aware of any major container runtime that couples image
>>>> unpacking to the runtime components >> Yeah, I assumed that would also work. I was just responding to
>> "perform the unpacking in your user namespace" and was just
>> clarifying that currently no container runtime would want to do
>> that.
> 
> That's exactly what lxc does.

Oh really, I wasn't aware of that. Thanks -- I haven't really looked at 
LXC's templates before, I should take a look at that over the weekend.

-- 
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  7:01     ` Amir Goldstein
                       ` (3 preceding siblings ...)
  (?)
@ 2017-06-28  5:41     ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28  5:41 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, lkp-JC7UmRfGjtg

On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?
> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability@uid=1000 or
> ns@uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo@uid=0"
> 
> Amir.

Hi Amir,

I was liking the prefix at first, but I'm actually not sure it's worth
it.  THe main advantage would be so that checking for namespace or other
tags could be done always at the same offset simplifying the parser.
But since we will want to only handle namespacing for some tags, and
potentially differently for each task, it won't actually be simpler, I
don't think.

On the other hand we do want to make sure that the syntax we use is
generally usable, so I think simply specifying that >1 tags can each
be separate by '@' should suffice.  So for now we'd only have

	security.capability@uid=100000

soon we'd hopefully have

	security.ima@uid=100000

and eventually trusted.blarb@foo=bar

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-23  7:01     ` Amir Goldstein
@ 2017-06-28  5:41       ` Serge E. Hallyn
  -1 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28  5:41 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Stefan Berger, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Serge E. Hallyn,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List

On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb@linux.vnet.ibm.com> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability@uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability@uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?
> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability@uid=1000 or
> ns@uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo@uid=0"
> 
> Amir.

Hi Amir,

I was liking the prefix at first, but I'm actually not sure it's worth
it.  THe main advantage would be so that checking for namespace or other
tags could be done always at the same offset simplifying the parser.
But since we will want to only handle namespacing for some tags, and
potentially differently for each task, it won't actually be simpler, I
don't think.

On the other hand we do want to make sure that the syntax we use is
generally usable, so I think simply specifying that >1 tags can each
be separate by '@' should suffice.  So for now we'd only have

	security.capability@uid=100000

soon we'd hopefully have

	security.ima@uid=100000

and eventually trusted.blarb@foo=bar

-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28  5:41       ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28  5:41 UTC (permalink / raw)
  To: linux-security-module

On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
> <stefanb@linux.vnet.ibm.com> wrote:
> > This series of patches primary goal is to enable file capabilities
> > in user namespaces without affecting the file capabilities that are
> > effective on the host. This is to prevent that any unprivileged user
> > on the host maps his own uid to root in a private namespace, writes
> > the xattr, and executes the file with privilege on the host.
> >
> > We achieve this goal by writing extended attributes with a different
> > name when a user namespace is used. If for example the root user
> > in a user namespace writes the security.capability xattr, the name
> > of the xattr that is actually written is encoded as
> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
> > When listing the xattrs on the host, the existing security.capability
> > as well as the security.capability at uid=1000 will be shown. Inside the
> > namespace only 'security.capability', with the value of
> > security.capability at uid=1000, is visible.
> >
> 
> Am I the only one who thinks that suffix is perhaps not the best grammar
> to use for this namespace?
> xattrs are clearly namespaced by prefix, so it seems right to me to keep
> it that way - define a new special xattr namespace "ns" and only if that
> prefix exists, the @uid suffix will be parsed.
> This could be either  ns.security.capability at uid=1000 or
> ns at uid=1000.security.capability. The latter seems more correct to me,
> because then we will be able to namespace any xattr without having to
> protect from "unprivileged xattr injection", i.e.:
> setfattr -n "user.whatever.foo at uid=0"
> 
> Amir.

Hi Amir,

I was liking the prefix at first, but I'm actually not sure it's worth
it.  THe main advantage would be so that checking for namespace or other
tags could be done always at the same offset simplifying the parser.
But since we will want to only handle namespacing for some tags, and
potentially differently for each task, it won't actually be simpler, I
don't think.

On the other hand we do want to make sure that the syntax we use is
generally usable, so I think simply specifying that >1 tags can each
be separate by '@' should suffice.  So for now we'd only have

	security.capability at uid=100000

soon we'd hopefully have

	security.ima at uid=100000

and eventually trusted.blarb at foo=bar

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]       ` <20170628054138.GA15939-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2017-06-28  7:18         ` Amir Goldstein
  0 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-28  7:18 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Mimi Zohar, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, Casey Schaufler, lkp-JC7UmRfGjtg

On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>> > This series of patches primary goal is to enable file capabilities
>> > in user namespaces without affecting the file capabilities that are
>> > effective on the host. This is to prevent that any unprivileged user
>> > on the host maps his own uid to root in a private namespace, writes
>> > the xattr, and executes the file with privilege on the host.
>> >
>> > We achieve this goal by writing extended attributes with a different
>> > name when a user namespace is used. If for example the root user
>> > in a user namespace writes the security.capability xattr, the name
>> > of the xattr that is actually written is encoded as
>> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
>> > When listing the xattrs on the host, the existing security.capability
>> > as well as the security.capability@uid=1000 will be shown. Inside the
>> > namespace only 'security.capability', with the value of
>> > security.capability@uid=1000, is visible.
>> >
>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability@uid=1000 or
>> ns@uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo@uid=0"
>>
>> Amir.
>
> Hi Amir,
>
> I was liking the prefix at first, but I'm actually not sure it's worth
> it.  THe main advantage would be so that checking for namespace or other
> tags could be done always at the same offset simplifying the parser.
> But since we will want to only handle namespacing for some tags, and
> potentially differently for each task, it won't actually be simpler, I
> don't think.
>
> On the other hand we do want to make sure that the syntax we use is
> generally usable, so I think simply specifying that >1 tags can each
> be separate by '@' should suffice.  So for now we'd only have

Serge,

I am not sure I am parsing what you are saying correctly (pun intended).
Can you give some examples of xattr names with several @.

>
>         security.capability@uid=100000
>
> soon we'd hopefully have
>
>         security.ima@uid=100000
>

IIUC, the xattr names above should be parsed as:

        security.(([ima|capability])@(uid=100000)

> and eventually trusted.blarb@foo=bar
>

But the trusted xattr name should be parsed as:

        (trusted.blarb)@(uid=100000)

Otherwise it won't be able to pass the xattr_is_trusted() test
which looks only at the trusted prefix.

So we can write it like this, if it makes sense for the parser:
        trusted@uid=100000.blarb

But I don't think that trusted.foo should have a different
userns behavior than trusted.bar down the road.

Admittedly, I am not so much of a security developer myself,
so I prefer to let Casey be the spokesman for the '.ns' prefix.
Casey's proposal seems right to me:

        security.ns@uid=1000@@.capability

We can also stick to a more conventional syntax of a perfect
new namespace 'security.ns', which encapsulates the unprivileged
xattr name completely. This should suffice perfectly for the current
capability V3 needs and is flexible enough to be extended later:

        security.ns.user.1000.security.capability
OR:
        security.ns@uid=1000@@.security.capability

And going forward, just as easy:

        security.ns.user.1000.[trusted|system|user].foo

Amir.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-28  5:41       ` Serge E. Hallyn
@ 2017-06-28  7:18         ` Amir Goldstein
  -1 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-28  7:18 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Stefan Berger, Eric W. Biederman, Linux Containers, lkp,
	xiaolong.ye, linux-kernel, Mimi Zohar, Tycho Andersen,
	James Bottomley, christian.brauner, Vivek Goyal, LSM List,
	Casey Schaufler

On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb@linux.vnet.ibm.com> wrote:
>> > This series of patches primary goal is to enable file capabilities
>> > in user namespaces without affecting the file capabilities that are
>> > effective on the host. This is to prevent that any unprivileged user
>> > on the host maps his own uid to root in a private namespace, writes
>> > the xattr, and executes the file with privilege on the host.
>> >
>> > We achieve this goal by writing extended attributes with a different
>> > name when a user namespace is used. If for example the root user
>> > in a user namespace writes the security.capability xattr, the name
>> > of the xattr that is actually written is encoded as
>> > security.capability@uid=1000 for root mapped to uid 1000 on the host.
>> > When listing the xattrs on the host, the existing security.capability
>> > as well as the security.capability@uid=1000 will be shown. Inside the
>> > namespace only 'security.capability', with the value of
>> > security.capability@uid=1000, is visible.
>> >
>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability@uid=1000 or
>> ns@uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo@uid=0"
>>
>> Amir.
>
> Hi Amir,
>
> I was liking the prefix at first, but I'm actually not sure it's worth
> it.  THe main advantage would be so that checking for namespace or other
> tags could be done always at the same offset simplifying the parser.
> But since we will want to only handle namespacing for some tags, and
> potentially differently for each task, it won't actually be simpler, I
> don't think.
>
> On the other hand we do want to make sure that the syntax we use is
> generally usable, so I think simply specifying that >1 tags can each
> be separate by '@' should suffice.  So for now we'd only have

Serge,

I am not sure I am parsing what you are saying correctly (pun intended).
Can you give some examples of xattr names with several @.

>
>         security.capability@uid=100000
>
> soon we'd hopefully have
>
>         security.ima@uid=100000
>

IIUC, the xattr names above should be parsed as:

        security.(([ima|capability])@(uid=100000)

> and eventually trusted.blarb@foo=bar
>

But the trusted xattr name should be parsed as:

        (trusted.blarb)@(uid=100000)

Otherwise it won't be able to pass the xattr_is_trusted() test
which looks only at the trusted prefix.

So we can write it like this, if it makes sense for the parser:
        trusted@uid=100000.blarb

But I don't think that trusted.foo should have a different
userns behavior than trusted.bar down the road.

Admittedly, I am not so much of a security developer myself,
so I prefer to let Casey be the spokesman for the '.ns' prefix.
Casey's proposal seems right to me:

        security.ns@uid=1000@@.capability

We can also stick to a more conventional syntax of a perfect
new namespace 'security.ns', which encapsulates the unprivileged
xattr name completely. This should suffice perfectly for the current
capability V3 needs and is flexible enough to be extended later:

        security.ns.user.1000.security.capability
OR:
        security.ns@uid=1000@@.security.capability

And going forward, just as easy:

        security.ns.user.1000.[trusted|system|user].foo

Amir.

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28  7:18         ` Amir Goldstein
  0 siblings, 0 replies; 180+ messages in thread
From: Amir Goldstein @ 2017-06-28  7:18 UTC (permalink / raw)
  To: linux-security-module

On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>> <stefanb@linux.vnet.ibm.com> wrote:
>> > This series of patches primary goal is to enable file capabilities
>> > in user namespaces without affecting the file capabilities that are
>> > effective on the host. This is to prevent that any unprivileged user
>> > on the host maps his own uid to root in a private namespace, writes
>> > the xattr, and executes the file with privilege on the host.
>> >
>> > We achieve this goal by writing extended attributes with a different
>> > name when a user namespace is used. If for example the root user
>> > in a user namespace writes the security.capability xattr, the name
>> > of the xattr that is actually written is encoded as
>> > security.capability at uid=1000 for root mapped to uid 1000 on the host.
>> > When listing the xattrs on the host, the existing security.capability
>> > as well as the security.capability at uid=1000 will be shown. Inside the
>> > namespace only 'security.capability', with the value of
>> > security.capability at uid=1000, is visible.
>> >
>>
>> Am I the only one who thinks that suffix is perhaps not the best grammar
>> to use for this namespace?
>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>> it that way - define a new special xattr namespace "ns" and only if that
>> prefix exists, the @uid suffix will be parsed.
>> This could be either  ns.security.capability at uid=1000 or
>> ns at uid=1000.security.capability. The latter seems more correct to me,
>> because then we will be able to namespace any xattr without having to
>> protect from "unprivileged xattr injection", i.e.:
>> setfattr -n "user.whatever.foo at uid=0"
>>
>> Amir.
>
> Hi Amir,
>
> I was liking the prefix at first, but I'm actually not sure it's worth
> it.  THe main advantage would be so that checking for namespace or other
> tags could be done always at the same offset simplifying the parser.
> But since we will want to only handle namespacing for some tags, and
> potentially differently for each task, it won't actually be simpler, I
> don't think.
>
> On the other hand we do want to make sure that the syntax we use is
> generally usable, so I think simply specifying that >1 tags can each
> be separate by '@' should suffice.  So for now we'd only have

Serge,

I am not sure I am parsing what you are saying correctly (pun intended).
Can you give some examples of xattr names with several @.

>
>         security.capability at uid=100000
>
> soon we'd hopefully have
>
>         security.ima at uid=100000
>

IIUC, the xattr names above should be parsed as:

        security.(([ima|capability])@(uid=100000)

> and eventually trusted.blarb at foo=bar
>

But the trusted xattr name should be parsed as:

        (trusted.blarb)@(uid=100000)

Otherwise it won't be able to pass the xattr_is_trusted() test
which looks only at the trusted prefix.

So we can write it like this, if it makes sense for the parser:
        trusted at uid=100000.blarb

But I don't think that trusted.foo should have a different
userns behavior than trusted.bar down the road.

Admittedly, I am not so much of a security developer myself,
so I prefer to let Casey be the spokesman for the '.ns' prefix.
Casey's proposal seems right to me:

        security.ns at uid=1000@@.capability

We can also stick to a more conventional syntax of a perfect
new namespace 'security.ns', which encapsulates the unprivileged
xattr name completely. This should suffice perfectly for the current
capability V3 needs and is flexible enough to be extended later:

        security.ns.user.1000.security.capability
OR:
        security.ns at uid=1000@@.security.capability

And going forward, just as easy:

        security.ns.user.1000.[trusted|system|user].foo

Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]         ` <CAOQ4uxhiSHEXzWN7=g-nmu=ebpv7hkXszW03JZ4UJkcjTeH+oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-28 14:04           ` Stefan Berger
  2017-06-28 14:28           ` Serge E. Hallyn
  1 sibling, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-28 14:04 UTC (permalink / raw)
  To: Amir Goldstein, Serge E. Hallyn
  Cc: lkp-JC7UmRfGjtg, Linux Containers, linux-kernel,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, James Bottomley, LSM List,
	Eric W. Biederman, Casey Schaufler, Mimi Zohar

On 06/28/2017 03:18 AM, Amir Goldstein wrote:
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
>> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability@uid=1000 or
>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo@uid=0"
>>>
>>> Amir.
>> Hi Amir,
>>
>> I was liking the prefix at first, but I'm actually not sure it's worth
>> it.  THe main advantage would be so that checking for namespace or other
>> tags could be done always at the same offset simplifying the parser.
>> But since we will want to only handle namespacing for some tags, and
>> potentially differently for each task, it won't actually be simpler, I
>> don't think.
>>
>> On the other hand we do want to make sure that the syntax we use is
>> generally usable, so I think simply specifying that >1 tags can each
>> be separate by '@' should suffice.  So for now we'd only have
> Serge,
>
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
>
>>          security.capability@uid=100000
>>
>> soon we'd hopefully have
>>
>>          security.ima@uid=100000
>>
> IIUC, the xattr names above should be parsed as:
>
>          security.(([ima|capability])@(uid=100000)
>
>> and eventually trusted.blarb@foo=bar
>>
> But the trusted xattr name should be parsed as:
>
>          (trusted.blarb)@(uid=100000)
>
> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.

To be precise, it looks at 'trusted.', including the dot.

>
> So we can write it like this, if it makes sense for the parser:
>          trusted@uid=100000.blarb

For the parser I think it would be easier to parse what Serge is 
proposing, and it would pass the existing xattr_is_trusted() call.



>
> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.
>
> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
>
>          security.ns@uid=1000@@.capability
>
> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged
> xattr name completely. This should suffice perfectly for the current
> capability V3 needs and is flexible enough to be extended later:
>
>          security.ns.user.1000.security.capability
> OR:
>          security.ns@uid=1000@@.security.capability
>
> And going forward, just as easy:
>
>          security.ns.user.1000.[trusted|system|user].foo
>
> Amir.
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
  2017-06-28  7:18         ` Amir Goldstein
  (?)
@ 2017-06-28 14:04           ` Stefan Berger
  -1 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-28 14:04 UTC (permalink / raw)
  To: Amir Goldstein, Serge E. Hallyn
  Cc: Eric W. Biederman, Linux Containers, lkp, xiaolong.ye,
	linux-kernel, Mimi Zohar, Tycho Andersen, James Bottomley,
	christian.brauner, Vivek Goyal, LSM List, Casey Schaufler

On 06/28/2017 03:18 AM, Amir Goldstein wrote:
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
>> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability@uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability@uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability@uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability@uid=1000 or
>>> ns@uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo@uid=0"
>>>
>>> Amir.
>> Hi Amir,
>>
>> I was liking the prefix at first, but I'm actually not sure it's worth
>> it.  THe main advantage would be so that checking for namespace or other
>> tags could be done always at the same offset simplifying the parser.
>> But since we will want to only handle namespacing for some tags, and
>> potentially differently for each task, it won't actually be simpler, I
>> don't think.
>>
>> On the other hand we do want to make sure that the syntax we use is
>> generally usable, so I think simply specifying that >1 tags can each
>> be separate by '@' should suffice.  So for now we'd only have
> Serge,
>
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
>
>>          security.capability@uid=100000
>>
>> soon we'd hopefully have
>>
>>          security.ima@uid=100000
>>
> IIUC, the xattr names above should be parsed as:
>
>          security.(([ima|capability])@(uid=100000)
>
>> and eventually trusted.blarb@foo=bar
>>
> But the trusted xattr name should be parsed as:
>
>          (trusted.blarb)@(uid=100000)
>
> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.

To be precise, it looks at 'trusted.', including the dot.

>
> So we can write it like this, if it makes sense for the parser:
>          trusted@uid=100000.blarb

For the parser I think it would be easier to parse what Serge is 
proposing, and it would pass the existing xattr_is_trusted() call.



>
> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.
>
> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
>
>          security.ns@uid=1000@@.capability
>
> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged
> xattr name completely. This should suffice perfectly for the current
> capability V3 needs and is flexible enough to be extended later:
>
>          security.ns.user.1000.security.capability
> OR:
>          security.ns@uid=1000@@.security.capability
>
> And going forward, just as easy:
>
>          security.ns.user.1000.[trusted|system|user].foo
>
> Amir.
>

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28 14:04           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-28 14:04 UTC (permalink / raw)
  To: linux-security-module

On 06/28/2017 03:18 AM, Amir Goldstein wrote:
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
>> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability at uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability at uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability at uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability at uid=1000 or
>>> ns at uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo at uid=0"
>>>
>>> Amir.
>> Hi Amir,
>>
>> I was liking the prefix at first, but I'm actually not sure it's worth
>> it.  THe main advantage would be so that checking for namespace or other
>> tags could be done always at the same offset simplifying the parser.
>> But since we will want to only handle namespacing for some tags, and
>> potentially differently for each task, it won't actually be simpler, I
>> don't think.
>>
>> On the other hand we do want to make sure that the syntax we use is
>> generally usable, so I think simply specifying that >1 tags can each
>> be separate by '@' should suffice.  So for now we'd only have
> Serge,
>
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
>
>>          security.capability at uid=100000
>>
>> soon we'd hopefully have
>>
>>          security.ima at uid=100000
>>
> IIUC, the xattr names above should be parsed as:
>
>          security.(([ima|capability])@(uid=100000)
>
>> and eventually trusted.blarb at foo=bar
>>
> But the trusted xattr name should be parsed as:
>
>          (trusted.blarb)@(uid=100000)
>
> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.

To be precise, it looks at 'trusted.', including the dot.

>
> So we can write it like this, if it makes sense for the parser:
>          trusted at uid=100000.blarb

For the parser I think it would be easier to parse what Serge is 
proposing, and it would pass the existing xattr_is_trusted() call.



>
> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.
>
> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
>
>          security.ns at uid=1000@@.capability
>
> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged
> xattr name completely. This should suffice perfectly for the current
> capability V3 needs and is flexible enough to be extended later:
>
>          security.ns.user.1000.security.capability
> OR:
>          security.ns at uid=1000@@.security.capability
>
> And going forward, just as easy:
>
>          security.ns.user.1000.[trusted|system|user].foo
>
> Amir.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28 14:04           ` Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-28 14:04 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 4370 bytes --]

On 06/28/2017 03:18 AM, Amir Goldstein wrote:
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
>> On Fri, Jun 23, 2017 at 10:01:46AM +0300, Amir Goldstein wrote:
>>> On Thu, Jun 22, 2017 at 9:59 PM, Stefan Berger
>>> <stefanb@linux.vnet.ibm.com> wrote:
>>>> This series of patches primary goal is to enable file capabilities
>>>> in user namespaces without affecting the file capabilities that are
>>>> effective on the host. This is to prevent that any unprivileged user
>>>> on the host maps his own uid to root in a private namespace, writes
>>>> the xattr, and executes the file with privilege on the host.
>>>>
>>>> We achieve this goal by writing extended attributes with a different
>>>> name when a user namespace is used. If for example the root user
>>>> in a user namespace writes the security.capability xattr, the name
>>>> of the xattr that is actually written is encoded as
>>>> security.capability(a)uid=1000 for root mapped to uid 1000 on the host.
>>>> When listing the xattrs on the host, the existing security.capability
>>>> as well as the security.capability(a)uid=1000 will be shown. Inside the
>>>> namespace only 'security.capability', with the value of
>>>> security.capability(a)uid=1000, is visible.
>>>>
>>> Am I the only one who thinks that suffix is perhaps not the best grammar
>>> to use for this namespace?
>>> xattrs are clearly namespaced by prefix, so it seems right to me to keep
>>> it that way - define a new special xattr namespace "ns" and only if that
>>> prefix exists, the @uid suffix will be parsed.
>>> This could be either  ns.security.capability(a)uid=1000 or
>>> ns(a)uid=1000.security.capability. The latter seems more correct to me,
>>> because then we will be able to namespace any xattr without having to
>>> protect from "unprivileged xattr injection", i.e.:
>>> setfattr -n "user.whatever.foo(a)uid=0"
>>>
>>> Amir.
>> Hi Amir,
>>
>> I was liking the prefix at first, but I'm actually not sure it's worth
>> it.  THe main advantage would be so that checking for namespace or other
>> tags could be done always at the same offset simplifying the parser.
>> But since we will want to only handle namespacing for some tags, and
>> potentially differently for each task, it won't actually be simpler, I
>> don't think.
>>
>> On the other hand we do want to make sure that the syntax we use is
>> generally usable, so I think simply specifying that >1 tags can each
>> be separate by '@' should suffice.  So for now we'd only have
> Serge,
>
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
>
>>          security.capability(a)uid=100000
>>
>> soon we'd hopefully have
>>
>>          security.ima(a)uid=100000
>>
> IIUC, the xattr names above should be parsed as:
>
>          security.(([ima|capability])@(uid=100000)
>
>> and eventually trusted.blarb(a)foo=bar
>>
> But the trusted xattr name should be parsed as:
>
>          (trusted.blarb)@(uid=100000)
>
> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.

To be precise, it looks at 'trusted.', including the dot.

>
> So we can write it like this, if it makes sense for the parser:
>          trusted(a)uid=100000.blarb

For the parser I think it would be easier to parse what Serge is 
proposing, and it would pass the existing xattr_is_trusted() call.



>
> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.
>
> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
>
>          security.ns(a)uid=1000@@.capability
>
> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged
> xattr name completely. This should suffice perfectly for the current
> capability V3 needs and is flexible enough to be extended later:
>
>          security.ns.user.1000.security.capability
> OR:
>          security.ns(a)uid=1000@@.security.capability
>
> And going forward, just as easy:
>
>          security.ns.user.1000.[trusted|system|user].foo
>
> Amir.
>


^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]         ` <CAOQ4uxhiSHEXzWN7=g-nmu=ebpv7hkXszW03JZ4UJkcjTeH+oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-06-28 14:04           ` Stefan Berger
@ 2017-06-28 14:28           ` Serge E. Hallyn
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28 14:28 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Mimi Zohar, Linux Containers, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel, James Bottomley, LSM List, Eric W. Biederman,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Casey Schaufler,
	lkp-JC7UmRfGjtg

Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> > Hi Amir,
> >
> > I was liking the prefix at first, but I'm actually not sure it's worth
> > it.  THe main advantage would be so that checking for namespace or other
> > tags could be done always at the same offset simplifying the parser.
> > But since we will want to only handle namespacing for some tags, and
> > potentially differently for each task, it won't actually be simpler, I
> > don't think.
> >
> > On the other hand we do want to make sure that the syntax we use is
> > generally usable, so I think simply specifying that >1 tags can each
> > be separate by '@' should suffice.  So for now we'd only have
> 
> Serge,
> 
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
> 
> >
> >         security.capability@uid=100000
> >
> > soon we'd hopefully have
> >
> >         security.ima@uid=100000
> >
> 
> IIUC, the xattr names above should be parsed as:
> 
>         security.(([ima|capability])@(uid=100000)

not sure what you mean by the parentheses.  Point in these two
examples being that only uid= would be accepted as 'tags', and
only ima and capability would support the tag.  As we'd discussed
we might support uid= (with no number) as indicating any namespace
not mapping kuid 0 would work.

> > and eventually trusted.blarb@foo=bar
> >
> 
> But the trusted xattr name should be parsed as:
> 
>         (trusted.blarb)@(uid=100000)

Sorry, my point there wasn't trusted, I had meant to put in more
tags, which was the point:

	security.capability@uid=100000@smack=container_x

We don't yet know what smack= would mean, but we do know that at
some point there may be > 1 tags.

Importantly, in order to not limit our future behavior, for now
we would refuse writing >1 tags.  (That way we don't risk the problem,
in the future, that someone can boot an older kernel andw rite a xattr
without all the privileges which the new kernel requires.)

> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.
> 
> So we can write it like this, if it makes sense for the parser:
>         trusted@uid=100000.blarb

That's actually specifically what I'm arguing against.  I'm arguing
that the full xattr should come first, as we should not bother
checking for tags until we've verified that the xattr supports them.

> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.

Perhaps not for trusted, but perhaps it will.  And for security.*
it definately does.  Selinux does not support namespace tags.
Smack one day may, but perhaps with different behavior than ima.

> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
> 
>         security.ns@uid=1000@@.capability

Right I was going to reply to his email, but yours seemed to come
earlier so I picked it :)  I wasn't trying to pick on you :)

> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged

If we were going this route I definately would have preferred
security.ns@uid=1000@@.capability.

I've cc:d linux-api at this point as it seems the right thing to
do :)

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
       [not found]         ` <CAOQ4uxhiSHEXzWN7=g-nmu=ebpv7hkXszW03JZ4UJkcjTeH+oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-06-28 14:04           ` Stefan Berger
@ 2017-06-28 14:28           ` Serge E. Hallyn
  1 sibling, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28 14:28 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Serge E. Hallyn, Stefan Berger, Eric W. Biederman,
	Linux Containers, lkp, xiaolong.ye, linux-kernel, Mimi Zohar,
	Tycho Andersen, James Bottomley, christian.brauner, Vivek Goyal,
	LSM List, Casey Schaufler, linux-api

Quoting Amir Goldstein (amir73il@gmail.com):
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> > Hi Amir,
> >
> > I was liking the prefix at first, but I'm actually not sure it's worth
> > it.  THe main advantage would be so that checking for namespace or other
> > tags could be done always at the same offset simplifying the parser.
> > But since we will want to only handle namespacing for some tags, and
> > potentially differently for each task, it won't actually be simpler, I
> > don't think.
> >
> > On the other hand we do want to make sure that the syntax we use is
> > generally usable, so I think simply specifying that >1 tags can each
> > be separate by '@' should suffice.  So for now we'd only have
> 
> Serge,
> 
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
> 
> >
> >         security.capability@uid=100000
> >
> > soon we'd hopefully have
> >
> >         security.ima@uid=100000
> >
> 
> IIUC, the xattr names above should be parsed as:
> 
>         security.(([ima|capability])@(uid=100000)

not sure what you mean by the parentheses.  Point in these two
examples being that only uid= would be accepted as 'tags', and
only ima and capability would support the tag.  As we'd discussed
we might support uid= (with no number) as indicating any namespace
not mapping kuid 0 would work.

> > and eventually trusted.blarb@foo=bar
> >
> 
> But the trusted xattr name should be parsed as:
> 
>         (trusted.blarb)@(uid=100000)

Sorry, my point there wasn't trusted, I had meant to put in more
tags, which was the point:

	security.capability@uid=100000@smack=container_x

We don't yet know what smack= would mean, but we do know that at
some point there may be > 1 tags.

Importantly, in order to not limit our future behavior, for now
we would refuse writing >1 tags.  (That way we don't risk the problem,
in the future, that someone can boot an older kernel andw rite a xattr
without all the privileges which the new kernel requires.)

> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.
> 
> So we can write it like this, if it makes sense for the parser:
>         trusted@uid=100000.blarb

That's actually specifically what I'm arguing against.  I'm arguing
that the full xattr should come first, as we should not bother
checking for tags until we've verified that the xattr supports them.

> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.

Perhaps not for trusted, but perhaps it will.  And for security.*
it definately does.  Selinux does not support namespace tags.
Smack one day may, but perhaps with different behavior than ima.

> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
> 
>         security.ns@uid=1000@@.capability

Right I was going to reply to his email, but yours seemed to come
earlier so I picked it :)  I wasn't trying to pick on you :)

> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged

If we were going this route I definately would have preferred
security.ns@uid=1000@@.capability.

I've cc:d linux-api at this point as it seems the right thing to
do :)

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* Re: [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28 14:28           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28 14:28 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Serge E. Hallyn, Stefan Berger, Eric W. Biederman,
	Linux Containers, lkp-JC7UmRfGjtg,
	xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w, linux-kernel, Mimi Zohar,
	Tycho Andersen, James Bottomley,
	christian.brauner-cl+VPiYnx/1AfugRpC6u6w, Vivek Goyal, LSM List,
	Casey Schaufler, linux-api-u79uwXL29TY76Z2rM5mHXA

Quoting Amir Goldstein (amir73il-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> > Hi Amir,
> >
> > I was liking the prefix at first, but I'm actually not sure it's worth
> > it.  THe main advantage would be so that checking for namespace or other
> > tags could be done always at the same offset simplifying the parser.
> > But since we will want to only handle namespacing for some tags, and
> > potentially differently for each task, it won't actually be simpler, I
> > don't think.
> >
> > On the other hand we do want to make sure that the syntax we use is
> > generally usable, so I think simply specifying that >1 tags can each
> > be separate by '@' should suffice.  So for now we'd only have
> 
> Serge,
> 
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
> 
> >
> >         security.capability@uid=100000
> >
> > soon we'd hopefully have
> >
> >         security.ima@uid=100000
> >
> 
> IIUC, the xattr names above should be parsed as:
> 
>         security.(([ima|capability])@(uid=100000)

not sure what you mean by the parentheses.  Point in these two
examples being that only uid= would be accepted as 'tags', and
only ima and capability would support the tag.  As we'd discussed
we might support uid= (with no number) as indicating any namespace
not mapping kuid 0 would work.

> > and eventually trusted.blarb@foo=bar
> >
> 
> But the trusted xattr name should be parsed as:
> 
>         (trusted.blarb)@(uid=100000)

Sorry, my point there wasn't trusted, I had meant to put in more
tags, which was the point:

	security.capability@uid=100000@smack=container_x

We don't yet know what smack= would mean, but we do know that at
some point there may be > 1 tags.

Importantly, in order to not limit our future behavior, for now
we would refuse writing >1 tags.  (That way we don't risk the problem,
in the future, that someone can boot an older kernel andw rite a xattr
without all the privileges which the new kernel requires.)

> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.
> 
> So we can write it like this, if it makes sense for the parser:
>         trusted@uid=100000.blarb

That's actually specifically what I'm arguing against.  I'm arguing
that the full xattr should come first, as we should not bother
checking for tags until we've verified that the xattr supports them.

> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.

Perhaps not for trusted, but perhaps it will.  And for security.*
it definately does.  Selinux does not support namespace tags.
Smack one day may, but perhaps with different behavior than ima.

> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
> 
>         security.ns@uid=1000@@.capability

Right I was going to reply to his email, but yours seemed to come
earlier so I picked it :)  I wasn't trying to pick on you :)

> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged

If we were going this route I definately would have preferred
security.ns@uid=1000@@.capability.

I've cc:d linux-api at this point as it seems the right thing to
do :)

thanks,
-serge

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-28 14:28           ` Serge E. Hallyn
  0 siblings, 0 replies; 180+ messages in thread
From: Serge E. Hallyn @ 2017-06-28 14:28 UTC (permalink / raw)
  To: linux-security-module

Quoting Amir Goldstein (amir73il at gmail.com):
> On Wed, Jun 28, 2017 at 8:41 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> > Hi Amir,
> >
> > I was liking the prefix at first, but I'm actually not sure it's worth
> > it.  THe main advantage would be so that checking for namespace or other
> > tags could be done always at the same offset simplifying the parser.
> > But since we will want to only handle namespacing for some tags, and
> > potentially differently for each task, it won't actually be simpler, I
> > don't think.
> >
> > On the other hand we do want to make sure that the syntax we use is
> > generally usable, so I think simply specifying that >1 tags can each
> > be separate by '@' should suffice.  So for now we'd only have
> 
> Serge,
> 
> I am not sure I am parsing what you are saying correctly (pun intended).
> Can you give some examples of xattr names with several @.
> 
> >
> >         security.capability at uid=100000
> >
> > soon we'd hopefully have
> >
> >         security.ima at uid=100000
> >
> 
> IIUC, the xattr names above should be parsed as:
> 
>         security.(([ima|capability])@(uid=100000)

not sure what you mean by the parentheses.  Point in these two
examples being that only uid= would be accepted as 'tags', and
only ima and capability would support the tag.  As we'd discussed
we might support uid= (with no number) as indicating any namespace
not mapping kuid 0 would work.

> > and eventually trusted.blarb at foo=bar
> >
> 
> But the trusted xattr name should be parsed as:
> 
>         (trusted.blarb)@(uid=100000)

Sorry, my point there wasn't trusted, I had meant to put in more
tags, which was the point:

	security.capability at uid=100000 at smack=container_x

We don't yet know what smack= would mean, but we do know that at
some point there may be > 1 tags.

Importantly, in order to not limit our future behavior, for now
we would refuse writing >1 tags.  (That way we don't risk the problem,
in the future, that someone can boot an older kernel andw rite a xattr
without all the privileges which the new kernel requires.)

> Otherwise it won't be able to pass the xattr_is_trusted() test
> which looks only at the trusted prefix.
> 
> So we can write it like this, if it makes sense for the parser:
>         trusted at uid=100000.blarb

That's actually specifically what I'm arguing against.  I'm arguing
that the full xattr should come first, as we should not bother
checking for tags until we've verified that the xattr supports them.

> But I don't think that trusted.foo should have a different
> userns behavior than trusted.bar down the road.

Perhaps not for trusted, but perhaps it will.  And for security.*
it definately does.  Selinux does not support namespace tags.
Smack one day may, but perhaps with different behavior than ima.

> Admittedly, I am not so much of a security developer myself,
> so I prefer to let Casey be the spokesman for the '.ns' prefix.
> Casey's proposal seems right to me:
> 
>         security.ns at uid=1000@@.capability

Right I was going to reply to his email, but yours seemed to come
earlier so I picked it :)  I wasn't trying to pick on you :)

> We can also stick to a more conventional syntax of a perfect
> new namespace 'security.ns', which encapsulates the unprivileged

If we were going this route I definately would have preferred
security.ns at uid=1000@@.capability.

I've cc:d linux-api at this point as it seems the right thing to
do :)

thanks,
-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 180+ messages in thread

* [PATCH 0/3] Enable namespaced file capabilities
@ 2017-06-22 18:59 Stefan Berger
  0 siblings, 0 replies; 180+ messages in thread
From: Stefan Berger @ 2017-06-22 18:59 UTC (permalink / raw)
  To: ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: lkp-JC7UmRfGjtg, xiaolong.ye-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	zohar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

This series of patches primary goal is to enable file capabilities
in user namespaces without affecting the file capabilities that are
effective on the host. This is to prevent that any unprivileged user
on the host maps his own uid to root in a private namespace, writes
the xattr, and executes the file with privilege on the host.

We achieve this goal by writing extended attributes with a different
name when a user namespace is used. If for example the root user
in a user namespace writes the security.capability xattr, the name
of the xattr that is actually written is encoded as
security.capability@uid=1000 for root mapped to uid 1000 on the host.
When listing the xattrs on the host, the existing security.capability
as well as the security.capability@uid=1000 will be shown. Inside the
namespace only 'security.capability', with the value of
security.capability@uid=1000, is visible.

To maintain compatibility with existing behavior, the value of
security.capability of the host is shown inside the user namespace
once the security.capability of the user namespace has been removed
(which really removes security.capability@uid=1000). Writing to
an extended attribute inside a user namespace effectively hides the
extended attribute of the host.

The general framework that is established with these patches can
be applied to other extended attributes as well, such as security.ima
or the 'trusted.' prefix . Another extended attribute that needed to
be enabled here is 'security.selinux,' since otherwise this extended
attribute would not be shown anymore inside a user namespace.

Regards,
   Stefan & Serge


Stefan Berger (3):
  xattr: Enable security.capability in user namespaces
  Enable capabilities of files from shared filesystem
  Enable security.selinux in user namespaces

 fs/xattr.c               | 472 ++++++++++++++++++++++++++++++++++++++++++++++-
 security/commoncap.c     |  36 +++-
 security/selinux/hooks.c |   9 +-
 3 files changed, 501 insertions(+), 16 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 180+ messages in thread

end of thread, other threads:[~2017-06-28 14:28 UTC | newest]

Thread overview: 180+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-22 18:59 [PATCH 0/3] Enable namespaced file capabilities Stefan Berger
2017-06-22 18:59 ` Stefan Berger
2017-06-22 18:59 ` Stefan Berger
2017-06-22 18:59 ` [PATCH 1/3] xattr: Enable security.capability in user namespaces Stefan Berger
2017-06-22 18:59   ` Stefan Berger
2017-06-22 18:59   ` Stefan Berger
2017-06-24 21:02   ` kbuild test robot
2017-06-24 21:02     ` kbuild test robot
2017-06-24 21:02     ` kbuild test robot
     [not found]   ` <1498157989-11814-2-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-24 21:02     ` [PATCH] xattr: fix kstrdup.cocci warnings kbuild test robot
2017-06-24 21:02     ` [PATCH 1/3] xattr: Enable security.capability in user namespaces kbuild test robot
2017-06-24 21:02   ` [PATCH] xattr: fix kstrdup.cocci warnings kbuild test robot
2017-06-24 21:02     ` kbuild test robot
2017-06-24 21:02     ` kbuild test robot
     [not found] ` <1498157989-11814-1-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-22 18:59   ` [PATCH 1/3] xattr: Enable security.capability in user namespaces Stefan Berger
2017-06-22 18:59   ` [PATCH 2/3] Enable capabilities of files from shared filesystem Stefan Berger
2017-06-22 18:59     ` Stefan Berger
2017-06-22 18:59     ` Stefan Berger
2017-06-22 18:59     ` Stefan Berger
2017-06-22 18:59   ` [PATCH 3/3] Enable security.selinux in user namespaces Stefan Berger
2017-06-22 19:59   ` [PATCH 0/3] Enable namespaced file capabilities Casey Schaufler
2017-06-22 23:29   ` James Bottomley
2017-06-23  7:01   ` Amir Goldstein
2017-06-23  7:01     ` Amir Goldstein
2017-06-23  7:01     ` Amir Goldstein
2017-06-23 16:00     ` Serge E. Hallyn
2017-06-23 16:00     ` Serge E. Hallyn
2017-06-23 16:00       ` Serge E. Hallyn
     [not found]       ` <20170623160026.GA18257-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 16:16         ` Casey Schaufler
2017-06-23 16:16           ` Casey Schaufler
2017-06-23 16:16           ` Casey Schaufler
2017-06-23 16:16           ` Casey Schaufler
     [not found]           ` <aa62373e-7cd6-39dd-2e38-2b6d6dbe18a8-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
2017-06-23 16:30             ` Serge E. Hallyn
2017-06-23 18:08             ` Stefan Berger
2017-06-23 16:30           ` Serge E. Hallyn
2017-06-23 16:30             ` Serge E. Hallyn
2017-06-23 16:53             ` Casey Schaufler
2017-06-23 16:53               ` Casey Schaufler
2017-06-23 16:53               ` Casey Schaufler
2017-06-23 17:01               ` Serge E. Hallyn
2017-06-23 17:01                 ` Serge E. Hallyn
2017-06-23 17:49                 ` Eric W. Biederman
2017-06-23 17:49                   ` Eric W. Biederman
2017-06-23 17:49                   ` Eric W. Biederman
     [not found]                   ` <8760fmh9vc.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-06-23 18:32                     ` Serge E. Hallyn
2017-06-23 18:32                   ` Serge E. Hallyn
2017-06-23 18:32                     ` Serge E. Hallyn
     [not found]                 ` <20170623170108.GA19354-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 17:49                   ` Eric W. Biederman
     [not found]               ` <ef37880d-6baa-12a6-eab1-bcd0a4e94d53-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
2017-06-23 17:01                 ` Serge E. Hallyn
2017-06-23 17:07             ` James Bottomley
2017-06-23 17:07               ` James Bottomley
2017-06-23 17:07               ` James Bottomley
     [not found]               ` <1498237641.3641.15.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-06-23 17:18                 ` Aleksa Sarai
     [not found]                   ` <b57803da-0e8b-594d-901b-12eb509f04b5-l3A5Bk7waGM@public.gmane.org>
2017-06-23 18:22                     ` Serge E. Hallyn
2017-06-23 17:20                 ` Serge E. Hallyn
2017-06-23 17:20                   ` Serge E. Hallyn
2017-06-23 17:20                   ` Serge E. Hallyn
     [not found]                   ` <20170623172016.GA19551-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 17:28                     ` Aleksa Sarai
     [not found]                       ` <553a72c4-eda9-52d6-2ae2-f8687c0c7c70-l3A5Bk7waGM@public.gmane.org>
2017-06-23 18:30                         ` Serge E. Hallyn
2017-06-25 12:35                         ` Eric W. Biederman
     [not found]                           ` <87lgogdz2t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-06-25 13:06                             ` Aleksa Sarai
     [not found]                               ` <f1716e8c-dba8-a051-6bc7-285f13ffcaf0-l3A5Bk7waGM@public.gmane.org>
2017-06-25 13:28                                 ` Eric W. Biederman
     [not found]                                   ` <87zicwb3hu.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-06-25 13:51                                     ` Aleksa Sarai
     [not found]                                       ` <5bef361a-bc31-f3bc-f513-e728a48f0524-l3A5Bk7waGM@public.gmane.org>
2017-06-25 16:45                                         ` Serge E. Hallyn
     [not found]                                           ` <20170625164558.GA24471-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-26  6:14                                             ` Aleksa Sarai
2017-06-23 17:38                 ` Stefan Berger
2017-06-23 17:38               ` Stefan Berger
2017-06-23 17:38                 ` Stefan Berger
2017-06-23 17:38                 ` Stefan Berger
     [not found]                 ` <d288ea69-adec-f257-30cb-b1d9c57c996b-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-23 18:34                   ` Serge E. Hallyn
2017-06-23 18:34                 ` Serge E. Hallyn
2017-06-23 18:34                   ` Serge E. Hallyn
     [not found]             ` <20170623163030.GA18820-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 16:53               ` Casey Schaufler
2017-06-23 17:07               ` James Bottomley
2017-06-23 18:08           ` Stefan Berger
2017-06-23 18:08             ` Stefan Berger
2017-06-23 18:08             ` Stefan Berger
     [not found]             ` <3404c486-c848-3283-50f7-2283cb631e8e-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-23 18:35               ` Serge E. Hallyn
2017-06-23 18:35             ` Serge E. Hallyn
2017-06-23 18:35               ` Serge E. Hallyn
     [not found]               ` <20170623183520.GC21137-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 20:30                 ` Casey Schaufler
2017-06-23 20:30                   ` Casey Schaufler
2017-06-23 20:30                   ` Casey Schaufler
2017-06-23 20:30                   ` Casey Schaufler
2017-06-23 23:09                 ` Stefan Berger
2017-06-23 23:09                   ` Stefan Berger
2017-06-23 23:09                   ` Stefan Berger
2017-06-23 23:09                   ` Stefan Berger
     [not found]                   ` <da083027-fcd4-bc08-2d88-93034ba1cacc-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-23 23:51                     ` Casey Schaufler
2017-06-23 23:51                       ` Casey Schaufler
2017-06-23 23:51                       ` Casey Schaufler
2017-06-23 23:51                       ` Casey Schaufler
2017-06-28  5:41     ` Serge E. Hallyn
2017-06-28  5:41     ` Serge E. Hallyn
2017-06-28  5:41       ` Serge E. Hallyn
     [not found]       ` <20170628054138.GA15939-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-28  7:18         ` Amir Goldstein
2017-06-28  7:18       ` Amir Goldstein
2017-06-28  7:18         ` Amir Goldstein
2017-06-28 14:04         ` Stefan Berger
2017-06-28 14:04           ` Stefan Berger
2017-06-28 14:04           ` Stefan Berger
     [not found]         ` <CAOQ4uxhiSHEXzWN7=g-nmu=ebpv7hkXszW03JZ4UJkcjTeH+oQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-28 14:04           ` Stefan Berger
2017-06-28 14:28           ` Serge E. Hallyn
2017-06-28 14:28         ` Serge E. Hallyn
2017-06-28 14:28           ` Serge E. Hallyn
2017-06-28 14:28           ` Serge E. Hallyn
2017-06-23 20:09   ` Vivek Goyal
2017-06-23 20:09     ` Vivek Goyal
2017-06-23 20:09     ` Vivek Goyal
2017-06-23 20:09     ` Vivek Goyal
     [not found]     ` <20170623200956.GB24779-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-06-23 20:17       ` Serge E. Hallyn
2017-06-23 20:17     ` Serge E. Hallyn
2017-06-23 20:17       ` Serge E. Hallyn
2017-06-23 20:36       ` Vivek Goyal
2017-06-23 20:36         ` Vivek Goyal
2017-06-23 20:36         ` Vivek Goyal
2017-06-23 20:51         ` Serge E. Hallyn
2017-06-23 20:51           ` Serge E. Hallyn
     [not found]         ` <20170623203643.GC24779-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-06-23 20:51           ` Serge E. Hallyn
     [not found]       ` <20170623201723.GA22857-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23 20:36         ` Vivek Goyal
2017-06-22 18:59 ` [PATCH 3/3] Enable security.selinux in user namespaces Stefan Berger
2017-06-22 18:59   ` Stefan Berger
2017-06-22 18:59   ` Stefan Berger
     [not found]   ` <1498157989-11814-4-git-send-email-stefanb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-23 20:30     ` Stephen Smalley
2017-06-23 20:30       ` Stephen Smalley
2017-06-23 20:30       ` Stephen Smalley
2017-06-23 20:30       ` Stephen Smalley
     [not found]       ` <1498249800.2063.9.camel-+05T5uksL2qpZYMLLGbcSA@public.gmane.org>
2017-06-23 23:41         ` Stefan Berger
2017-06-23 23:41           ` Stefan Berger
2017-06-23 23:41           ` Stefan Berger
2017-06-23 23:41           ` Stefan Berger
2017-06-22 19:59 ` [PATCH 0/3] Enable namespaced file capabilities Casey Schaufler
2017-06-22 19:59   ` Casey Schaufler
2017-06-22 19:59   ` Casey Schaufler
2017-06-22 20:12   ` Stefan Berger
2017-06-22 20:12     ` Stefan Berger
2017-06-22 20:12     ` Stefan Berger
     [not found]     ` <10fb9c1b-e9af-336c-9a1b-cf95259cfaf3-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2017-06-22 20:33       ` Casey Schaufler
2017-06-22 20:33     ` Casey Schaufler
2017-06-22 20:33       ` Casey Schaufler
2017-06-22 20:33       ` Casey Schaufler
     [not found]       ` <2bf08b3e-27f4-3592-d5e2-a823401ac644-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
2017-06-22 21:03         ` Stefan Berger
2017-06-22 21:03           ` Stefan Berger
2017-06-22 21:03           ` Stefan Berger
2017-06-22 21:03           ` Stefan Berger
2017-06-22 21:09         ` Serge E. Hallyn
2017-06-22 21:09           ` Serge E. Hallyn
2017-06-22 21:09           ` Serge E. Hallyn
     [not found]           ` <20170622210925.GA32691-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-22 22:40             ` Casey Schaufler
2017-06-22 22:40           ` Casey Schaufler
2017-06-22 22:40             ` Casey Schaufler
2017-06-22 22:40             ` Casey Schaufler
     [not found]             ` <45e59e2e-0e00-cb9a-2f85-dc4606338a08-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
2017-06-22 23:07               ` Serge E. Hallyn
2017-06-22 23:07             ` Serge E. Hallyn
2017-06-22 23:07               ` Serge E. Hallyn
     [not found]   ` <70a09f1b-e82c-a25c-9325-d5d757b1b695-iSGtlc1asvQWG2LlvL+J4A@public.gmane.org>
2017-06-22 20:12     ` Stefan Berger
2017-06-22 23:29 ` James Bottomley
2017-06-22 23:29   ` James Bottomley
2017-06-22 23:29   ` James Bottomley
2017-06-22 23:32   ` Serge E. Hallyn
2017-06-22 23:32     ` Serge E. Hallyn
     [not found]   ` <1498174161.7636.4.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-06-22 23:32     ` Serge E. Hallyn
2017-06-22 23:36     ` Serge E. Hallyn
2017-06-22 23:36   ` Serge E. Hallyn
2017-06-22 23:36     ` Serge E. Hallyn
2017-06-23  0:13     ` James Bottomley
2017-06-23  0:13       ` James Bottomley
2017-06-23  0:13       ` James Bottomley
     [not found]       ` <1498176787.7636.11.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-06-23  1:19         ` Serge E. Hallyn
2017-06-23 17:37         ` Eric W. Biederman
2017-06-23  1:19       ` Serge E. Hallyn
2017-06-23  1:19         ` Serge E. Hallyn
2017-06-23 17:37       ` Eric W. Biederman
2017-06-23 17:37         ` Eric W. Biederman
2017-06-23 17:37         ` Eric W. Biederman
     [not found]         ` <87efuaip08.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2017-06-23 18:39           ` Serge E. Hallyn
2017-06-23 18:39         ` Serge E. Hallyn
2017-06-23 18:39           ` Serge E. Hallyn
     [not found]     ` <20170622233619.GC2894-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2017-06-23  0:13       ` James Bottomley
  -- strict thread matches above, loose matches on Subject: below --
2017-06-22 18:59 Stefan Berger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.