linux-security-module.vger.kernel.org archive mirror
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Stefan Berger <stefanb@linux.ibm.com>
Cc: linux-integrity@vger.kernel.org, zohar@linux.ibm.com,
	serge@hallyn.com, christian.brauner@ubuntu.com,
	containers@lists.linux.dev, dmitry.kasatkin@gmail.com,
	ebiederm@xmission.com, krzysztof.struczynski@huawei.com,
	roberto.sassu@huawei.com, mpeters@redhat.com, lhinds@redhat.com,
	lsturman@redhat.com, puiterwi@redhat.com, jejb@linux.ibm.com,
	jamjoom@us.ibm.com, linux-kernel@vger.kernel.org,
	paul@paul-moore.com, rgb@redhat.com,
	linux-security-module@vger.kernel.org, jmorris@namei.org,
	jpenumak@redhat.com, Christian Brauner <brauner@kernel.org>,
	James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: [PATCH v12 02/26] securityfs: Extend securityfs with namespacing support
Date: Fri, 20 May 2022 21:23:02 -0500
Message-ID: <20220521022302.GA8575@mail.hallyn.com>
In-Reply-To: <20220420140633.753772-3-stefanb@linux.ibm.com>

On Wed, Apr 20, 2022 at 10:06:09AM -0400, Stefan Berger wrote:
> Enable multiple instances of securityfs by keying each instance with a
> pointer to the user namespace it belongs to.
> 
> Since we do not need the pinning of the filesystem for the virtualization
> case, limit the usage of simple_pin_fs() and simple_release_fs() to the
> case when the init_user_ns is active. This simplifies the cleanup for the
> virtualization case where usage of securityfs_remove() to free dentries
> is therefore not needed anymore.
> 
> For the initial securityfs, i.e. the one mounted in the host userns mount,
> nothing changes. The rules for securityfs_remove() are as before and it is
> still paired with securityfs_create(). Specifically, a file created via
> securityfs_create_dentry() in the initial securityfs mount still needs to
> be removed by a call to securityfs_remove(). Creating a new dentry in the
> initial securityfs mount still pins the filesystem like it always did.
> Consequently, the initial securityfs mount is not destroyed on
> umount/shutdown as long as at least one user of it still has dentries that
> it hasn't removed with a call to securityfs_remove().
> 
> Prevent mounting of an instance of securityfs in another user namespace
> than it belongs to. Also, prevent accesses to files and directories by
> a user namespace that is neither the user namespace it belongs to
> nor an ancestor of the user namespace that the instance of securityfs
> belongs to. Do not prevent access if securityfs was bind-mounted and
> therefore the init_user_ns is the owning user namespace.
> 
> Suggested-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> 
> ---
> v11:
>  - Formatted comment's first line to be '/*'
> ---
>  security/inode.c | 73 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 62 insertions(+), 11 deletions(-)
> 
> diff --git a/security/inode.c b/security/inode.c
> index 13e6780c4444..84c9396792a9 100644
> --- a/security/inode.c
> +++ b/security/inode.c
> @@ -21,9 +21,38 @@
>  #include <linux/security.h>
>  #include <linux/lsm_hooks.h>
>  #include <linux/magic.h>
> +#include <linux/user_namespace.h>
>  
> -static struct vfsmount *mount;
> -static int mount_count;
> +static struct vfsmount *init_securityfs_mount;
> +static int init_securityfs_mount_count;
> +
> +static int securityfs_permission(struct user_namespace *mnt_userns,
> +				 struct inode *inode, int mask)
> +{
> +	int err;
> +
> +	err = generic_permission(&init_user_ns, inode, mask);
> +	if (!err) {
> +		/*
> +		 * Unless bind-mounted, deny access if current_user_ns() is not
> +		 * ancestor.

This comment has confused me the last few times I looked at this.  I see
now you're using "bind-mounted" as a shortcut for saying "bind mounted from
the init_user_ns into a child_user_ns container".  I do think that needs
to be made clearer in this comment.

Should the init_user_ns really be special here?  What if I'm running a
first-level container with up-to-date userspace that mounts its own
securityfs, but inside it I want to run a nested, older userspace that
bind mounts the parent's securityfs?  Is there a good reason to deny that?
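
To make the nested case concrete, here is a minimal user-space sketch of the
extra test in the patch's securityfs_permission(). It is a toy model, not
kernel code: struct toy_userns, toy_in_userns() and toy_securityfs_check()
are hypothetical names, generic_permission() is left out, and
toy_in_userns() is assumed to mirror the ancestor walk of the kernel's
in_userns().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct user_namespace: only parent and level are modeled. */
struct toy_userns {
	struct toy_userns *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
};

/* True if 'child' is 'ancestor' itself or one of its descendants. */
static bool toy_in_userns(const struct toy_userns *ancestor,
			  const struct toy_userns *child)
{
	const struct toy_userns *ns;

	assert(ancestor && child);
	/* Climb from child until we are no deeper than ancestor. */
	for (ns = child; ns->level > ancestor->level; ns = ns->parent)
		;
	return ns == ancestor;
}

/*
 * The namespace test from the patch: the initial namespace's securityfs is
 * always reachable; any other instance only from its owner or an ancestor
 * of its owner.
 */
static int toy_securityfs_check(const struct toy_userns *init_ns,
				const struct toy_userns *current_ns,
				const struct toy_userns *sb_ns)
{
	if (sb_ns != init_ns && !toy_in_userns(current_ns, sb_ns))
		return -EACCES;
	return 0;
}
```

With init -> c1 -> c2, where c1 owns a securityfs instance and c2 is the
nested container, toy_securityfs_check(init, c2, c1) yields -EACCES, while
the same call against the initial namespace's instance yields 0. That
asymmetry is the special case in question.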

It would seem to me the better check would be

	if (!is_original_mounter_of(current_user_ns(), inode->i_sb->s_user_ns) &&
	     !in_userns(current_user_ns(), inode->i_sb->s_user_ns))
		err = -EACCES;

is_original_mounter_of() would require each user_ns to cache, at first,
the userns of its parent's securityfs and then, once a task in that user_ns
mounts securityfs, its own userns (without taking a reference).
If current_user_ns() has mounted a securityfs for a user_ns other than
inode->i_sb->s_user_ns (or init_user_ns), then reject the mount.
Otherwise check current_user_ns()->parent, and so on, up to init_user_ns.
If you reach init_user_ns, or an ns which mounted the securityfs of
inode->i_sb->s_user_ns, then allow; otherwise deny.
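
The walk described above could look roughly like the following user-space
sketch. It is one literal reading of the description, not kernel code and
not a tested proposal: struct toy_mounter_ns, its securityfs_mounted_for
field (standing in for the cached userns) and toy_is_original_mounter_of()
are all hypothetical, and instead of caching the parent's value the sketch
simply climbs the parent chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct user_namespace with the hypothetical cache. */
struct toy_mounter_ns {
	struct toy_mounter_ns *parent;	/* NULL for the initial namespace */
	/*
	 * Hypothetical cache: the namespace whose securityfs instance this
	 * namespace mounted itself, or NULL if it never mounted one.
	 */
	const struct toy_mounter_ns *securityfs_mounted_for;
};

/*
 * Climb from current_ns toward the initial namespace. The first namespace
 * on the path that mounted its own securityfs decides: allow only if that
 * mount is the instance being accessed (sb_ns). Reaching the initial
 * namespace without meeting such a mount allows the access.
 */
static bool toy_is_original_mounter_of(const struct toy_mounter_ns *current_ns,
				       const struct toy_mounter_ns *sb_ns)
{
	const struct toy_mounter_ns *ns;

	assert(current_ns && sb_ns);
	for (ns = current_ns; ns->parent; ns = ns->parent) {
		if (!ns->securityfs_mounted_for)
			continue;	/* no mount of its own: ask the parent */
		return ns->securityfs_mounted_for == sb_ns;
	}
	return true;	/* reached the initial namespace */
}
```

In the nested scenario (init -> c1 -> c2, with c1 having mounted its own
securityfs and c2 only bind mounting it), the helper returns true for c2
against c1's instance, and false once c2 mounts an instance of its own.
Read this literally, any walk that reaches the initial namespace is allowed
regardless of sb_ns; the description does not say whether that case needs a
further restriction.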

Special-casing init_user_ns is the kind of thing we've worked hard to
avoid in other namespaces.

> +		 */
> +		if (inode->i_sb->s_user_ns != &init_user_ns &&
> +		    !in_userns(current_user_ns(), inode->i_sb->s_user_ns))
> +			err = -EACCES;
> +	}
> +
> +	return err;
> +}
> +
> +static const struct inode_operations securityfs_dir_inode_operations = {
> +	.permission	= securityfs_permission,
> +	.lookup		= simple_lookup,
> +};
> +
> +static const struct inode_operations securityfs_file_inode_operations = {
> +	.permission	= securityfs_permission,
> +};
>  
>  static void securityfs_free_inode(struct inode *inode)
>  {
> @@ -40,20 +69,25 @@ static const struct super_operations securityfs_super_operations = {
>  static int securityfs_fill_super(struct super_block *sb, struct fs_context *fc)
>  {
>  	static const struct tree_descr files[] = {{""}};
> +	struct user_namespace *ns = fc->user_ns;
>  	int error;
>  
> +	if (WARN_ON(ns != current_user_ns()))
> +		return -EINVAL;
> +
>  	error = simple_fill_super(sb, SECURITYFS_MAGIC, files);
>  	if (error)
>  		return error;
>  
>  	sb->s_op = &securityfs_super_operations;
> +	sb->s_root->d_inode->i_op = &securityfs_dir_inode_operations;
>  
>  	return 0;
>  }
>  
>  static int securityfs_get_tree(struct fs_context *fc)
>  {
> -	return get_tree_single(fc, securityfs_fill_super);
> +	return get_tree_keyed(fc, securityfs_fill_super, fc->user_ns);
>  }
>  
>  static const struct fs_context_operations securityfs_context_ops = {
> @@ -71,6 +105,7 @@ static struct file_system_type fs_type = {
>  	.name =		"securityfs",
>  	.init_fs_context = securityfs_init_fs_context,
>  	.kill_sb =	kill_litter_super,
> +	.fs_flags =	FS_USERNS_MOUNT,
>  };
>  
>  /**
> @@ -109,6 +144,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  					const struct file_operations *fops,
>  					const struct inode_operations *iops)
>  {
> +	struct user_namespace *ns = current_user_ns();
>  	struct dentry *dentry;
>  	struct inode *dir, *inode;
>  	int error;
> @@ -118,12 +154,19 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  
>  	pr_debug("securityfs: creating file '%s'\n",name);
>  
> -	error = simple_pin_fs(&fs_type, &mount, &mount_count);
> -	if (error)
> -		return ERR_PTR(error);
> +	if (ns == &init_user_ns) {
> +		error = simple_pin_fs(&fs_type, &init_securityfs_mount,
> +				      &init_securityfs_mount_count);

So ...  it's less work for the kernel to skip the simple_pin_fs()
here, but it's more code, and more confusing code, to skip it.

So I just want to ask, to make sure:  is it worth it?  Or should
it just be done for all namespaces here (and below and for release),
for shorter, simpler, easier to read and grok code?

> +		if (error)
> +			return ERR_PTR(error);
> +	}
>  
> -	if (!parent)
> -		parent = mount->mnt_root;
> +	if (!parent) {
> +		if (ns == &init_user_ns)
> +			parent = init_securityfs_mount->mnt_root;
> +		else
> +			return ERR_PTR(-EINVAL);
> +	}
>  
>  	dir = d_inode(parent);
>  
> @@ -148,7 +191,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  	inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
>  	inode->i_private = data;
>  	if (S_ISDIR(mode)) {
> -		inode->i_op = &simple_dir_inode_operations;
> +		inode->i_op = &securityfs_dir_inode_operations;
>  		inode->i_fop = &simple_dir_operations;
>  		inc_nlink(inode);
>  		inc_nlink(dir);
> @@ -156,6 +199,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  		inode->i_op = iops ? iops : &simple_symlink_inode_operations;
>  		inode->i_link = data;
>  	} else {
> +		inode->i_op = &securityfs_file_inode_operations;
>  		inode->i_fop = fops;
>  	}
>  	d_instantiate(dentry, inode);
> @@ -167,7 +211,9 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
>  	dentry = ERR_PTR(error);
>  out:
>  	inode_unlock(dir);
> -	simple_release_fs(&mount, &mount_count);
> +	if (ns == &init_user_ns)
> +		simple_release_fs(&init_securityfs_mount,
> +				  &init_securityfs_mount_count);
>  	return dentry;
>  }
>  
> @@ -293,11 +339,14 @@ EXPORT_SYMBOL_GPL(securityfs_create_symlink);
>   */
>  void securityfs_remove(struct dentry *dentry)
>  {
> +	struct user_namespace *ns;
>  	struct inode *dir;
>  
>  	if (!dentry || IS_ERR(dentry))
>  		return;
>  
> +	ns = dentry->d_sb->s_user_ns;
> +
>  	dir = d_inode(dentry->d_parent);
>  	inode_lock(dir);
>  	if (simple_positive(dentry)) {
> @@ -310,7 +359,9 @@ void securityfs_remove(struct dentry *dentry)
>  		dput(dentry);
>  	}
>  	inode_unlock(dir);
> -	simple_release_fs(&mount, &mount_count);
> +	if (ns == &init_user_ns)
> +		simple_release_fs(&init_securityfs_mount,
> +				  &init_securityfs_mount_count);
>  }
>  EXPORT_SYMBOL_GPL(securityfs_remove);
>  
> -- 
> 2.34.1

