linux-kernel.vger.kernel.org archive mirror
* [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
@ 2018-06-22 17:16 Mark Salyzyn
  2018-06-23  6:46 ` Amir Goldstein
  2018-06-25 12:38 ` Vivek Goyal
  0 siblings, 2 replies; 7+ messages in thread
From: Mark Salyzyn @ 2018-06-22 17:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Salyzyn, Miklos Szeredi, Jonathan Corbet, Vivek Goyal,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap, linux-unionfs,
	linux-doc

By default, all access to the upper, lower and work directories is
performed with the recorded mounter's MAC and DAC credentials.  Incoming
accesses are checked against the caller's credentials.

If the principle of least privilege is applied, the mounter's
credentials might not overlap with the caller's credentials when
accessing the overlayfs filesystem.  For example, a file that a caller
with lower DAC privileges can execute may be MAC-denied to the mounter,
which generally has higher DAC privileges, to close off an attack vector.

This patch adds an option to turn off override_creds in the mount
options; all subsequent operations on the filesystem after mount then
use only the caller's credentials.  The default for this option is set
by CONFIG_OVERLAY_FS_OVERRIDE_CREDS or by the module parameter
override_creds.
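
As a rough illustration of how user space might request this behavior,
the sketch below builds an overlayfs option string that includes
override_creds=off and hands it to mount(2).  The helper names
ovl_build_opts() and ovl_mount() and any directory paths passed to them
are hypothetical, chosen for illustration only; only the option name
itself comes from this patch.

```c
#include <stdio.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: format the overlayfs option string.  The
 * directories are caller-supplied; "override_creds=on|off" is the mount
 * option proposed by this patch.  Returns 0 on success, -1 if the
 * buffer is too small.
 */
int ovl_build_opts(char *buf, size_t len, const char *lower,
		   const char *upper, const char *work, int override_creds)
{
	int n = snprintf(buf, len,
			 "lowerdir=%s,upperdir=%s,workdir=%s,override_creds=%s",
			 lower, upper, work, override_creds ? "on" : "off");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: mount an overlay at target with the given options. */
int ovl_mount(const char *target, const char *opts)
{
	return mount("overlay", target, "overlay", 0, opts);
}
```

A caller would format the options first and then pass them to
ovl_mount().  On a kernel without this patch the mount is expected to
fail with EINVAL, since ovl_parse_opt() rejects unrecognized options.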

The boolean module parameter override_creds also serves as a presence
check for this "feature": user space can test for the existence of
/sys/module/overlay/parameters/override_creds to determine whether the
option can be supplied successfully to the mount(2) operation.
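
The presence check described above could look roughly like the sketch
below.  It assumes the sysfs file is named after the
module_param_named(override_creds, ...) call in this patch; the macro
and function names are hypothetical.

```c
#include <unistd.h>

/* Assumed path, derived from the module parameter name in this patch. */
#define OVL_OVERRIDE_CREDS_PARAM \
	"/sys/module/overlay/parameters/override_creds"

/* Return 1 if the given sysfs parameter file exists, 0 otherwise. */
int ovl_param_present(const char *path)
{
	return access(path, F_OK) == 0;
}
```

A caller would pass OVL_OVERRIDE_CREDS_PARAM before adding
override_creds=off to the mount options.  Note that a missing file can
also mean that the overlay module is simply not loaded yet, so a
negative result is not conclusive on its own.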

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
v2:
- Forward port changed attr to stat, resulting in a build error.
- altered commit message.

v3:
- Change name from caller_credentials / creator_credentials to the
  boolean override_creds.
- Changed from creator to mounter credentials.
- Updated and fortified the documentation.
- Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS

v4:
- spelling and grammar errors in text

 Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
 fs/overlayfs/Kconfig                    | 20 ++++++++++++++++++++
 fs/overlayfs/copy_up.c                  |  2 +-
 fs/overlayfs/dir.c                      |  9 +++++----
 fs/overlayfs/inode.c                    | 16 ++++++++--------
 fs/overlayfs/namei.c                    |  6 +++---
 fs/overlayfs/overlayfs.h                |  1 +
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/readdir.c                  |  4 ++--
 fs/overlayfs/super.c                    | 21 +++++++++++++++++++++
 fs/overlayfs/util.c                     | 12 ++++++++++--
 11 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 72615a2c0752..18e6d70ea4c9 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -106,6 +106,23 @@ Only the lists of names from directories are merged.  Other content
 such as metadata and extended attributes are reported for the upper
 directory only.  These attributes of the lower directory are hidden.
 
+credentials
+-----------
+
+By default, all access to the upper, lower and work directories is
+performed with the recorded mounter's MAC and DAC credentials.  Incoming
+accesses are checked against the caller's credentials.
+
+If the principle of least privilege is applied, the mounter's
+credentials might not overlap with the caller's credentials when
+accessing the overlayfs filesystem.  For example, a file that a caller
+with lower DAC privileges can execute may be MAC-denied to the mounter,
+which generally has higher DAC privileges, to close off an attack vector.
+One option is to turn off override_creds in the mount options; all
+subsequent operations on the filesystem after mount then use only the
+caller's credentials.  The default for this option is set by
+CONFIG_OVERLAY_FS_OVERRIDE_CREDS or the module option override_creds.
+
 whiteouts and opaque directories
 --------------------------------
 
diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 9384164253ac..d21dde046b8d 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -103,3 +103,23 @@ config OVERLAY_FS_XINO_AUTO
 	  For more information, see Documentation/filesystems/overlayfs.txt
 
 	  If unsure, say N.
+
+config OVERLAY_FS_OVERRIDE_CREDS
+	bool "Overlay filesystem override credentials"
+	depends on OVERLAY_FS
+	default y
+	help
+	  If set, all access to the upper, lower and work directories is
+	  performed with the recorded mounter's MAC and DAC credentials.
+	  Incoming accesses are checked against the caller's credentials.
+
+	  If the principle of least privilege is applied, the mounter's
+	  credentials might not overlap with the caller's credentials when
+	  accessing the overlayfs filesystem.  The mount option
+	  "override_creds=off" drops the mounter's credential check, so that
+	  all subsequent operations on the filesystem after mount use only
+	  the caller's credentials.  This option sets the default for the
+	  module option override_creds, and thus the default for all mounts
+	  that do not specify this option.
+
+	  For more information see Documentation/filesystems/overlayfs.txt
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index ddaddb4ce4c3..7a841718ff2e 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -790,7 +790,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		dput(parent);
 		dput(next);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index f480b1a2cd2e..a9f10cd38e32 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -561,7 +561,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		override_cred->fsgid = inode->i_gid;
 		if (!attr->hardlink) {
 			err = security_dentry_create_files_as(dentry,
-					attr->mode, &dentry->d_name, old_cred,
+					attr->mode, &dentry->d_name,
+					old_cred ? old_cred : current_cred(),
 					override_cred);
 			if (err) {
 				put_cred(override_cred);
@@ -577,7 +578,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 			err = ovl_create_over_whiteout(dentry, inode, attr);
 	}
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return err;
 }
 
@@ -824,7 +825,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 		err = ovl_remove_upper(dentry, is_dir, &list);
 	else
 		err = ovl_remove_and_whiteout(dentry, &list);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (!err) {
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
@@ -1150,7 +1151,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 out_unlock:
 	unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	ovl_nlink_end(new, locked);
 out_drop_write:
 	ovl_drop_write(old);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ed16a898caeb..afb0af1a24e9 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -49,7 +49,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		inode_lock(upperdentry->d_inode);
 		old_cred = ovl_override_creds(dentry->d_sb);
 		err = notify_change(upperdentry, attr, NULL);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 		if (!err)
 			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
 		inode_unlock(upperdentry->d_inode);
@@ -208,7 +208,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		stat->nlink = dentry->d_inode->i_nlink;
 
 out:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -242,7 +242,7 @@ int ovl_permission(struct inode *inode, int mask)
 		mask |= MAY_READ;
 	}
 	err = inode_permission(realinode, mask);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -259,7 +259,7 @@ static const char *ovl_get_link(struct dentry *dentry,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	p = vfs_get_link(ovl_dentry_real(dentry), done);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return p;
 }
 
@@ -302,7 +302,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 		WARN_ON(flags != XATTR_REPLACE);
 		err = vfs_removexattr(realdentry, name);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 out_drop_write:
 	ovl_drop_write(dentry);
@@ -320,7 +320,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_getxattr(realdentry, name, value, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return res;
 }
 
@@ -344,7 +344,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_listxattr(realdentry, list, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (res <= 0 || size == 0)
 		return res;
 
@@ -379,7 +379,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 
 	old_cred = ovl_override_creds(inode->i_sb);
 	acl = get_acl(realinode, type);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return acl;
 }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index c993dd8db739..c53e0b127332 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1024,7 +1024,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		OVL_I(inode)->redirect = upperredirect;
 	}
 
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
@@ -1043,7 +1043,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	kfree(upperredirect);
 out:
 	kfree(d.redirect);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return ERR_PTR(err);
 }
 
@@ -1097,7 +1097,7 @@ bool ovl_lower_positive(struct dentry *dentry)
 			dput(this);
 		}
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return positive;
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 7538b9b56237..81968e574264 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -195,6 +195,7 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
+void ovl_revert_creds(const struct cred *old_cred);
 struct super_block *ovl_same_sb(struct super_block *sb);
 int ovl_can_decode_fh(struct super_block *sb);
 struct dentry *ovl_indexdir(struct super_block *sb);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 41655a7d6894..ee4cc3802147 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -19,6 +19,7 @@ struct ovl_config {
 	bool index;
 	bool nfs_export;
 	int xino;
+	bool override_creds;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index ef1fe42ff7bb..150c7ee2f7f7 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
 		}
 		inode_unlock(dir->d_inode);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -906,7 +906,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	err = ovl_dir_read_merged(dentry, list, &root);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 704b37311467..9f1e0cc85d27 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -56,6 +56,12 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
 MODULE_PARM_DESC(ovl_xino_auto_def,
 		 "Auto enable xino feature");
 
+static bool __read_mostly ovl_default_override_creds =
+	IS_ENABLED(CONFIG_OVERLAY_FS_OVERRIDE_CREDS);
+module_param_named(override_creds, ovl_default_override_creds, bool, 0644);
+MODULE_PARM_DESC(override_creds,
+		 "Use mounter's credentials for accesses");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -376,6 +382,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 						"on" : "off");
 	if (ofs->config.xino != ovl_xino_def())
 		seq_printf(m, ",xino=%s", ovl_xino_str[ofs->config.xino]);
+	seq_show_option(m, "override_creds",
+			ofs->config.override_creds ? "on" : "off");
 	return 0;
 }
 
@@ -413,6 +421,8 @@ enum {
 	OPT_XINO_ON,
 	OPT_XINO_OFF,
 	OPT_XINO_AUTO,
+	OPT_OVERRIDE_CREDS_ON,
+	OPT_OVERRIDE_CREDS_OFF,
 	OPT_ERR,
 };
 
@@ -429,6 +439,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_ON,			"xino=on"},
 	{OPT_XINO_OFF,			"xino=off"},
 	{OPT_XINO_AUTO,			"xino=auto"},
+	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
+	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -485,6 +497,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
 		return -ENOMEM;
+	config->override_creds = ovl_default_override_creds;
 
 	while ((p = ovl_next_opt(&opt)) != NULL) {
 		int token;
@@ -555,6 +568,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->xino = OVL_XINO_AUTO;
 			break;
 
+		case OPT_OVERRIDE_CREDS_ON:
+			config->override_creds = true;
+			break;
+
+		case OPT_OVERRIDE_CREDS_OFF:
+			config->override_creds = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 6f1078028c66..0a59de9b4088 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
 
+	if (!ofs->config.override_creds)
+		return NULL;
 	return override_creds(ofs->creator_cred);
 }
 
+void ovl_revert_creds(const struct cred *old_cred)
+{
+	if (old_cred)
+		revert_creds(old_cred);
+}
+
 struct super_block *ovl_same_sb(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -630,7 +638,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked)
 	 * value relative to the upper inode nlink in an upper inode xattr.
 	 */
 	err = ovl_set_nlink_upper(dentry);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 out:
 	if (err)
@@ -650,7 +658,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked)
 
 			old_cred = ovl_override_creds(dentry->d_sb);
 			ovl_cleanup_index(dentry);
-			revert_creds(old_cred);
+			ovl_revert_creds(old_cred);
 		}
 
 		mutex_unlock(&OVL_I(d_inode(dentry))->lock);
-- 
2.18.0.rc2.346.g013aa6912e-goog
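
The pairing of ovl_override_creds() and ovl_revert_creds() in the
fs/overlayfs/util.c hunk relies on NULL as a "nothing was overridden"
sentinel.  The following is a minimal userspace model of that pattern,
not kernel code: struct cred, struct ovl_fs_model, model_current and the
model_* functions are stand-ins invented for illustration.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct cred; only identity matters here. */
struct cred {
	int id;
};

/* Stand-in for the relevant parts of struct ovl_fs. */
struct ovl_fs_model {
	int override_creds;			/* mirrors config.override_creds */
	const struct cred *creator_cred;	/* the mounter's credentials */
};

/* Models the current task's credentials; assumed never to be NULL. */
const struct cred *model_current;

/*
 * Models ovl_override_creds(): install the mounter's credentials and
 * return the previous ones, or return NULL when overriding is disabled.
 */
const struct cred *model_ovl_override_creds(const struct ovl_fs_model *ofs)
{
	const struct cred *old;

	if (!ofs->override_creds)
		return NULL;	/* nothing installed, nothing to revert */
	old = model_current;
	model_current = ofs->creator_cred;
	return old;
}

/* Models ovl_revert_creds(): restore only if an override took place. */
void model_ovl_revert_creds(const struct cred *old_cred)
{
	if (old_cred)
		model_current = old_cred;
}
```

The sentinel only works because the saved credentials of a running
caller are assumed never to be NULL; every call site must also keep
using the override/revert pair symmetrically.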



* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-22 17:16 [PATCH v4] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
@ 2018-06-23  6:46 ` Amir Goldstein
  2018-06-25 16:07   ` Mark Salyzyn
  2018-06-26 14:21   ` Vivek Goyal
  2018-06-25 12:38 ` Vivek Goyal
  1 sibling, 2 replies; 7+ messages in thread
From: Amir Goldstein @ 2018-06-23  6:46 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet, Vivek Goyal,
	Eric W . Biederman, Randy Dunlap, overlayfs, linux-doc

On Fri, Jun 22, 2018 at 8:16 PM, Mark Salyzyn <salyzyn@android.com> wrote:
> By default, all access to the upper, lower and work directories is
> performed with the recorded mounter's MAC and DAC credentials.  Incoming
> accesses are checked against the caller's credentials.
>
> If the principle of least privilege is applied, the mounter's
> credentials might not overlap with the caller's credentials when
> accessing the overlayfs filesystem.  For example, a file that a caller
> with lower DAC privileges can execute may be MAC-denied to the mounter,
> which generally has higher DAC privileges, to close off an attack vector.
>
> This patch adds an option to turn off override_creds in the mount
> options; all subsequent operations on the filesystem after mount then
> use only the caller's credentials.  The default for this option is set
> by CONFIG_OVERLAY_FS_OVERRIDE_CREDS or by the module parameter
> override_creds.
>
> The boolean module parameter override_creds also serves as a presence
> check for this "feature": user space can test for the existence of
> /sys/module/overlay/parameters/override_creds to determine whether the
> option can be supplied successfully to the mount(2) operation.
>
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index 72615a2c0752..18e6d70ea4c9 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -106,6 +106,23 @@ Only the lists of names from directories are merged.  Other content
>  such as metadata and extended attributes are reported for the upper
>  directory only.  These attributes of the lower directory are hidden.
>
> +credentials
> +-----------
> +
> +By default, all access to the upper, lower and work directories is
> +performed with the recorded mounter's MAC and DAC credentials.  Incoming
> +accesses are checked against the caller's credentials.
> +
> +If the principle of least privilege is applied, the mounter's
> +credentials might not overlap with the caller's credentials when
> +accessing the overlayfs filesystem.  For example, a file that a caller
> +with lower DAC privileges can execute may be MAC-denied to the mounter,
> +which generally has higher DAC privileges, to close off an attack vector.
> +One option is to turn off override_creds in the mount options; all
> +subsequent operations on the filesystem after mount then use only the
> +caller's credentials.  The default for this option is set by
> +CONFIG_OVERLAY_FS_OVERRIDE_CREDS or the module option override_creds.
> +

Mark,

Thanks for the properly documented patch, but this documentation is
missing the caveats of this config option, and there are severe caveats,
as was discussed on an earlier version of the patch.

You should mention the not so minor detail that this option can result
in an inability to delete files/directories from the overlay, and there
may be other side effects. This is one of those features that should warn
unconditionally that the user should really know what they are doing.

You did not address my concern that the test for setting trusted xattr
on mount (ovl_make_workdir) should emit a different kind of warning
when override_creds=off. In fact, I think it should emit a warning
when override_creds=off unconditionally to indicate that weird things
can be expected and we "really hope you know what you are doing".

A new security concern I just noticed - overlayfs calls some vfs
functions directly to perform operations that are typically not
allowed to unprivileged users without checking credentials.
In those cases your patch introduces a security vulnerability.

Examples:
- overlayfs calls exportfs_decode_fh() on underlying
fs without checking CAP_DAC_READ_SEARCH
- overlayfs calls vfs_whiteout() which calls underlying fs mknod
without checking CAP_MKNOD

Those examples could be easily fixed and you may rightfully
claim that they are bugs, but the fact is that those "bugs" are
harmless until someone creates an irregular security model
without capabilities to mount, without capability to mknod.

What's worse is that you have to audit the overlayfs code and
find all these potential bugs and fix them before changing the
assumptions that were made over the years about mounter
credentials.

Thanks,
Amir.


* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-22 17:16 [PATCH v4] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  2018-06-23  6:46 ` Amir Goldstein
@ 2018-06-25 12:38 ` Vivek Goyal
  2018-06-25 15:00   ` Mark Salyzyn
  2018-08-10 16:55   ` Mark Salyzyn
  1 sibling, 2 replies; 7+ messages in thread
From: Vivek Goyal @ 2018-06-25 12:38 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap, linux-unionfs,
	linux-doc

On Fri, Jun 22, 2018 at 10:16:02AM -0700, Mark Salyzyn wrote:
> By default, all access to the upper, lower and work directories is
> performed with the recorded mounter's MAC and DAC credentials.  Incoming
> accesses are checked against the caller's credentials.
> 
> If the principle of least privilege is applied, the mounter's
> credentials might not overlap with the caller's credentials when
> accessing the overlayfs filesystem.  For example, a file that a caller
> with lower DAC privileges can execute may be MAC-denied to the mounter,
> which generally has higher DAC privileges, to close off an attack vector.

Hi Mark,

I am wondering what it means for a caller to be privileged enough to do
mknod and set trusted xattrs but not privileged enough to do a mount.
If the caller is privileged, can't it do the mount as well?

Or, what does it mean that a mounter can mount (hence providing access
to certain resources on the system) but then mounter itself does not
have access to those resources. If mounter does not have access to
those resources, then mounter should not be allowed to do the mount
and provide access to those resources to a third person?

Take, for example, the SELinux context= mount option. Here the mounter
can create a mount point with label context=foo and provide access to the
underlying files/dirs to the caller. Now if the mounter itself does not
have access to the resources on which the mount is being created, how is
it supposed to provide that access to an unprivileged caller?

Going by your analogy of init being attacked, one simply has to
attack init and trick it into mounting something with context=foo to gain
access to resources the mounter itself could not access.

While my example is fully valid for disks, it is not fully valid for
overlay, as we do two levels of checks for many operations. So while the
overlay inode level check will pass due to context=, the underlying file
system check will fail. But these two levels of checks do not happen
outside overlay. SELinux is not aware of stacking of filesystems, so it
could just do the check on the overlay inode. So if a caller opens a file
and passes the file descriptor to another process that is not supposed to
access the file, then with context= mounts I think SELinux will allow
access, as the second process is allowed to access the overlay inode.

IOW, if the mounter is a separate process and the mounter itself cannot
access a certain resource, then it should not allow other, lower
privileged processes access to that resource (like SELinux context=
mounts). And I am concerned about how, once the later checks against the
mounter's creds are taken away, we ensure that privilege escalation did
not happen by tricking the mounter.

Thanks
Vivek

> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1024,7 +1024,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  		OVL_I(inode)->redirect = upperredirect;
>  	}
>  
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	dput(index);
>  	kfree(stack);
>  	kfree(d.redirect);
> @@ -1043,7 +1043,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	kfree(upperredirect);
>  out:
>  	kfree(d.redirect);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return ERR_PTR(err);
>  }
>  
> @@ -1097,7 +1097,7 @@ bool ovl_lower_positive(struct dentry *dentry)
>  			dput(this);
>  		}
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return positive;
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 7538b9b56237..81968e574264 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -195,6 +195,7 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +void ovl_revert_creds(const struct cred *oldcred);
>  struct super_block *ovl_same_sb(struct super_block *sb);
>  int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 41655a7d6894..ee4cc3802147 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -19,6 +19,7 @@ struct ovl_config {
>  	bool index;
>  	bool nfs_export;
>  	int xino;
> +	bool override_creds;
>  };
>  
>  struct ovl_sb {
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index ef1fe42ff7bb..150c7ee2f7f7 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
>  		}
>  		inode_unlock(dir->d_inode);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -906,7 +906,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	err = ovl_dir_read_merged(dentry, list, &root);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (err)
>  		return err;
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 704b37311467..9f1e0cc85d27 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -56,6 +56,12 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_xino_auto_def,
>  		 "Auto enable xino feature");
>  
> +static bool __read_mostly ovl_default_override_creds =
> +	IS_ENABLED(CONFIG_OVERLAY_FS_OVERRIDE_CREDS);
> +module_param_named(override_creds, ovl_default_override_creds, bool, 0644);
> +MODULE_PARM_DESC(ovl_default_override_creds,
> +		 "Use mounter's credentials for accesses");
> +
>  static void ovl_entry_stack_free(struct ovl_entry *oe)
>  {
>  	unsigned int i;
> @@ -376,6 +382,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  						"on" : "off");
>  	if (ofs->config.xino != ovl_xino_def())
>  		seq_printf(m, ",xino=%s", ovl_xino_str[ofs->config.xino]);
> +	seq_show_option(m, "override_creds",
> +			ofs->config.override_creds ? "on" : "off");
>  	return 0;
>  }
>  
> @@ -413,6 +421,8 @@ enum {
>  	OPT_XINO_ON,
>  	OPT_XINO_OFF,
>  	OPT_XINO_AUTO,
> +	OPT_OVERRIDE_CREDS_ON,
> +	OPT_OVERRIDE_CREDS_OFF,
>  	OPT_ERR,
>  };
>  
> @@ -429,6 +439,8 @@ static const match_table_t ovl_tokens = {
>  	{OPT_XINO_ON,			"xino=on"},
>  	{OPT_XINO_OFF,			"xino=off"},
>  	{OPT_XINO_AUTO,			"xino=auto"},
> +	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
> +	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -485,6 +497,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
>  	if (!config->redirect_mode)
>  		return -ENOMEM;
> +	config->override_creds = ovl_default_override_creds;
>  
>  	while ((p = ovl_next_opt(&opt)) != NULL) {
>  		int token;
> @@ -555,6 +568,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  			config->xino = OVL_XINO_AUTO;
>  			break;
>  
> +		case OPT_OVERRIDE_CREDS_ON:
> +			config->override_creds = true;
> +			break;
> +
> +		case OPT_OVERRIDE_CREDS_OFF:
> +			config->override_creds = false;
> +			break;
> +
>  		default:
>  			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>  			return -EINVAL;
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 6f1078028c66..0a59de9b4088 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
>  
> +	if (!ofs->config.override_creds)
> +		return NULL;
>  	return override_creds(ofs->creator_cred);
>  }
>  
> +void ovl_revert_creds(const struct cred *old_cred)
> +{
> +	if (old_cred)
> +		revert_creds(old_cred);
> +}
> +
>  struct super_block *ovl_same_sb(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
> @@ -630,7 +638,7 @@ int ovl_nlink_start(struct dentry *dentry, bool *locked)
>  	 * value relative to the upper inode nlink in an upper inode xattr.
>  	 */
>  	err = ovl_set_nlink_upper(dentry);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  out:
>  	if (err)
> @@ -650,7 +658,7 @@ void ovl_nlink_end(struct dentry *dentry, bool locked)
>  
>  			old_cred = ovl_override_creds(dentry->d_sb);
>  			ovl_cleanup_index(dentry);
> -			revert_creds(old_cred);
> +			ovl_revert_creds(old_cred);
>  		}
>  
>  		mutex_unlock(&OVL_I(d_inode(dentry))->lock);
> -- 
> 2.18.0.rc2.346.g013aa6912e-goog
> 


* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-25 12:38 ` Vivek Goyal
@ 2018-06-25 15:00   ` Mark Salyzyn
  2018-08-10 16:55   ` Mark Salyzyn
  1 sibling, 0 replies; 7+ messages in thread
From: Mark Salyzyn @ 2018-06-25 15:00 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap, linux-unionfs,
	linux-doc

NB: Amir has asked me to itemize all the gotchas and security 
concerns of this feature.

On 06/25/2018 05:38 AM, Vivek Goyal wrote:
> On Fri, Jun 22, 2018 at 10:16:02AM -0700, Mark Salyzyn wrote:
>> By default, all access to the upper, lower and work directories is the
>> recorded mounter's MAC and DAC credentials.  The incoming accesses are
>> checked against the caller's credentials.
>>
>> If the principles of least privilege are applied, the mounter's
>> credentials might not overlap the credentials of the caller's when
>> accessing the overlayfs filesystem.  For example, a file that a lower
>> DAC privileged caller can execute, is MAC denied to the generally
>> higher DAC privileged mounter, to prevent an attack vector.
> Hi Mark,
>
> I am wondering, what does it mean that caller is privileged enough to do
> mknod and set trusted xattrs but it does not have privileges to do mount.
> If caller is privileged, then it can do mount as well?
There is only one process, with one set of MAC and DAC security policy, 
to perform the mounting. There are multitudes of callers that have 
individually different and non-overlapping MAC and DAC security 
policies. Some can mount, do mknod, even set xattr; some cannot. In MAC 
(selinux) rules, mount, mknod and xattr can each be controlled 
individually. The issue encountered is that the mounter (init) does 
_not_ have uniform access privileges to all the files represented by 
the filesystem, and it cannot create device nodes. Another caller (adb 
root) will have mknod and xattr privileges, and yet another caller 
(system_server for example) has execute privileges for libraries that 
the mounter (init) does not.
> Or, what does it mean that a mounter can mount (hence providing access
> to certain resources on the system) but then mounter itself does not
> have access to those resources. If mounter does not have access to
> those resources, then mounter should not be allowed to do the mount
> and provide access to those resources to a third person?
Every resource has a MAC (selinux) label in the file system. For 
example, if the mounter has no vested interest in the ability to 
create, read or execute the resources, then the mounter will not be 
granted those rights. The labels grant a list of rights to a specific 
"third person", not just _any_ third person.
>
> For example, SELinux context= mount option. So here mounter can create
> a mount point with label context=foo, and provide access to underlying
> files/dirs to the caller. Now if mounter itself does not have access
> to resources on which mount is being created, then how it is supposed
> to provide that access to unprivileged caller?

We are not using that option and can't use it; it is really part of a 
less security-conscious policy and obviously does not work in Android's 
security model. If we did, and it had blanket rights to do so, the 
mounter could be granting "random third parties" rights to files that 
have some specific controlled contexts.

Are you telling me to use the option to open a strong privilege hole? 
I'd say context= is a far more serious security problem and we need to 
suppress its use (!). It is meant for mounting untrusted/removable 
disks, by labelling all contexts at the lowest point, for instance 
u:object_r:untrusted_file:s0, so that we get no surprises of a strong 
source context from the filesystem.
> Going by your analogy of init being attacked, then one simply have to
> attack init and trick it to mount something with context=foo and gain
> access to resources mounter itself could not access.
Yes, they could. Perhaps both our examples are reductio ad absurdum 
in their simplicity; but alas in Android there are _two_ inits, 
one that has limited access to _system_ resources, and another that 
has limited access to _vendor_ system resources, and neither is 
supposed to have blanket rights to the other's resources. Their 
overlayfs mounts will reflect the privileges.
> While my example is fully valid for disks, it is not fully valid for
> overlay as we do two level of checks for many operations. So while overlay
> inode level check will pass due to context=, underlying file system check
> will fail. But this two level of checks does not happen outside overlay.
> SELinux is not aware of stacking of filesystems so it could just do check
> on overlay inode. So if a caller opens a file and passes file descriptor
> to another process who is not supposed to access file, with context= mounts,
> I think SELinux will allow access as second process is allowed to access
> overlay inode.
Again, if we use context= mounts, the file privileges will be low, as in 
all applications, save for a trusted few with careful control, are 
blocked from u:object_r:untrusted_file:s0. In Android, when fds are 
passed around, the privilege of the caller will protect the fd from 
being abused by a third party. Obviously this allows the open privileges 
of the first caller to be bypassed for the second, but it will clearly 
block based on the source and target contexts for the file resource and 
the second caller's access privileges.
>
> IOW, if mounter is a separate process and if mounter itself can not
> access a certain resource, then it should not allow other lower privileged
> processes access to that resource. (Linux SELinux context= mounts). And
> I am concerned that by taking away checks for mounter's creds later, how
> do we ensure that privlege escalation did not happen by tricking mounter.
Again, context= is never to be used lightly, it must be at an untrusted 
label set.

Yes, a (limited) attack can be mounted(sic) by setting the privileges of 
the labels to u:object_r:init_exec:s0, but as stated before, init is 
only granted access to target contexts that it needs (eg: it cannot 
create device nodes, in /dev or anywhere else; that is granted to 
u:object_r:ueventd_exec:s0). Of course, no one is allowed to execute or 
context-transition for the init_exec label, so this behaviour I speak of 
is locked down 300 ways. There is no _root_ that also has all the DAC 
capabilities or blanket MAC privileges.
>
> Thanks
> Vivek
Sincerely -- Mark Salyzyn


* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-23  6:46 ` Amir Goldstein
@ 2018-06-25 16:07   ` Mark Salyzyn
  2018-06-26 14:21   ` Vivek Goyal
  1 sibling, 0 replies; 7+ messages in thread
From: Mark Salyzyn @ 2018-06-25 16:07 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet, Vivek Goyal,
	Eric W . Biederman, Randy Dunlap, overlayfs, linux-doc

On 06/22/2018 11:46 PM, Amir Goldstein wrote:
> Mark,
>
> Thanks for the properly documented patch, but this documentation it
> missing the caveats of this config option and there are severe caveats
> as was discussed on earlier version of the patch.
>
> You should mention the not so minor detail that this option can result
> in inability to delete files/directories from overlay and there me be other
> side effects. This is one of those features that should be warning
> unconditionally that user should really know what user is doing
Agreed, I would like to prevent it becoming a treatise ...

The upperdir tree should match the privileges of the lower tree, and in 
Android that is enforced by a hard-coded built-in map (fs_config for 
DAC, restorecon map for MAC) for the writing callers, which never causes 
unexpected adjustments (famous last words). The active 
(writer/creator) callers have _more_ privileges than init 
(creator/mounter), and are only available on development (userdebug) 
builds. All others are passive (readers) and, although less privileged 
than init, have demonstrable read MAC privs where init does not.
> You did not address my concern that the test for setting trusted xattr
> on mount (ovl_make_workdir) should emit a different kind of warning
> when override_creds=off. In fact, I think it should emit a warning
> when override_creds=off unconditionally to indicate that weird things
> can be expected and we "really hope you know what you are doing".
>
> A new security concern I just noticed - overlayfs calls some vfs
> functions directly to perform operations that are typically not
> allowed to unprivileged users without checking credentials.
> In those cases your patch introduces a security vulnerability.
>
> Examples:
> - overlayfs calls exportfs_decode_fh() on underlying
> fs without checking CAP_DAC_READ_SEARCH
> - overlayfs calls vfs_whiteout() which calls underlying fs mknod
> without checking CAP_MKNOD
>
> Those examples could be easily fixed and you may righfully
> claim that they are bugs, but the fact is that those "bugs" are
> harmless until someone creates an irregular security model
> without capabilities to mount, without capability to mknod.
>
> What's worse is that you have to audit the overlayfs code and
> find all these potential bugs and fix them before changing the
> assumptions that were made over the years about mounter
> credentials.
Thanks, _this_ is what a good review is all about. I will need a deeper 
dive (b/c I did not see these) into all the 'command paths' to determine 
any missed/assumed checks. In Android, all the 'caller' issues I have 
with the existing checks are passive (read), and I would _hate_ to be 
providing them (unchecked and assumed) DAC privileges. In Android it is 
simpler: they would not get past the first barriers to reach the 
internal assumed points in any case, but multilevel security _requires_ 
us to recheck. The active (create/write) callers are few and trusted, 
but _should_ be checked w/o assumption (eg: if 'adb push' is not granted 
CAP_MKNOD, it should be blocked).
> Thanks,
> Amir.

Sincerely -- Mark Salyzyn




* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-23  6:46 ` Amir Goldstein
  2018-06-25 16:07   ` Mark Salyzyn
@ 2018-06-26 14:21   ` Vivek Goyal
  1 sibling, 0 replies; 7+ messages in thread
From: Vivek Goyal @ 2018-06-26 14:21 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Mark Salyzyn, linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Randy Dunlap, overlayfs, linux-doc

On Sat, Jun 23, 2018 at 09:46:07AM +0300, Amir Goldstein wrote:
> On Fri, Jun 22, 2018 at 8:16 PM, Mark Salyzyn <salyzyn@android.com> wrote:
> > By default, all access to the upper, lower and work directories is the
> > recorded mounter's MAC and DAC credentials.  The incoming accesses are
> > checked against the caller's credentials.
> >
> > If the principles of least privilege are applied, the mounter's
> > credentials might not overlap the credentials of the caller's when
> > accessing the overlayfs filesystem.  For example, a file that a lower
> > DAC privileged caller can execute, is MAC denied to the generally
> > higher DAC privileged mounter, to prevent an attack vector.
> >
> > We add the option to turn off override_creds in the mount options; all
> > subsequent operations after mount on the filesystem will be only the
> > caller's credentials.  This option default is set in the CONFIG
> > OVERLAY_FS_OVERRIDE_CREDS or in the module option override_creds.
> >
> > The module boolean parameter and mount option override_creds is also
> > added as a presence check for this "feature" by checking existence of
> > /sys/module/overlay/parameters/overlay_creds.  This will allow user
> > space to determine if the option can be supplied successfully to the
> > mount(2) operation.
> >
> > Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> > Cc: Miklos Szeredi <miklos@szeredi.hu>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Eric W. Biederman <ebiederm@xmission.com>
> > Cc: Amir Goldstein <amir73il@gmail.com>
> > Cc: Randy Dunlap <rdunlap@infradead.org>
> > Cc: linux-unionfs@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> >
> > ---
> > v2:
> > - Forward port changed attr to stat, resulting in a build error.
> > - altered commit message.
> >
> > v3:
> > - Change name from caller_credentials / creator_credentials to the
> >   boolean override_creds.
> > - Changed from creator to mounter credentials.
> > - Updated and fortified the documentation.
> > - Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
> >
> > v4:
> > - spelling and grammar errors in text
> >
> >  Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
> >  fs/overlayfs/Kconfig                    | 20 ++++++++++++++++++++
> >  fs/overlayfs/copy_up.c                  |  2 +-
> >  fs/overlayfs/dir.c                      |  9 +++++----
> >  fs/overlayfs/inode.c                    | 16 ++++++++--------
> >  fs/overlayfs/namei.c                    |  6 +++---
> >  fs/overlayfs/overlayfs.h                |  1 +
> >  fs/overlayfs/ovl_entry.h                |  1 +
> >  fs/overlayfs/readdir.c                  |  4 ++--
> >  fs/overlayfs/super.c                    | 21 +++++++++++++++++++++
> >  fs/overlayfs/util.c                     | 12 ++++++++++--
> >  11 files changed, 89 insertions(+), 20 deletions(-)
> >
> > diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> > index 72615a2c0752..18e6d70ea4c9 100644
> > --- a/Documentation/filesystems/overlayfs.txt
> > +++ b/Documentation/filesystems/overlayfs.txt
> > @@ -106,6 +106,23 @@ Only the lists of names from directories are merged.  Other content
> >  such as metadata and extended attributes are reported for the upper
> >  directory only.  These attributes of the lower directory are hidden.
> >
> > +credentials
> > +-----------
> > +
> > +By default, all access to the upper, lower and work directories is the
> > +recorded mounter's MAC and DAC credentials.  The incoming accesses are
> > +checked against the caller's credentials.
> > +
> > +If the principles of least privilege are applied, the mounter's
> > +credentials might not overlap the credentials of the caller's when
> > +accessing the overlayfs filesystem.  For example, a file that a lower
> > +DAC privileged caller can execute, is MAC denied to the generally
> > +higher DAC privileged mounter, to prevent an attack vector.  One
> > +option is to turn off override_creds in the mount options; all
> > +subsequent operations after mount on the filesystem will be only the
> > +caller's credentials.  This option default is set in the CONFIG
> > +OVERLAY_FS_OVERRIDE_CREDS or in the module option override_creds.
> > +
> 
> Mark,
> 
> Thanks for the properly documented patch, but this documentation it
> missing the caveats of this config option and there are severe caveats
> as was discussed on earlier version of the patch.
> 
> You should mention the not so minor detail that this option can result
> in inability to delete files/directories from overlay and there me be other
> side effects. This is one of those features that should be warning
> unconditionally that user should really know what user is doing.
> 
> You did not address my concern that the test for setting trusted xattr
> on mount (ovl_make_workdir) should emit a different kind of warning
> when override_creds=off. In fact, I think it should emit a warning
> when override_creds=off unconditionally to indicate that weird things
> can be expected and we "really hope you know what you are doing".
> 
> A new security concern I just noticed - overlayfs calls some vfs
> functions directly to perform operations that are typically not
> allowed to unprivileged users without checking credentials.
> In those cases your patch introduces a security vulnerability.
> 
> Examples:
> - overlayfs calls exportfs_decode_fh() on underlying
> fs without checking CAP_DAC_READ_SEARCH
> - overlayfs calls vfs_whiteout() which calls underlying fs mknod
> without checking CAP_MKNOD
> 

This reminds me of another potential issue we discussed in the past.

That is, lookup() permissions inside a directory on lower and upper could
be different. A process might be allowed to search in upper but
not necessarily in lower, and that led to conflicts w.r.t. what the
semantics should be. Given overlay is providing a merged directory view,
should the caller still be able to search in the lower dir?

https://lkml.org/lkml/2016/2/24/541

I think initial approach was to create a variant where overlay ignored
search permission checks on lower dir.

commit 38b78a5f18584db6fa7441e0f4531b283b0e6725
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Wed May 11 01:16:37 2016 +0200

    ovl: ignore permissions on underlying lookup

And later we went back to using lookup_one_len() and this time we
switched to mounter's creds. So the idea was that as long as the mounter
is allowed to search, the caller gets to search in the lower dir.

commit c1b2cc1a765aff4df7b22abe6b66014236f73eba
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Fri Jul 29 12:05:22 2016 +0200

    ovl: check mounter creds on underlying lookup


I think with this patch set, this issue will resurface. Caller might have
permission to search in upper and not in lower.

Thanks
Vivek


> Those examples could be easily fixed and you may righfully
> claim that they are bugs, but the fact is that those "bugs" are
> harmless until someone creates an irregular security model
> without capabilities to mount, without capability to mknod.
> 
> What's worse is that you have to audit the overlayfs code and
> find all these potential bugs and fix them before changing the
> assumptions that were made over the years about mounter
> credentials.
> 
> Thanks,
> Amir.


* Re: [PATCH v4] overlayfs: override_creds=off option bypass creator_cred
  2018-06-25 12:38 ` Vivek Goyal
  2018-06-25 15:00   ` Mark Salyzyn
@ 2018-08-10 16:55   ` Mark Salyzyn
  1 sibling, 0 replies; 7+ messages in thread
From: Mark Salyzyn @ 2018-08-10 16:55 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap, linux-unionfs,
	linux-doc

Sorry for taking so long to respond, spent a month on holiday and have 
just caught back up to my task list ;-}

Thanks for your comments!

Summary:

1) Add more caveats as discussed here to documentation and possibly to 
Kconfig
2) This option provides the ability to mount a resource that the 
mounter does not have privileges for, but a third party does.
3) Ask if we should check CAP_SYS_PACCT (b/c it references smb mounts, 
making the caller actually read documentation; however, there may be a 
better candidate capability) of the mounter to be able to mount with 
this new option? Should MAC (sepolicy) checks be added (I say that 
without knowing the impact on the code or policy encoding)?

There is some repetition of the points below ...

On 06/25/2018 05:38 AM, Vivek Goyal wrote:
> On Fri, Jun 22, 2018 at 10:16:02AM -0700, Mark Salyzyn wrote:
>> By default, all access to the upper, lower and work directories is the
>> recorded mounter's MAC and DAC credentials.  The incoming accesses are
>> checked against the caller's credentials.
>>
>> If the principles of least privilege are applied, the mounter's
>> credentials might not overlap the credentials of the caller's when
>> accessing the overlayfs filesystem.  For example, a file that a lower
>> DAC privileged caller can execute, is MAC denied to the generally
>> higher DAC privileged mounter, to prevent an attack vector.
> Hi Mark,
>
> I am wondering, what does it mean that caller is privileged enough to do
> mknod and set trusted xattrs but it does not have privileges to do mount.
init has sepolicy privileges to mount; the caller accessing the file 
does not have sepolicy privileges to mount. init does not have 
privileges to make device nodes (hypothetical); ueventd does.
> If caller is privileged, then it can do mount as well?

Only if we granted it that right, but we do not. Only adb (userdebug 
builds), init and vold can do mounts.
> Or, what does it mean that a mounter can mount (hence providing access
> to certain resources on the system) but then mounter itself does not
> have access to those resources. If mounter does not have access to
> those resources, then mounter should not be allowed to do the mount
> and provide access to those resources to a third person?
True pedantically, but that is _exactly_ why we need this option, given 
that privileges are not nested and do not overlap, and we want to allow 
this. The option is basically _permission_ to mount, and provide access 
to the resources to a third person with a different privilege profile.

Do you propose we create a new MAC/DAC check surrounding the ability to 
apply this mount option? Should we check a capability (eg: CAP_SYS_PACCT, 
which permits mount/umount on new smb connections, or CAP_SYS_ADMIN, but 
that could be a dangerous privilege in the override_creds=on case)?
> For example, SELinux context= mount option. So here mounter can create
> a mount point with label context=foo, and provide access to underlying
> files/dirs to the caller. Now if mounter itself does not have access
> to resources on which mount is being created, then how it is supposed
> to provide that access to unprivileged caller?
mount has privilege on the resource (because it has system_filesystem 
label at the top), but a file underneath might have a different label?
> Going by your analogy of init being attacked, then one simply have to
> attack init and trick it to mount something with context=foo and gain
> access to resources mounter itself could not access.

All mounters represent an attack surface, so we _limit_ their 
capabilities to the absolute minimum needed, which is why we have 
non-overlapping access credentials between the mounter and those that 
use the resources underneath the mount. And overlayfs is unique as a big 
gaping security hole if we allow anyone other than critical system 
components to mount. More reasons for non-overlapping MAC.

> While my example is fully valid for disks, it is not fully valid for
> overlay as we do two level of checks for many operations. So while overlay
> inode level check will pass due to context=, underlying file system check
> will fail. But this two level of checks does not happen outside overlay.
> SELinux is not aware of stacking of filesystems so it could just do check
> on overlay inode. So if a caller opens a file and passes file descriptor
> to another process who is not supposed to access file, with context= mounts,
> I think SELinux will allow access as second process is allowed to access
> overlay inode.
Alas, in our specific Android case, where we are allowing readonly 
system partitions to be overlaid (as a merged set of contents from two 
system partitions to solve library search issues, or on userdebug 
developer builds where we actually allow workdir to replace contents), 
the _same_ security hole exists with overlay or not when a process (eg, 
via binder or unix domain socket) passes a file descriptor or inode 
reference around. Access to that reference still goes through sepolicy 
checking (eg: read or write checks MAC for target and source acceptable 
labels and paths).

In the generic case it concerns me, although I must admit I do not fully
understand this attack. I have only claimed that this option solves a
non-overlapping MAC/DAC issue, with caveats. For instance, do I need to
add a longer list of caveats in the Kconfig or Documentation so that
users are careful, or are there options that need to be enforced
orthogonally, or additional caps (as noted above) that need to be checked
if the mounter is adding this option?
> IOW, if mounter is a separate process and if mounter itself can not
> access a certain resource, then it should not allow other lower privileged
> processes access to that resource. (Linux SELinux context= mounts). And
> I am concerned that by taking away checks for mounter's creds later, how
> do we ensure that privilege escalation did not happen by tricking mounter.

I do not propose to remove the mounter's cred check, only to add the
option to turn it off. The default is to use the mounter's creds (unless
the Kconfig or module options override the default).
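
As a rough illustration of that precedence, here is a minimal userspace
sketch, not code from the patch. It assumes the Kconfig value seeds the
module parameter and that an explicit mount option wins over both. The
names OVL_OVERRIDE_CREDS_DEFAULT, ovl_override_creds_param and
resolve_override_creds() are hypothetical, and accepting "on" as a value
is an assumption (the patch description only mentions override_creds=off).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_OVERLAY_FS_OVERRIDE_CREDS (assumed on). */
#define OVL_OVERRIDE_CREDS_DEFAULT true

/* Hypothetical model of the module parameter, seeded from the Kconfig value. */
static bool ovl_override_creds_param = OVL_OVERRIDE_CREDS_DEFAULT;

/*
 * Resolve the effective setting for one mount.
 * opt is the text after "override_creds=", or NULL if the option is absent.
 * Returns 0 and stores the result in *out, or -1 for an unrecognised value.
 */
static int resolve_override_creds(const char *opt, bool *out)
{
	if (opt == NULL) {
		/* No mount option: fall back to the module-wide default. */
		*out = ovl_override_creds_param;
		return 0;
	}
	if (strcmp(opt, "on") == 0) {
		*out = true;	/* use the mounter's creds */
		return 0;
	}
	if (strcmp(opt, "off") == 0) {
		*out = false;	/* use only the caller's creds */
		return 0;
	}
	return -1;
}
```

With the assumed defaults, a mount without the option keeps using the
mounter's creds, and only an explicit override_creds=off (or a changed
module default) switches to caller-only checks.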

On 06/26/2018 07:21 AM, Vivek Goyal wrote:
> On Sat, Jun 23, 2018 at 09:46:07AM +0300, Amir Goldstein wrote:
>> Mark,
>>
>> Thanks for the properly documented patch, but this documentation is
>> missing the caveats of this config option and there are severe caveats
>> as was discussed on earlier version of the patch.
>>
>> You should mention the not so minor detail that this option can result
>> in inability to delete files/directories from overlay and there may be other
>> side effects. This is one of those features that should be warning
>> unconditionally that user should really know what user is doing.
>>
>> You did not address my concern that the test for setting trusted xattr
>> on mount (ovl_make_workdir) should emit a different kind of warning
>> when override_creds=off. In fact, I think it should emit a warning
>> when override_creds=off unconditionally to indicate that weird things
>> can be expected and we "really hope you know what you are doing".
>>
>> A new security concern I just noticed - overlayfs calls some vfs
>> functions directly to perform operations that are typically not
>> allowed to unprivileged users without checking credentials.
>> In those cases your patch introduces a security vulnerability.
>>
>> Examples:
>> - overlayfs calls exportfs_decode_fh() on underlying
>> fs without checking CAP_DAC_READ_SEARCH
>> - overlayfs calls vfs_whiteout() which calls underlying fs mknod
>> without checking CAP_MKNOD
>>
I will have to work on that.
> This reminds me of another potential issue we discussed in the past.
>
> That is lookup() permissions inside a directory on lower and upper could
> be different. That is a process might be allowed to search in upper but
> not necessarily in lower and that lead to conflicts w.r.t what should be
> the semantics. Given overlay is providing merged directory view,
> should caller still be able to search in lower dir.
>
> https://lkml.org/lkml/2016/2/24/541
>
> I think initial approach was to create a variant where overlay ignored
> search permission checks on lower dir.
>
> commit 38b78a5f18584db6fa7441e0f4531b283b0e6725
> Author: Miklos Szeredi <mszeredi@redhat.com>
> Date:   Wed May 11 01:16:37 2016 +0200
>
>      ovl: ignore permissions on underlying lookup
>
> And later we went back to using lookup_one_len() and this time we
> switched to mounter's creds. So idea was that as long as mounter is
> allowed to search, caller gets to search in lower dir.
>
> commit c1b2cc1a765aff4df7b22abe6b66014236f73eba
> Author: Miklos Szeredi <mszeredi@redhat.com>
> Date:   Fri Jul 29 12:05:22 2016 +0200
>
>      ovl: check mounter creds on underlying lookup
>
>
> I think with this patch set, this issue will resurface. Caller might have
> permission to search in upper and not in lower.
I would hope that if a file is created in upper, the directory has
exactly the same privileges as lower, thus the restriction to search is the
same. Yes, a privileged caller could come in later and permit search and
thus create a situation where updated content can be searched, but not
static content in the lower. I believe this to be acceptable(tm)?
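
The situation can be sketched with a toy userspace model, which is not
overlayfs code. It covers only the underlying lookup step and assumes
that with override_creds on the search permission is evaluated for the
mounter, and with it off for the caller. struct cred_model, the LAYER_*
flags and visible_layers() are hypothetical names for illustration.

```c
#include <stdbool.h>

/* Hypothetical per-credential view of one merged directory. */
struct cred_model {
	bool may_search_upper;	/* search permission on the upper dir */
	bool may_search_lower;	/* search permission on the lower dir */
};

#define LAYER_UPPER 0x1u
#define LAYER_LOWER 0x2u

/*
 * Return a bitmask of the layers whose entries a lookup can reach.
 * With override_creds on, the underlying lookup is checked against the
 * mounter; with it off, against the caller.
 */
static unsigned int visible_layers(const struct cred_model *caller,
				   const struct cred_model *mounter,
				   bool override_creds)
{
	const struct cred_model *c = override_creds ? mounter : caller;
	unsigned int mask = 0;

	if (c->may_search_upper)
		mask |= LAYER_UPPER;
	if (c->may_search_lower)
		mask |= LAYER_LOWER;
	return mask;
}
```

For a caller that may search upper but not lower, and a mounter that may
search both, the model yields both layers with override_creds on and only
LAYER_UPPER with it off, which is the "updated content can be searched,
but not static content in the lower" case.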



end of thread, other threads:[~2018-08-10 16:55 UTC | newest]

Thread overview: 7+ messages
2018-06-22 17:16 [PATCH v4] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
2018-06-23  6:46 ` Amir Goldstein
2018-06-25 16:07   ` Mark Salyzyn
2018-06-26 14:21   ` Vivek Goyal
2018-06-25 12:38 ` Vivek Goyal
2018-06-25 15:00   ` Mark Salyzyn
2018-08-10 16:55   ` Mark Salyzyn
