linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 1/2] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh
@ 2018-11-06 23:01 Mark Salyzyn
  2018-11-06 23:01 ` [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Salyzyn @ 2018-11-06 23:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Salyzyn, Miklos Szeredi, Jonathan Corbet, Vivek Goyal,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

Assumption never checked, should fail if the mounter creds are not
sufficient.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v8:
- rebase

v7:
- This time for realz

v6:
- rebase

v5:
- dependency of "overlayfs: override_creds=off option bypass creator_cred"

 fs/overlayfs/namei.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index efd372312ef1..3ac9dc8f6cc0 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -163,6 +163,9 @@ struct dentry *ovl_decode_real_fh(struct ovl_fh *fh, struct vfsmount *mnt,
 	if (!uuid_equal(&fh->uuid, &mnt->mnt_sb->s_uuid))
 		return NULL;
 
+	if (!capable(CAP_DAC_READ_SEARCH))
+		return ERR_PTR(-EPERM);
+
 	bytes = (fh->len - offsetof(struct ovl_fh, fid));
 	real = exportfs_decode_fh(mnt, (struct fid *)fh->fid,
 				  bytes >> 2, (int)fh->type,
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-06 23:01 [PATCH v8 1/2] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
@ 2018-11-06 23:01 ` Mark Salyzyn
  2018-11-08 15:56   ` Vivek Goyal
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mark Salyzyn @ 2018-11-06 23:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Salyzyn, Miklos Szeredi, Jonathan Corbet, Vivek Goyal,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

By default, all access to the upper, lower and work directories is the
recorded mounter's MAC and DAC credentials.  The incoming accesses are
checked against the caller's credentials.

If the principles of least privilege are applied, the mounter's
credentials might not overlap the credentials of the caller's when
accessing the overlayfs filesystem.  For example, a file that a lower
DAC privileged caller can execute, is MAC denied to the generally
higher DAC privileged mounter, to prevent an attack vector.

We add the option to turn off override_creds in the mount options; all
subsequent operations after mount on the filesystem will be only the
caller's credentials.  The module boolean parameter and mount option
override_creds is also added as a presence check for this "feature",
existence of /sys/module/overlay/parameters/override_creds.

It was not always this way.  Circa 4.4 there was no recorded mounter's
credentials, instead privileged access to upper or work directories
were temporarily increased to perform the operations.  The MAC
(selinux) policies were caller's in all cases.  override_creds=off
partially returns us to this older access model minus the insecure
temporary credential increases.  This is to permit use in a system
with non-overlapping security models for each executable including
the agent that mounts the overlayfs filesystem.  In Android
this is the case since init, which performs the mount operations,
has a minimal MAC set of privileges to reduce any attack surface,
and services that use the content have a different set of MAC
privileges (eg: read, for vendor labelled configuration, execute for
vendor libraries and modules).

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v8:
- drop pr_warn message after straw poll to remove it.
- added a use case in the commit message

v7:
- change name of internal parameter to ovl_override_creds_def
- report override_creds only if different than default

v6:
- Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
- Do better with the documentation.
- pr_warn message adjusted to report consequences.

v5:
- beefed up the caveats in the Documentation
- Is dependent on
  "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
  "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
- Added prwarn when override_creds=off

v4:
- spelling and grammar errors in text

v3:
- Change name from caller_credentials / creator_credentials to the
  boolean override_creds.
- Changed from creator to mounter credentials.
- Updated and fortified the documentation.
- Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS

v2:
- Forward port changed attr to stat, resulting in a build error.
- altered commit message.

 Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
 fs/overlayfs/copy_up.c                  |  2 +-
 fs/overlayfs/dir.c                      |  9 +++++----
 fs/overlayfs/inode.c                    | 16 ++++++++--------
 fs/overlayfs/namei.c                    |  6 +++---
 fs/overlayfs/overlayfs.h                |  1 +
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/readdir.c                  |  4 ++--
 fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
 fs/overlayfs/util.c                     | 12 ++++++++++--
 10 files changed, 69 insertions(+), 21 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index eef7d9d259e8..5cc299df4436 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -102,6 +102,23 @@ Only the lists of names from directories are merged.  Other content
 such as metadata and extended attributes are reported for the upper
 directory only.  These attributes of the lower directory are hidden.
 
+credentials
+-----------
+
+By default, all access to the upper, lower and work directories is the
+recorded mounter's MAC and DAC credentials.  The incoming accesses are
+checked against the caller's credentials.
+
+override_creds mount flag turned off is reserved for when mounter and
+caller MAC or DAC credentials do not overlap.  Several unintended side
+effects will occur.  The caller with a lower privilege will not be
+able to delete files or directories, create nodes, or search some
+directories.  The caller with higher privilege can perform unexpected
+or unsecured operations.  The uneven security model where upperdir
+and workdir are opened at privilege, but accessed without, should only
+be used with strict understanding of the side effects and of the
+security policies.
+
 whiteouts and opaque directories
 --------------------------------
 
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 9e62dcf06fc4..dfab62ce7504 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -860,7 +860,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		dput(parent);
 		dput(next);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index c6289147c787..b7052e23c467 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -566,7 +566,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		override_cred->fsgid = inode->i_gid;
 		if (!attr->hardlink) {
 			err = security_dentry_create_files_as(dentry,
-					attr->mode, &dentry->d_name, old_cred,
+					attr->mode, &dentry->d_name,
+					old_cred ? old_cred : current_cred(),
 					override_cred);
 			if (err) {
 				put_cred(override_cred);
@@ -582,7 +583,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 			err = ovl_create_over_whiteout(dentry, inode, attr);
 	}
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return err;
 }
 
@@ -842,7 +843,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 		err = ovl_remove_upper(dentry, is_dir, &list);
 	else
 		err = ovl_remove_and_whiteout(dentry, &list);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (!err) {
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
@@ -1212,7 +1213,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 out_unlock:
 	unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (update_nlink)
 		ovl_nlink_end(new);
 out_drop_write:
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 6bcc9dedc342..192f5508ed45 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -64,7 +64,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		inode_lock(upperdentry->d_inode);
 		old_cred = ovl_override_creds(dentry->d_sb);
 		err = notify_change(upperdentry, attr, NULL);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 		if (!err)
 			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
 		inode_unlock(upperdentry->d_inode);
@@ -260,7 +260,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		stat->nlink = dentry->d_inode->i_nlink;
 
 out:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -303,7 +303,7 @@ int ovl_permission(struct inode *inode, int mask)
 
 	old_cred = ovl_override_creds(inode->i_sb);
 	err = inode_permission(realinode, mask);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -320,7 +320,7 @@ static const char *ovl_get_link(struct dentry *dentry,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	p = vfs_get_link(ovl_dentry_real(dentry), done);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return p;
 }
 
@@ -363,7 +363,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 		WARN_ON(flags != XATTR_REPLACE);
 		err = vfs_removexattr(realdentry, name);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	/* copy c/mtime */
 	ovl_copyattr(d_inode(realdentry), inode);
@@ -384,7 +384,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_getxattr(realdentry, name, value, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return res;
 }
 
@@ -408,7 +408,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_listxattr(realdentry, list, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (res <= 0 || size == 0)
 		return res;
 
@@ -443,7 +443,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 
 	old_cred = ovl_override_creds(inode->i_sb);
 	acl = get_acl(realinode, type);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return acl;
 }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 3ac9dc8f6cc0..c32fa8ed72e6 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1072,7 +1072,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out_free_oe;
 	}
 
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (origin_path) {
 		dput(origin_path->dentry);
 		kfree(origin_path);
@@ -1099,7 +1099,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	kfree(upperredirect);
 out:
 	kfree(d.redirect);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return ERR_PTR(err);
 }
 
@@ -1153,7 +1153,7 @@ bool ovl_lower_positive(struct dentry *dentry)
 			dput(this);
 		}
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return positive;
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 5e45cb3630a0..6f8b6f9ff357 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -208,6 +208,7 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
+void ovl_revert_creds(const struct cred *oldcred);
 struct super_block *ovl_same_sb(struct super_block *sb);
 int ovl_can_decode_fh(struct super_block *sb);
 struct dentry *ovl_indexdir(struct super_block *sb);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ec237035333a..e38eea8104be 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -20,6 +20,7 @@ struct ovl_config {
 	bool nfs_export;
 	int xino;
 	bool metacopy;
+	bool override_creds;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index cc8303a806b4..ec591b49e902 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
 		}
 		inode_unlock(dir->d_inode);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -921,7 +921,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	err = ovl_dir_read_merged(dentry, list, &root);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 0116735cc321..933829ca7d7d 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -56,6 +56,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
 MODULE_PARM_DESC(ovl_xino_auto_def,
 		 "Auto enable xino feature");
 
+static bool __read_mostly ovl_override_creds_def = true;
+module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
+MODULE_PARM_DESC(ovl_override_creds_def,
+		 "Use mounter's credentials for accesses");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.metacopy != ovl_metacopy_def)
 		seq_printf(m, ",metacopy=%s",
 			   ofs->config.metacopy ? "on" : "off");
+	if (ofs->config.override_creds != ovl_override_creds_def)
+		seq_show_option(m, "override_creds",
+				ofs->config.override_creds ? "on" : "off");
 	return 0;
 }
 
@@ -401,6 +409,8 @@ enum {
 	OPT_XINO_AUTO,
 	OPT_METACOPY_ON,
 	OPT_METACOPY_OFF,
+	OPT_OVERRIDE_CREDS_ON,
+	OPT_OVERRIDE_CREDS_OFF,
 	OPT_ERR,
 };
 
@@ -419,6 +429,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_AUTO,			"xino=auto"},
 	{OPT_METACOPY_ON,		"metacopy=on"},
 	{OPT_METACOPY_OFF,		"metacopy=off"},
+	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
+	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -477,6 +489,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
 		return -ENOMEM;
+	config->override_creds = ovl_override_creds_def;
 
 	while ((p = ovl_next_opt(&opt)) != NULL) {
 		int token;
@@ -557,6 +570,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->metacopy = false;
 			break;
 
+		case OPT_OVERRIDE_CREDS_ON:
+			config->override_creds = true;
+			break;
+
+		case OPT_OVERRIDE_CREDS_OFF:
+			config->override_creds = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -1549,7 +1570,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
-
 	return 0;
 
 out_free_oe:
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7c01327b1852..484d7f76ac9c 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
 
+	if (!ofs->config.override_creds)
+		return NULL;
 	return override_creds(ofs->creator_cred);
 }
 
+void ovl_revert_creds(const struct cred *old_cred)
+{
+	if (old_cred)
+		revert_creds(old_cred);
+}
+
 struct super_block *ovl_same_sb(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -782,7 +790,7 @@ int ovl_nlink_start(struct dentry *dentry)
 	 * value relative to the upper inode nlink in an upper inode xattr.
 	 */
 	err = ovl_set_nlink_upper(dentry);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 out:
 	if (err)
@@ -800,7 +808,7 @@ void ovl_nlink_end(struct dentry *dentry)
 
 		old_cred = ovl_override_creds(dentry->d_sb);
 		ovl_cleanup_index(dentry);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 	}
 
 	ovl_inode_unlock(inode);
-- 
2.19.1.930.g4563a0d9d0-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-06 23:01 ` [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
@ 2018-11-08 15:56   ` Vivek Goyal
  2018-11-08 20:01   ` Vivek Goyal
  2018-11-14 22:00   ` Vivek Goyal
  2 siblings, 0 replies; 10+ messages in thread
From: Vivek Goyal @ 2018-11-08 15:56 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
> By default, all access to the upper, lower and work directories is the
> recorded mounter's MAC and DAC credentials.  The incoming accesses are
> checked against the caller's credentials.
> 
> If the principles of least privilege are applied, the mounter's
> credentials might not overlap the credentials of the caller's when
> accessing the overlayfs filesystem.  For example, a file that a lower
> DAC privileged caller can execute, is MAC denied to the generally
> higher DAC privileged mounter, to prevent an attack vector.
> 
> We add the option to turn off override_creds in the mount options; all
> subsequent operations after mount on the filesystem will be only the
> caller's credentials.  The module boolean parameter and mount option
> override_creds is also added as a presence check for this "feature",
> existence of /sys/module/overlay/parameters/override_creds.
> 
> It was not always this way.  Circa 4.4 there was no recorded mounter's
> credentials, instead privileged access to upper or work directories
> were temporarily increased to perform the operations.  The MAC
> (selinux) policies were caller's in all cases.  override_creds=off
> partially returns us to this older access model minus the insecure
> temporary credential increases.


So what will have to bunch of overlay operations now which require
priviliges or additional capabilities. To make all that succeed,
you will have to give those capabilities to processes accessing
overlayfs. If yes, I am wondering how is that more secure. IOW,
having to give these capabilities to all processes accessing
fs as opposed to giving those capabilities to only mounter.

You also mentioned that you want init to be least priviliged as
it could be attacked. But now all the processses are priviliged
and these could be attacked as well.

The way I am looking at it as follows.

- With current model, only mounter needs to have capabilities to
  do some operations such as whiteout creation etc.

- With override_creds=off, all services accessing overlayfs needs
  to have increased priviliges.

Is that not increasing attack surface. Previously, you will have to
attach init and now you can get away by breaking one of the services.

/me is wondering how does that increase security.

Thanks
Vivek


> This is to permit use in a system
> with non-overlapping security models for each executable including
> the agent that mounts the overlayfs filesystem.  In Android
> this is the case since init, which performs the mount operations,
> has a minimal MAC set of privileges to reduce any attack surface,
> and services that use the content have a different set of MAC
> privileges (eg: read, for vendor labelled configuration, execute for
> vendor libraries and modules).
> 
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v8:
> - drop pr_warn message after straw poll to remove it.
> - added a use case in the commit message
> 
> v7:
> - change name of internal parameter to ovl_override_creds_def
> - report override_creds only if different than default
> 
> v6:
> - Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
> - Do better with the documentation.
> - pr_warn message adjusted to report consequences.
> 
> v5:
> - beefed up the caveats in the Documentation
> - Is dependent on
>   "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
>   "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
> - Added prwarn when override_creds=off
> 
> v4:
> - spelling and grammar errors in text
> 
> v3:
> - Change name from caller_credentials / creator_credentials to the
>   boolean override_creds.
> - Changed from creator to mounter credentials.
> - Updated and fortified the documentation.
> - Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
> 
> v2:
> - Forward port changed attr to stat, resulting in a build error.
> - altered commit message.
> 
>  Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
>  fs/overlayfs/copy_up.c                  |  2 +-
>  fs/overlayfs/dir.c                      |  9 +++++----
>  fs/overlayfs/inode.c                    | 16 ++++++++--------
>  fs/overlayfs/namei.c                    |  6 +++---
>  fs/overlayfs/overlayfs.h                |  1 +
>  fs/overlayfs/ovl_entry.h                |  1 +
>  fs/overlayfs/readdir.c                  |  4 ++--
>  fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
>  fs/overlayfs/util.c                     | 12 ++++++++++--
>  10 files changed, 69 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index eef7d9d259e8..5cc299df4436 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -102,6 +102,23 @@ Only the lists of names from directories are merged.  Other content
>  such as metadata and extended attributes are reported for the upper
>  directory only.  These attributes of the lower directory are hidden.
>  
> +credentials
> +-----------
> +
> +By default, all access to the upper, lower and work directories is the
> +recorded mounter's MAC and DAC credentials.  The incoming accesses are
> +checked against the caller's credentials.
> +
> +override_creds mount flag turned off is reserved for when mounter and
> +caller MAC or DAC credentials do not overlap.  Several unintended side
> +effects will occur.  The caller with a lower privilege will not be
> +able to delete files or directories, create nodes, or search some
> +directories.  The caller with higher privilege can perform unexpected
> +or unsecured operations.  The uneven security model where upperdir
> +and workdir are opened at privilege, but accessed without, should only
> +be used with strict understanding of the side effects and of the
> +security policies.
> +
>  whiteouts and opaque directories
>  --------------------------------
>  
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 9e62dcf06fc4..dfab62ce7504 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -860,7 +860,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
>  		dput(parent);
>  		dput(next);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index c6289147c787..b7052e23c467 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -566,7 +566,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  		override_cred->fsgid = inode->i_gid;
>  		if (!attr->hardlink) {
>  			err = security_dentry_create_files_as(dentry,
> -					attr->mode, &dentry->d_name, old_cred,
> +					attr->mode, &dentry->d_name,
> +					old_cred ? old_cred : current_cred(),
>  					override_cred);
>  			if (err) {
>  				put_cred(override_cred);
> @@ -582,7 +583,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  			err = ovl_create_over_whiteout(dentry, inode, attr);
>  	}
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return err;
>  }
>  
> @@ -842,7 +843,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
>  		err = ovl_remove_upper(dentry, is_dir, &list);
>  	else
>  		err = ovl_remove_and_whiteout(dentry, &list);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (!err) {
>  		if (is_dir)
>  			clear_nlink(dentry->d_inode);
> @@ -1212,7 +1213,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>  out_unlock:
>  	unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (update_nlink)
>  		ovl_nlink_end(new);
>  out_drop_write:
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 6bcc9dedc342..192f5508ed45 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -64,7 +64,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  		inode_lock(upperdentry->d_inode);
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		err = notify_change(upperdentry, attr, NULL);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  		if (!err)
>  			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
>  		inode_unlock(upperdentry->d_inode);
> @@ -260,7 +260,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		stat->nlink = dentry->d_inode->i_nlink;
>  
>  out:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -303,7 +303,7 @@ int ovl_permission(struct inode *inode, int mask)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	err = inode_permission(realinode, mask);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -320,7 +320,7 @@ static const char *ovl_get_link(struct dentry *dentry,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	p = vfs_get_link(ovl_dentry_real(dentry), done);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return p;
>  }
>  
> @@ -363,7 +363,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>  		WARN_ON(flags != XATTR_REPLACE);
>  		err = vfs_removexattr(realdentry, name);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	/* copy c/mtime */
>  	ovl_copyattr(d_inode(realdentry), inode);
> @@ -384,7 +384,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_getxattr(realdentry, name, value, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return res;
>  }
>  
> @@ -408,7 +408,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_listxattr(realdentry, list, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (res <= 0 || size == 0)
>  		return res;
>  
> @@ -443,7 +443,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	acl = get_acl(realinode, type);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return acl;
>  }
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 3ac9dc8f6cc0..c32fa8ed72e6 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1072,7 +1072,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			goto out_free_oe;
>  	}
>  
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (origin_path) {
>  		dput(origin_path->dentry);
>  		kfree(origin_path);
> @@ -1099,7 +1099,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	kfree(upperredirect);
>  out:
>  	kfree(d.redirect);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return ERR_PTR(err);
>  }
>  
> @@ -1153,7 +1153,7 @@ bool ovl_lower_positive(struct dentry *dentry)
>  			dput(this);
>  		}
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return positive;
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 5e45cb3630a0..6f8b6f9ff357 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -208,6 +208,7 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +void ovl_revert_creds(const struct cred *oldcred);
>  struct super_block *ovl_same_sb(struct super_block *sb);
>  int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index ec237035333a..e38eea8104be 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -20,6 +20,7 @@ struct ovl_config {
>  	bool nfs_export;
>  	int xino;
>  	bool metacopy;
> +	bool override_creds;
>  };
>  
>  struct ovl_sb {
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index cc8303a806b4..ec591b49e902 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
>  		}
>  		inode_unlock(dir->d_inode);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -921,7 +921,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	err = ovl_dir_read_merged(dentry, list, &root);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (err)
>  		return err;
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 0116735cc321..933829ca7d7d 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -56,6 +56,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_xino_auto_def,
>  		 "Auto enable xino feature");
>  
> +static bool __read_mostly ovl_override_creds_def = true;
> +module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_override_creds_def,
> +		 "Use mounter's credentials for accesses");
> +
>  static void ovl_entry_stack_free(struct ovl_entry *oe)
>  {
>  	unsigned int i;
> @@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  	if (ofs->config.metacopy != ovl_metacopy_def)
>  		seq_printf(m, ",metacopy=%s",
>  			   ofs->config.metacopy ? "on" : "off");
> +	if (ofs->config.override_creds != ovl_override_creds_def)
> +		seq_show_option(m, "override_creds",
> +				ofs->config.override_creds ? "on" : "off");
>  	return 0;
>  }
>  
> @@ -401,6 +409,8 @@ enum {
>  	OPT_XINO_AUTO,
>  	OPT_METACOPY_ON,
>  	OPT_METACOPY_OFF,
> +	OPT_OVERRIDE_CREDS_ON,
> +	OPT_OVERRIDE_CREDS_OFF,
>  	OPT_ERR,
>  };
>  
> @@ -419,6 +429,8 @@ static const match_table_t ovl_tokens = {
>  	{OPT_XINO_AUTO,			"xino=auto"},
>  	{OPT_METACOPY_ON,		"metacopy=on"},
>  	{OPT_METACOPY_OFF,		"metacopy=off"},
> +	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
> +	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -477,6 +489,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
>  	if (!config->redirect_mode)
>  		return -ENOMEM;
> +	config->override_creds = ovl_override_creds_def;
>  
>  	while ((p = ovl_next_opt(&opt)) != NULL) {
>  		int token;
> @@ -557,6 +570,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  			config->metacopy = false;
>  			break;
>  
> +		case OPT_OVERRIDE_CREDS_ON:
> +			config->override_creds = true;
> +			break;
> +
> +		case OPT_OVERRIDE_CREDS_OFF:
> +			config->override_creds = false;
> +			break;
> +
>  		default:
>  			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>  			return -EINVAL;
> @@ -1549,7 +1570,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		       ovl_dentry_lower(root_dentry), NULL);
>  
>  	sb->s_root = root_dentry;
> -
>  	return 0;
>  
>  out_free_oe:
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7c01327b1852..484d7f76ac9c 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
>  
> +	if (!ofs->config.override_creds)
> +		return NULL;
>  	return override_creds(ofs->creator_cred);
>  }
>  
> +void ovl_revert_creds(const struct cred *old_cred)
> +{
> +	if (old_cred)
> +		revert_creds(old_cred);
> +}
> +
>  struct super_block *ovl_same_sb(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
> @@ -782,7 +790,7 @@ int ovl_nlink_start(struct dentry *dentry)
>  	 * value relative to the upper inode nlink in an upper inode xattr.
>  	 */
>  	err = ovl_set_nlink_upper(dentry);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  out:
>  	if (err)
> @@ -800,7 +808,7 @@ void ovl_nlink_end(struct dentry *dentry)
>  
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		ovl_cleanup_index(dentry);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  	}
>  
>  	ovl_inode_unlock(inode);
> -- 
> 2.19.1.930.g4563a0d9d0-goog
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-06 23:01 ` [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  2018-11-08 15:56   ` Vivek Goyal
@ 2018-11-08 20:01   ` Vivek Goyal
       [not found]     ` <b057da99-adc6-b355-fb57-b314a29f298f@android.com>
  2018-11-14 22:00   ` Vivek Goyal
  2 siblings, 1 reply; 10+ messages in thread
From: Vivek Goyal @ 2018-11-08 20:01 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
> By default, all access to the upper, lower and work directories is the
> recorded mounter's MAC and DAC credentials.  The incoming accesses are
> checked against the caller's credentials.

Ok, I am trying to think of scenarios where override_creds=off can 
provide any privilege escalation. How about following.

$ mkdir lower lower/foo upper upper/foo work merged
$ touch lower/foo/bar.txt
$ chmod 700 lower/foo/

# Mount overlay with override_creds=off

$ mount -t overlay -o
lowerdir=lower,upperdir=upper,workdir=work,override_creds=off none merged

# Try to read lower/foo as unpriviliged user. Say "test"
# su test
# ls merged/foo/
ls: cannot access 'merged/foo/': Operation not permitted

# Now first try to do same operation as root and retry as test user.
$ exit
$ ls merged/foo
bar.txt
$ su test
$ ls merged/foo
bar.txt

lower/foo/ is not readable by user "test". So it fails in first try. Later
"root" accesses it and it populates cache in overlayfs. When test retries,
it gets these entries from cache.

With override_creds=on this is not a problem because overlay provides
this as functionality as long as mounter as access to lower/foo/.

But with override_creds=off, mounter is not providing any such
functionality and we are exposing an issue where cache will make
something available which is not normally available.

I think it probably is a good idea to do something about it?

Thanks
Vivek


> 
> If the principles of least privilege are applied, the mounter's
> credentials might not overlap the credentials of the caller's when
> accessing the overlayfs filesystem.  For example, a file that a lower
> DAC privileged caller can execute, is MAC denied to the generally
> higher DAC privileged mounter, to prevent an attack vector.
> 
> We add the option to turn off override_creds in the mount options; all
> subsequent operations after mount on the filesystem will be only the
> caller's credentials.  The module boolean parameter and mount option
> override_creds is also added as a presence check for this "feature",
> existence of /sys/module/overlay/parameters/override_creds.
> 
> It was not always this way.  Circa 4.4 there was no recorded mounter's
> credentials, instead privileged access to upper or work directories
> were temporarily increased to perform the operations.  The MAC
> (selinux) policies were caller's in all cases.  override_creds=off
> partially returns us to this older access model minus the insecure
> temporary credential increases.  This is to permit use in a system
> with non-overlapping security models for each executable including
> the agent that mounts the overlayfs filesystem.  In Android
> this is the case since init, which performs the mount operations,
> has a minimal MAC set of privileges to reduce any attack surface,
> and services that use the content have a different set of MAC
> privileges (eg: read, for vendor labelled configuration, execute for
> vendor libraries and modules).
> 
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v8:
> - drop pr_warn message after straw poll to remove it.
> - added a use case in the commit message
> 
> v7:
> - change name of internal parameter to ovl_override_creds_def
> - report override_creds only if different than default
> 
> v6:
> - Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
> - Do better with the documentation.
> - pr_warn message adjusted to report consequences.
> 
> v5:
> - beefed up the caveats in the Documentation
> - Is dependent on
>   "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
>   "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
> - Added prwarn when override_creds=off
> 
> v4:
> - spelling and grammar errors in text
> 
> v3:
> - Change name from caller_credentials / creator_credentials to the
>   boolean override_creds.
> - Changed from creator to mounter credentials.
> - Updated and fortified the documentation.
> - Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
> 
> v2:
> - Forward port changed attr to stat, resulting in a build error.
> - altered commit message.
> 
>  Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
>  fs/overlayfs/copy_up.c                  |  2 +-
>  fs/overlayfs/dir.c                      |  9 +++++----
>  fs/overlayfs/inode.c                    | 16 ++++++++--------
>  fs/overlayfs/namei.c                    |  6 +++---
>  fs/overlayfs/overlayfs.h                |  1 +
>  fs/overlayfs/ovl_entry.h                |  1 +
>  fs/overlayfs/readdir.c                  |  4 ++--
>  fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
>  fs/overlayfs/util.c                     | 12 ++++++++++--
>  10 files changed, 69 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index eef7d9d259e8..5cc299df4436 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -102,6 +102,23 @@ Only the lists of names from directories are merged.  Other content
>  such as metadata and extended attributes are reported for the upper
>  directory only.  These attributes of the lower directory are hidden.
>  
> +credentials
> +-----------
> +
> +By default, all access to the upper, lower and work directories is the
> +recorded mounter's MAC and DAC credentials.  The incoming accesses are
> +checked against the caller's credentials.
> +
> +override_creds mount flag turned off is reserved for when mounter and
> +caller MAC or DAC credentials do not overlap.  Several unintended side
> +effects will occur.  The caller with a lower privilege will not be
> +able to delete files or directories, create nodes, or search some
> +directories.  The caller with higher privilege can perform unexpected
> +or unsecured operations.  The uneven security model where upperdir
> +and workdir are opened at privilege, but accessed without, should only
> +be used with strict understanding of the side effects and of the
> +security policies.
> +
>  whiteouts and opaque directories
>  --------------------------------
>  
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 9e62dcf06fc4..dfab62ce7504 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -860,7 +860,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
>  		dput(parent);
>  		dput(next);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index c6289147c787..b7052e23c467 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -566,7 +566,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  		override_cred->fsgid = inode->i_gid;
>  		if (!attr->hardlink) {
>  			err = security_dentry_create_files_as(dentry,
> -					attr->mode, &dentry->d_name, old_cred,
> +					attr->mode, &dentry->d_name,
> +					old_cred ? old_cred : current_cred(),
>  					override_cred);
>  			if (err) {
>  				put_cred(override_cred);
> @@ -582,7 +583,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  			err = ovl_create_over_whiteout(dentry, inode, attr);
>  	}
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return err;
>  }
>  
> @@ -842,7 +843,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
>  		err = ovl_remove_upper(dentry, is_dir, &list);
>  	else
>  		err = ovl_remove_and_whiteout(dentry, &list);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (!err) {
>  		if (is_dir)
>  			clear_nlink(dentry->d_inode);
> @@ -1212,7 +1213,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>  out_unlock:
>  	unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (update_nlink)
>  		ovl_nlink_end(new);
>  out_drop_write:
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 6bcc9dedc342..192f5508ed45 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -64,7 +64,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  		inode_lock(upperdentry->d_inode);
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		err = notify_change(upperdentry, attr, NULL);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  		if (!err)
>  			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
>  		inode_unlock(upperdentry->d_inode);
> @@ -260,7 +260,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		stat->nlink = dentry->d_inode->i_nlink;
>  
>  out:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -303,7 +303,7 @@ int ovl_permission(struct inode *inode, int mask)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	err = inode_permission(realinode, mask);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -320,7 +320,7 @@ static const char *ovl_get_link(struct dentry *dentry,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	p = vfs_get_link(ovl_dentry_real(dentry), done);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return p;
>  }
>  
> @@ -363,7 +363,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>  		WARN_ON(flags != XATTR_REPLACE);
>  		err = vfs_removexattr(realdentry, name);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	/* copy c/mtime */
>  	ovl_copyattr(d_inode(realdentry), inode);
> @@ -384,7 +384,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_getxattr(realdentry, name, value, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return res;
>  }
>  
> @@ -408,7 +408,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_listxattr(realdentry, list, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (res <= 0 || size == 0)
>  		return res;
>  
> @@ -443,7 +443,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	acl = get_acl(realinode, type);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return acl;
>  }
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 3ac9dc8f6cc0..c32fa8ed72e6 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1072,7 +1072,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			goto out_free_oe;
>  	}
>  
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (origin_path) {
>  		dput(origin_path->dentry);
>  		kfree(origin_path);
> @@ -1099,7 +1099,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	kfree(upperredirect);
>  out:
>  	kfree(d.redirect);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return ERR_PTR(err);
>  }
>  
> @@ -1153,7 +1153,7 @@ bool ovl_lower_positive(struct dentry *dentry)
>  			dput(this);
>  		}
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return positive;
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 5e45cb3630a0..6f8b6f9ff357 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -208,6 +208,7 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +void ovl_revert_creds(const struct cred *oldcred);
>  struct super_block *ovl_same_sb(struct super_block *sb);
>  int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index ec237035333a..e38eea8104be 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -20,6 +20,7 @@ struct ovl_config {
>  	bool nfs_export;
>  	int xino;
>  	bool metacopy;
> +	bool override_creds;
>  };
>  
>  struct ovl_sb {
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index cc8303a806b4..ec591b49e902 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
>  		}
>  		inode_unlock(dir->d_inode);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -921,7 +921,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	err = ovl_dir_read_merged(dentry, list, &root);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (err)
>  		return err;
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 0116735cc321..933829ca7d7d 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -56,6 +56,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_xino_auto_def,
>  		 "Auto enable xino feature");
>  
> +static bool __read_mostly ovl_override_creds_def = true;
> +module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_override_creds_def,
> +		 "Use mounter's credentials for accesses");
> +
>  static void ovl_entry_stack_free(struct ovl_entry *oe)
>  {
>  	unsigned int i;
> @@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  	if (ofs->config.metacopy != ovl_metacopy_def)
>  		seq_printf(m, ",metacopy=%s",
>  			   ofs->config.metacopy ? "on" : "off");
> +	if (ofs->config.override_creds != ovl_override_creds_def)
> +		seq_show_option(m, "override_creds",
> +				ofs->config.override_creds ? "on" : "off");
>  	return 0;
>  }
>  
> @@ -401,6 +409,8 @@ enum {
>  	OPT_XINO_AUTO,
>  	OPT_METACOPY_ON,
>  	OPT_METACOPY_OFF,
> +	OPT_OVERRIDE_CREDS_ON,
> +	OPT_OVERRIDE_CREDS_OFF,
>  	OPT_ERR,
>  };
>  
> @@ -419,6 +429,8 @@ static const match_table_t ovl_tokens = {
>  	{OPT_XINO_AUTO,			"xino=auto"},
>  	{OPT_METACOPY_ON,		"metacopy=on"},
>  	{OPT_METACOPY_OFF,		"metacopy=off"},
> +	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
> +	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -477,6 +489,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
>  	if (!config->redirect_mode)
>  		return -ENOMEM;
> +	config->override_creds = ovl_override_creds_def;
>  
>  	while ((p = ovl_next_opt(&opt)) != NULL) {
>  		int token;
> @@ -557,6 +570,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  			config->metacopy = false;
>  			break;
>  
> +		case OPT_OVERRIDE_CREDS_ON:
> +			config->override_creds = true;
> +			break;
> +
> +		case OPT_OVERRIDE_CREDS_OFF:
> +			config->override_creds = false;
> +			break;
> +
>  		default:
>  			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>  			return -EINVAL;
> @@ -1549,7 +1570,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		       ovl_dentry_lower(root_dentry), NULL);
>  
>  	sb->s_root = root_dentry;
> -
>  	return 0;
>  
>  out_free_oe:
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7c01327b1852..484d7f76ac9c 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
>  
> +	if (!ofs->config.override_creds)
> +		return NULL;
>  	return override_creds(ofs->creator_cred);
>  }
>  
> +void ovl_revert_creds(const struct cred *old_cred)
> +{
> +	if (old_cred)
> +		revert_creds(old_cred);
> +}
> +
>  struct super_block *ovl_same_sb(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
> @@ -782,7 +790,7 @@ int ovl_nlink_start(struct dentry *dentry)
>  	 * value relative to the upper inode nlink in an upper inode xattr.
>  	 */
>  	err = ovl_set_nlink_upper(dentry);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  out:
>  	if (err)
> @@ -800,7 +808,7 @@ void ovl_nlink_end(struct dentry *dentry)
>  
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		ovl_cleanup_index(dentry);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  	}
>  
>  	ovl_inode_unlock(inode);
> -- 
> 2.19.1.930.g4563a0d9d0-goog
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
       [not found]     ` <b057da99-adc6-b355-fb57-b314a29f298f@android.com>
@ 2018-11-09  3:05       ` Amir Goldstein
  2018-11-09 17:32         ` Mark Salyzyn
  2018-11-09 15:20       ` Vivek Goyal
  1 sibling, 1 reply; 10+ messages in thread
From: Amir Goldstein @ 2018-11-09  3:05 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: Vivek Goyal, linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W. Biederman, Randy Dunlap, Stephen Smalley, overlayfs,
	linux-doc, kernel-team

On Thu, Nov 8, 2018 at 11:28 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> On 11/08/2018 12:01 PM, Vivek Goyal wrote:
>
> On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
>
> By default, all access to the upper, lower and work directories is the
> recorded mounter's MAC and DAC credentials.  The incoming accesses are
> checked against the caller's credentials.
>
> Ok, I am trying to think of scenarios where override_creds=off can
> provide any privilege escalation. How about following.
>
> $ mkdir lower lower/foo upper upper/foo work merged
> $ touch lower/foo/bar.txt
> $ chmod 700 lower/foo/
>
> # Mount overlay with override_creds=off
>
> $ mount -t overlay -o
> lowerdir=lower,upperdir=upper,workdir=work,override_creds=off none merged
>
> # Try to read lower/foo as unpriviliged user. Say "test"
> # su test
> # ls merged/foo/
> ls: cannot access 'merged/foo/': Operation not permitted
>
> # Now first try to do same operation as root and retry as test user.
> $ exit
> $ ls merged/foo
> bar.txt
> $ su test
> $ ls merged/foo
> bar.txt
>
> lower/foo/ is not readable by user "test". So it fails in first try. Later
> "root" accesses it and it populates cache in overlayfs. When test retries,
> it gets these entries from cache.
>
> With override_creds=on this is not a problem because overlay provides
> this as functionality as long as mounter as access to lower/foo/.
>
> But with override_creds=off, mounter is not providing any such
> functionality and we are exposing an issue where cache will make
> something available which is not normally available.
>
> I think it probably is a good idea to do something about it?
>
> Thanks
> Vivek
>
> Good stuff.
>
> That sounds like a bug in cache (!) to not recheck caller's credentials. Currently unsure how/where to force bypass of the cache (performance hit) as it is wired in throughout the code without a clear off switch, or rechecking of the credentials at access. This does need to be addressed to make this 'feature' more useful/trusted for non-MAC controlled, use cases.
>
> This is not a problem in the Android usage case since DAC is simple and all can be read, execute bits might be controlled, the owners and perms are otherwise unremarkable in the affected arenas that are utilizing overlayfs. Not using it for a generic r/w backing except in rooted debug scenarios by developers, otherwise everything is r/o, MAC on the other hand is complex and heavily inspected. We do, however, want multi level security in that both DAC and MAC can be trusted and can protect each other from holes.
>
> Sounds like the caveats in the documentation need to be expanded if _no_ solution for this kind of access pattern becomes apparent.
>

I think maybe you could append the "behavior of the overlay is undefined" clause
to the caveats to cover issues like the one raised by Vivek.

Mark,

I have some Android internals background, so I have a general
understanding of the
use case, but I can understand why people have a hard time connecting to the
motivation, thinking "their security model must be flawed".

I am not sure if you are avoiding laying out the details of the model
because you
are not allowed to expose details or because you feel details may confuse us.
If the latter, then I think that actually listing the details of the
overlays used in Android
and some concrete examples of access policies to those overlays could
benefit the
discussion on the feature.
To clarify, this only a suggestion. I have no objection to the patch.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
       [not found]     ` <b057da99-adc6-b355-fb57-b314a29f298f@android.com>
  2018-11-09  3:05       ` Amir Goldstein
@ 2018-11-09 15:20       ` Vivek Goyal
  1 sibling, 0 replies; 10+ messages in thread
From: Vivek Goyal @ 2018-11-09 15:20 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

On Thu, Nov 08, 2018 at 01:28:32PM -0800, Mark Salyzyn wrote:
> On 11/08/2018 12:01 PM, Vivek Goyal wrote:
> > On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
> > > By default, all access to the upper, lower and work directories is the
> > > recorded mounter's MAC and DAC credentials.  The incoming accesses are
> > > checked against the caller's credentials.
> > Ok, I am trying to think of scenarios where override_creds=off can
> > provide any privilege escalation. How about following.
> > 
> > $ mkdir lower lower/foo upper upper/foo work merged
> > $ touch lower/foo/bar.txt
> > $ chmod 700 lower/foo/
> > 
> > # Mount overlay with override_creds=off
> > 
> > $ mount -t overlay -o
> > lowerdir=lower,upperdir=upper,workdir=work,override_creds=off none merged
> > 
> > # Try to read lower/foo as unpriviliged user. Say "test"
> > # su test
> > # ls merged/foo/
> > ls: cannot access 'merged/foo/': Operation not permitted
> > 
> > # Now first try to do same operation as root and retry as test user.
> > $ exit
> > $ ls merged/foo
> > bar.txt
> > $ su test
> > $ ls merged/foo
> > bar.txt
> > 
> > lower/foo/ is not readable by user "test". So it fails in first try. Later
> > "root" accesses it and it populates cache in overlayfs. When test retries,
> > it gets these entries from cache.
> > 
> > With override_creds=on this is not a problem because overlay provides
> > this as functionality as long as mounter as access to lower/foo/.
> > 
> > But with override_creds=off, mounter is not providing any such
> > functionality and we are exposing an issue where cache will make
> > something available which is not normally available.
> > 
> > I think it probably is a good idea to do something about it?
> > 
> > Thanks
> > Vivek
> > 
> Good stuff.
> 
> That sounds like a bug in cache (!) to not recheck caller's credentials.
> Currently unsure how/where to force bypass of the cache (performance hit) as
> it is wired in throughout the code without a clear off switch, or rechecking
> of the credentials at access. This does need to be addressed to make this
> 'feature' more useful/trusted for non-MAC controlled, use cases.

DAC is just an example. There is no reason same issue will not happen
with MAC? Proacess A with correct MAC priviliges will fill overlay
cache and process B without correct MAC priviliges will still be able
to get information about dentry.

As Amir suggested, for now documenting this probably is fine. I can't
think of any other good options either. May be Miklos has some ideas here.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-09  3:05       ` Amir Goldstein
@ 2018-11-09 17:32         ` Mark Salyzyn
  2018-11-10  7:51           ` Amir Goldstein
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Salyzyn @ 2018-11-09 17:32 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Vivek Goyal, linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W. Biederman, Randy Dunlap, Stephen Smalley, overlayfs,
	linux-doc, kernel-team

On 11/08/2018 07:05 PM, Amir Goldstein wrote:
> On Thu, Nov 8, 2018 at 11:28 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> On 11/08/2018 12:01 PM, Vivek Goyal wrote:
>>
>> On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
>>
>> By default, all access to the upper, lower and work directories is the
>> recorded mounter's MAC and DAC credentials.  The incoming accesses are
>> checked against the caller's credentials.
>>
>> Ok, I am trying to think of scenarios where override_creds=off can
>> provide any privilege escalation. How about following.
>>
>> $ mkdir lower lower/foo upper upper/foo work merged
>> $ touch lower/foo/bar.txt
>> $ chmod 700 lower/foo/
>>
>> # Mount overlay with override_creds=off
>>
>> $ mount -t overlay -o
>> lowerdir=lower,upperdir=upper,workdir=work,override_creds=off none merged
>>
>> # Try to read lower/foo as unpriviliged user. Say "test"
>> # su test
>> # ls merged/foo/
>> ls: cannot access 'merged/foo/': Operation not permitted
>>
>> # Now first try to do same operation as root and retry as test user.
>> $ exit
>> $ ls merged/foo
>> bar.txt
>> $ su test
>> $ ls merged/foo
>> bar.txt
>>
>> lower/foo/ is not readable by user "test". So it fails in first try. Later
>> "root" accesses it and it populates cache in overlayfs. When test retries,
>> it gets these entries from cache.
>>
>> With override_creds=on this is not a problem because overlay provides
>> this as functionality as long as mounter as access to lower/foo/.
>>
>> But with override_creds=off, mounter is not providing any such
>> functionality and we are exposing an issue where cache will make
>> something available which is not normally available.
>>
>> I think it probably is a good idea to do something about it?
>>
>> Thanks
>> Vivek
>>
>> Good stuff.
>>
>> That sounds like a bug in cache (!) to not recheck caller's credentials. Currently unsure how/where to force bypass of the cache (performance hit) as it is wired in throughout the code without a clear off switch, or rechecking of the credentials at access. This does need to be addressed to make this 'feature' more useful/trusted for non-MAC controlled, use cases.
>>
>> This is not a problem in the Android usage case since DAC is simple and all can be read, execute bits might be controlled, the owners and perms are otherwise unremarkable in the affected arenas that are utilizing overlayfs. Not using it for a generic r/w backing except in rooted debug scenarios by developers, otherwise everything is r/o, MAC on the other hand is complex and heavily inspected. We do, however, want multi level security in that both DAC and MAC can be trusted and can protect each other from holes.
>>
>> Sounds like the caveats in the documentation need to be expanded if _no_ solution for this kind of access pattern becomes apparent.
>>
> I think maybe you could append the "behavior of the overlay is undefined" clause
> to the caveats to cover issues like the one raised by Vivek.

Will respin with adjustment to documentation and head towards v9.

I would prefer to have a complete solution to the non-overlapping 
security model,
and maybe in time it will come, but admittedly only if motivated to use 
overlayfs on
Android's userdata (see below) where all these issues would need to be 
solved.

Consider this (far simpler) patch a shot across the bow that an 
overlayfs use case was
removed in 4.6, and for completeness at least we would like to see it 
back. Would the ideal
be that the creator's credentials are only used to facilitate some 
workdir or r/w upperdir
operations, and user credentials are rechecked in more places in cache. From
my perspective feels like whack a mole for a few iterations. In that 
case, this option is
the start of those iterations "if you build it they will come"?

The first one would be an investigation on how to solve Vivek's scenario 
in the face of this change? (not volunteering today to do that, 
admitting to a shortcoming, and a need for a deeper understanding of the 
code for the moment)
>
> Mark,
>
> I have some Android internals background, so I have a general
> understanding of the
> use case, but I can understand why people have a hard time connecting to the
> motivation, thinking "their security model must be flawed".
>
> I am not sure if you are avoiding laying out the details of the model
> because you
> are not allowed to expose details or because you feel details may confuse us.

I am not a "great communicator"(tm), probably only 50K vocabulary,
propensity towards quantum leaps, so yes, I was worried about confusion.
non-overlapping security model is the key takeaway I feel.

[TL;DR]

In Android there are two use cases this covers:

1) On userdebug (rooted development) builds, adb remount feature
    for readonly filesystems which include squashfs, ext4 dedupe,
    and any right-sized (zero space left over) filesystems.  In these cases
    the system will resort to utilizing overlayfs, and allow for
    updated content to a scratch backing storage.
2) On operating system updates where new Hardware Abstraction
    layers have been added and the vendor/oem supplied components
    supplied to an older release.  In this corner case the operating
    system update may carefully select overrides that are merged into
    the vendor partition content directories as hosted by the
    operating system partition.

The sepolicy model can be browsed at 
https://android.googlesource.com/platform/system/sepolicy/.

In the first use case above the possible insecurity is tempered by the 
debug nature
of the system and the lurking big elephant in the room privilege 
escalation possibility
(/system/bin/su existing), in the other a r/o and precise MAC. sepolicy 
and credentials
will rule over transitions from one security domain to another for 
execution, vendor components are managed by a separate vendor_init and 
the actual xattr content is constant.

Being able to block reading a file or directory is not a big concern in 
the associated trees, because all of the originals are backed by a r/o 
filesystem image, the content
is generally visible by all, and if not they are locally restricted 
views into a public filesystem image, that themselves can be mounted on 
a desktop. There are no privilege escalation or privacy issues.

An Android use case this does not cover securely is when overlayfs is 
used as a
snapshot of the users data during the update process to permit easy 
rollback of user data due to an
update failure (very unlikely 0.01%, but alas there are tachyons with 
our names on them).
For those use cases we have opted to add snapshot-ting to
f2fs and ext4 (using dm-bow in another patch set under discussion for 
the time being)
and abandoned overlayfs as insufficient in a dynamic non-overlapping DAC 
and MAC security model.
Built in filesystem snapshot-ting is always preferred for performance,
efficiency and access to the free block pools they maintain so that was an
easy choice.

> If the latter, then I think that actually listing the details of the
> overlays used in Android
> and some concrete examples of access policies to those overlays could
> benefit the
> discussion on the feature.
> To clarify, this only a suggestion. I have no objection to the patch.
>
> Thanks,
> Amir.

-- Mark



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-09 17:32         ` Mark Salyzyn
@ 2018-11-10  7:51           ` Amir Goldstein
  2018-11-12 18:15             ` Mark Salyzyn
  0 siblings, 1 reply; 10+ messages in thread
From: Amir Goldstein @ 2018-11-10  7:51 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: Vivek Goyal, linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W. Biederman, Randy Dunlap, Stephen Smalley, overlayfs,
	linux-doc, kernel-team, Paul Lawrence, Theodore Tso

On Fri, Nov 9, 2018 at 7:32 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> On 11/08/2018 07:05 PM, Amir Goldstein wrote:
> >
> > Mark,
> >
> > I have some Android internals background, so I have a general
> > understanding of the
> > use case, but I can understand why people have a hard time connecting to the
> > motivation, thinking "their security model must be flawed".
> >
> > I am not sure if you are avoiding laying out the details of the model
> > because you
> > are not allowed to expose details or because you feel details may confuse us.
>
> I am not a "great communicator"(tm), probably only 50K vocabulary,
> propensity towards quantum leaps, so yes, I was worried about confusion.
> non-overlapping security model is the key takeaway I feel.
>

I hope my comment below will serve as an example why explaining your use
case is preferred to presenting the abstract and generalized problem.
And still - no objections to your current patch, *because* it can solve your
use cases and *because* we don't need to deal with the  abstract and
generalized problem.

> [TL;DR]
>
> In Android there are two use cases this covers:
>
> 1) On userdebug (rooted development) builds, adb remount feature
>     for readonly filesystems which include squashfs, ext4 dedupe,
>     and any right-sized (zero space left over) filesystems.  In these cases
>     the system will resort to utilizing overlayfs, and allow for
>     updated content to a scratch backing storage.
> 2) On operating system updates where new Hardware Abstraction
>     layers have been added and the vendor/oem supplied components
>     supplied to an older release.  In this corner case the operating
>     system update may carefully select overrides that are merged into
>     the vendor partition content directories as hosted by the
>     operating system partition.
>
> The sepolicy model can be browsed at
> https://android.googlesource.com/platform/system/sepolicy/.
>
> In the first use case above the possible insecurity is tempered by the
> debug nature
> of the system and the lurking big elephant in the room privilege
> escalation possibility
> (/system/bin/su existing),

Since you already have an elephant in the room, might as well use it
to mount overlay. I am guessing most of the work in developer mode is
with sepolicy disabled anyway?
Essentially, adb root/adb remount means gloves are off.
Although with overlayed adb remount gloves could be put back on
when you relinquish the overlay and get back to original /system mount.

> in the other a r/o and precise MAC. sepolicy
> and credentials
> will rule over transitions from one security domain to another for
> execution, vendor components are managed by a separate vendor_init and
> the actual xattr content is constant.
>

For this use case, you don't need an upper layer at all and you probably
don't use lower layers redirect_dir. Right?
So all the concerns about get/set trusted overlay xattrs and detecting
opaque directories (do you need those?) are moot.
If you propose that override_creds=off can only be enabled
on lower-only overlayfs, the caveats section of your documentation
would shrink down considerably to the point that it may even be
comprehend-able to mortal users and I don't think you will see much
resistant from overlayfs developers to that "safe side" approach.

> Being able to block reading a file or directory is not a big concern in
> the associated trees, because all of the originals are backed by a r/o
> filesystem image, the content
> is generally visible by all, and if not they are locally restricted
> views into a public filesystem image, that themselves can be mounted on
> a desktop. There are no privilege escalation or privacy issues.
>
> An Android use case this does not cover securely is when overlayfs is
> used as a
> snapshot of the users data during the update process to permit easy
> rollback of user data due to an
> update failure (very unlikely 0.01%, but alas there are tachyons with
> our names on them).
> For those use cases we have opted to add snapshot-ting to
> f2fs and ext4 (using dm-bow in another patch set under discussion for
> the time being)

Very interesting... that's a use case of overlayfs snapshots [1] I was
not aware of.
With regular overlayfs, mounter creds are used to create the "master" copy
while "origin" is left unchanged. With overlayfs snapshots, the "master" copy
is modified by user while the "backup" copy is created with the snapshot
mounter credentials.

> and abandoned overlayfs as insufficient in a dynamic non-overlapping DAC
> and MAC security model.

Essentially, dm-bow target has the same capabilities as overlayfs driver with
mounter creds. When creating the backup copies of modified blocks, it does
not consult the per file granular security policy - not sure if that
is a security
concern? (cc: Paul Lawrence)

As a matter of fact, because the overlayfs snapshots "backup repository"
is a directory tree, granular per file security policy could be applied during
BOW and/or during commit.

> Built in filesystem snapshot-ting is always preferred for performance,
> efficiency and access to the free block pools they maintain so that was an
> easy choice.
>

That's a bold statement and not entirely true.

Overlayfs (as do overlayfs snapshots) is much more efficient in page cache
usage (for unmodified files) than btrfs snapshots and dm thinp.
There is no issue with accessing the "free blocks pool" when the snapshot
solution is at the vfs level (as opposed to block level).
In fact, these two characteristics are probably the main thing going for
overlayfs as a choice for container storage driver.

Nevertheless, it doesn't seem like the Android userdata rollback feature
should care about efficiency of snapshot access at all.

All in all, I think that dm-bow is a neat solution for the specific use case,
but do not confuse it with "built-in filesystem snapshot support".
Getting there with ext4 is much harder. I know because I tried... [2].

Thanks,
Amir.


[1] https://github.com/amir73il/overlayfs/wiki/Overlayfs-snapshots
[2] https://github.com/amir73il/next3-utils/wiki/Ext4-snapshots-TODO

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-10  7:51           ` Amir Goldstein
@ 2018-11-12 18:15             ` Mark Salyzyn
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Salyzyn @ 2018-11-12 18:15 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Vivek Goyal, linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W. Biederman, Randy Dunlap, Stephen Smalley, overlayfs,
	linux-doc, kernel-team, Paul Lawrence, Theodore Tso

On 11/09/2018 11:51 PM, Amir Goldstein wrote:
> On Fri, Nov 9, 2018 at 7:32 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> On 11/08/2018 07:05 PM, Amir Goldstein wrote:
>>> Mark,
>>>
>>> I have some Android internals background, so I have a general
>>> understanding of the
>>> use case, but I can understand why people have a hard time connecting to the
>>> motivation, thinking "their security model must be flawed".
>>>
>>> I am not sure if you are avoiding laying out the details of the model
>>> because you
>>> are not allowed to expose details or because you feel details may confuse us.
>> I am not a "great communicator"(tm), probably only 50K vocabulary,
>> propensity towards quantum leaps, so yes, I was worried about confusion.
>> non-overlapping security model is the key takeaway I feel.
>>
> I hope my comment below will serve as an example why explaining your use
> case is preferred to presenting the abstract and generalized problem.
> And still - no objections to your current patch, *because* it can solve your
> use cases and *because* we don't need to deal with the  abstract and
> generalized problem.
Alas, I personally object to changes to the kernel that have a limited 
(well, 1.8billion devices is not limited) use case. If they can benefit 
others, they become more useful. This paradigm helps prevent spaghetti 
and areas of the kernel that are not well understood and subject to bitrot.
>> [TL;DR]
>>
>> In Android there are two use cases this covers:
>>
>> 1) On userdebug (rooted development) builds, adb remount feature
>>      for readonly filesystems which include squashfs, ext4 dedupe,
>>      and any right-sized (zero space left over) filesystems.  In these cases
>>      the system will resort to utilizing overlayfs, and allow for
>>      updated content to a scratch backing storage.
>> 2) On operating system updates where new Hardware Abstraction
>>      layers have been added and the vendor/oem supplied components
>>      supplied to an older release.  In this corner case the operating
>>      system update may carefully select overrides that are merged into
>>      the vendor partition content directories as hosted by the
>>      operating system partition.
>>
>> The sepolicy model can be browsed at
>> https://android.googlesource.com/platform/system/sepolicy/.
>>
>> In the first use case above the possible insecurity is tempered by the
>> debug nature
>> of the system and the lurking big elephant in the room privilege
>> escalation possibility
>> (/system/bin/su existing),
> Since you already have an elephant in the room, might as well use it
> to mount overlay. I am guessing most of the work in developer mode is
> with sepolicy disabled anyway?
> Essentially, adb root/adb remount means gloves are off.
> Although with overlayed adb remount gloves could be put back on
> when you relinquish the overlay and get back to original /system mount.
adb starts far too late to be useful for the original mount operation. 
We have to mount at init first stage before sepolicy is loaded and init 
is re-exec'd. adb root / adb remount / adb push is relegated to updating 
the content before a reboot to activate _some_ of the possible things 
that developers adjust.

Yes, I find the ability to relinquish the overlayfs and restore the 
system to original an added-value, whereas if we could adjust the 
system&root filesystems directly, a complete reflash is required to 
restore back. We only activate/use the overlayfs for read-only 
system&root filesystems.
>
>> in the other a r/o and precise MAC. sepolicy
>> and credentials
>> will rule over transitions from one security domain to another for
>> execution, vendor components are managed by a separate vendor_init and
>> the actual xattr content is constant.
>>
> For this use case, you don't need an upper layer at all and you probably
> don't use lower layers redirect_dir. Right?
> So all the concerns about get/set trusted overlay xattrs and detecting
> opaque directories (do you need those?) are moot.
> If you propose that override_creds=off can only be enabled
> on lower-only overlayfs, the caveats section of your documentation
> would shrink down considerably to the point that it may even be
> comprehend-able to mortal users and I don't think you will see much
> resistant from overlayfs developers to that "safe side" approach.

Yup, no upper layers. If this was the only uses case, but it isn't.

Besides, I submitted this patch _before_ this second use case presented 
itself as required, it was a later development during the lifetime of 
this patch, so admittedly still focused on the first use case.

I should regroup in the documentation and present the use cases 
individually and specifically ... as you suggest :-) Off to consider a 
v9 respin (KISS documentation has always been troublesome for me) ...

[TL;DR]

The remain snapshot discussion is a distraction, I will have to defer to 
Paul to answer to them (and yes, we have discussed eye-to-eye over this 
and private emails). I agree with the details you have added to the 
conversation.

Sadly the CL in question needs to solve the caveats before we could 
consider overlayfs for userdata.

Thanks for these comments! We should consider when the dust settles.

Sincerely -- Mark Salyzyn



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred
  2018-11-06 23:01 ` [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  2018-11-08 15:56   ` Vivek Goyal
  2018-11-08 20:01   ` Vivek Goyal
@ 2018-11-14 22:00   ` Vivek Goyal
  2 siblings, 0 replies; 10+ messages in thread
From: Vivek Goyal @ 2018-11-14 22:00 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, Miklos Szeredi, Jonathan Corbet,
	Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, kernel-team

On Tue, Nov 06, 2018 at 03:01:15PM -0800, Mark Salyzyn wrote:
> By default, all access to the upper, lower and work directories is the
> recorded mounter's MAC and DAC credentials.  The incoming accesses are
> checked against the caller's credentials.

Some random things. Not sure what's the correct answer. It might not
even be a issue, just trying to think loud.

- ovl_permission() does not do the check for permission on underlying
  inode if only MAY_EXEC is being asked for. This kind of sounds like
  a problem. That means one can create an overlay mount with context=<foo>
  and allow a process to execute a file which it could not execute
  outside overlay mount. If this is an issue, it probably is an issue
  both with override_creds=on/off.

- ovl_permission() does not check for permission on underlying inode
  for special file. Is it a problem where one can not do an operation
  on special device on host but can do it through overlay context
  mount.

- What about creds for copy up. ovl_prep_cu_creds(). Looks like even
  with override_creds=off, we will be switching to the creds as returned
  by security_inode_copy_up(). This basically sets ->create_sid if
  it is a context mount so that new inode gets created with same 
  label as context=<label>. I was worried about being able to create
  files as context=label (while mounter itself probaly could not do
  that). Dan Walsh mentioned that it might not be an issue because
  even if ->create_sid has been set, selinux will still check whether
  caller is allowed to create a file with that label in target dir
  or not.

Thanks
Vivek
> 
> If the principles of least privilege are applied, the mounter's
> credentials might not overlap the credentials of the caller's when
> accessing the overlayfs filesystem.  For example, a file that a lower
> DAC privileged caller can execute, is MAC denied to the generally
> higher DAC privileged mounter, to prevent an attack vector.
> 
> We add the option to turn off override_creds in the mount options; all
> subsequent operations after mount on the filesystem will be only the
> caller's credentials.  The module boolean parameter and mount option
> override_creds is also added as a presence check for this "feature",
> existence of /sys/module/overlay/parameters/override_creds.
> 
> It was not always this way.  Circa 4.4 there was no recorded mounter's
> credentials, instead privileged access to upper or work directories
> were temporarily increased to perform the operations.  The MAC
> (selinux) policies were caller's in all cases.  override_creds=off
> partially returns us to this older access model minus the insecure
> temporary credential increases.  This is to permit use in a system
> with non-overlapping security models for each executable including
> the agent that mounts the overlayfs filesystem.  In Android
> this is the case since init, which performs the mount operations,
> has a minimal MAC set of privileges to reduce any attack surface,
> and services that use the content have a different set of MAC
> privileges (eg: read, for vendor labelled configuration, execute for
> vendor libraries and modules).
> 
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v8:
> - drop pr_warn message after straw poll to remove it.
> - added a use case in the commit message
> 
> v7:
> - change name of internal parameter to ovl_override_creds_def
> - report override_creds only if different than default
> 
> v6:
> - Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
> - Do better with the documentation.
> - pr_warn message adjusted to report consequences.
> 
> v5:
> - beefed up the caveats in the Documentation
> - Is dependent on
>   "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
>   "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
> - Added prwarn when override_creds=off
> 
> v4:
> - spelling and grammar errors in text
> 
> v3:
> - Change name from caller_credentials / creator_credentials to the
>   boolean override_creds.
> - Changed from creator to mounter credentials.
> - Updated and fortified the documentation.
> - Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
> 
> v2:
> - Forward port changed attr to stat, resulting in a build error.
> - altered commit message.
> 
>  Documentation/filesystems/overlayfs.txt | 17 +++++++++++++++++
>  fs/overlayfs/copy_up.c                  |  2 +-
>  fs/overlayfs/dir.c                      |  9 +++++----
>  fs/overlayfs/inode.c                    | 16 ++++++++--------
>  fs/overlayfs/namei.c                    |  6 +++---
>  fs/overlayfs/overlayfs.h                |  1 +
>  fs/overlayfs/ovl_entry.h                |  1 +
>  fs/overlayfs/readdir.c                  |  4 ++--
>  fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
>  fs/overlayfs/util.c                     | 12 ++++++++++--
>  10 files changed, 69 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index eef7d9d259e8..5cc299df4436 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -102,6 +102,23 @@ Only the lists of names from directories are merged.  Other content
>  such as metadata and extended attributes are reported for the upper
>  directory only.  These attributes of the lower directory are hidden.
>  
> +credentials
> +-----------
> +
> +By default, all access to the upper, lower and work directories is the
> +recorded mounter's MAC and DAC credentials.  The incoming accesses are
> +checked against the caller's credentials.
> +
> +override_creds mount flag turned off is reserved for when mounter and
> +caller MAC or DAC credentials do not overlap.  Several unintended side
> +effects will occur.  The caller with a lower privilege will not be
> +able to delete files or directories, create nodes, or search some
> +directories.  The caller with higher privilege can perform unexpected
> +or unsecured operations.  The uneven security model where upperdir
> +and workdir are opened at privilege, but accessed without, should only
> +be used with strict understanding of the side effects and of the
> +security policies.
> +
>  whiteouts and opaque directories
>  --------------------------------
>  
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 9e62dcf06fc4..dfab62ce7504 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -860,7 +860,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
>  		dput(parent);
>  		dput(next);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index c6289147c787..b7052e23c467 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -566,7 +566,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  		override_cred->fsgid = inode->i_gid;
>  		if (!attr->hardlink) {
>  			err = security_dentry_create_files_as(dentry,
> -					attr->mode, &dentry->d_name, old_cred,
> +					attr->mode, &dentry->d_name,
> +					old_cred ? old_cred : current_cred(),
>  					override_cred);
>  			if (err) {
>  				put_cred(override_cred);
> @@ -582,7 +583,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>  			err = ovl_create_over_whiteout(dentry, inode, attr);
>  	}
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return err;
>  }
>  
> @@ -842,7 +843,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
>  		err = ovl_remove_upper(dentry, is_dir, &list);
>  	else
>  		err = ovl_remove_and_whiteout(dentry, &list);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (!err) {
>  		if (is_dir)
>  			clear_nlink(dentry->d_inode);
> @@ -1212,7 +1213,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>  out_unlock:
>  	unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (update_nlink)
>  		ovl_nlink_end(new);
>  out_drop_write:
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 6bcc9dedc342..192f5508ed45 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -64,7 +64,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  		inode_lock(upperdentry->d_inode);
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		err = notify_change(upperdentry, attr, NULL);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  		if (!err)
>  			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
>  		inode_unlock(upperdentry->d_inode);
> @@ -260,7 +260,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>  		stat->nlink = dentry->d_inode->i_nlink;
>  
>  out:
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -303,7 +303,7 @@ int ovl_permission(struct inode *inode, int mask)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	err = inode_permission(realinode, mask);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -320,7 +320,7 @@ static const char *ovl_get_link(struct dentry *dentry,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	p = vfs_get_link(ovl_dentry_real(dentry), done);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return p;
>  }
>  
> @@ -363,7 +363,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>  		WARN_ON(flags != XATTR_REPLACE);
>  		err = vfs_removexattr(realdentry, name);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	/* copy c/mtime */
>  	ovl_copyattr(d_inode(realdentry), inode);
> @@ -384,7 +384,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_getxattr(realdentry, name, value, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return res;
>  }
>  
> @@ -408,7 +408,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	res = vfs_listxattr(realdentry, list, size);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (res <= 0 || size == 0)
>  		return res;
>  
> @@ -443,7 +443,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
>  	acl = get_acl(realinode, type);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return acl;
>  }
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 3ac9dc8f6cc0..c32fa8ed72e6 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1072,7 +1072,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  			goto out_free_oe;
>  	}
>  
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (origin_path) {
>  		dput(origin_path->dentry);
>  		kfree(origin_path);
> @@ -1099,7 +1099,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  	kfree(upperredirect);
>  out:
>  	kfree(d.redirect);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	return ERR_PTR(err);
>  }
>  
> @@ -1153,7 +1153,7 @@ bool ovl_lower_positive(struct dentry *dentry)
>  			dput(this);
>  		}
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return positive;
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 5e45cb3630a0..6f8b6f9ff357 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -208,6 +208,7 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +void ovl_revert_creds(const struct cred *oldcred);
>  struct super_block *ovl_same_sb(struct super_block *sb);
>  int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index ec237035333a..e38eea8104be 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -20,6 +20,7 @@ struct ovl_config {
>  	bool nfs_export;
>  	int xino;
>  	bool metacopy;
> +	bool override_creds;
>  };
>  
>  struct ovl_sb {
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index cc8303a806b4..ec591b49e902 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -289,7 +289,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
>  		}
>  		inode_unlock(dir->d_inode);
>  	}
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  	return err;
>  }
> @@ -921,7 +921,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>  
>  	old_cred = ovl_override_creds(dentry->d_sb);
>  	err = ovl_dir_read_merged(dentry, list, &root);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  	if (err)
>  		return err;
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 0116735cc321..933829ca7d7d 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -56,6 +56,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_xino_auto_def,
>  		 "Auto enable xino feature");
>  
> +static bool __read_mostly ovl_override_creds_def = true;
> +module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_override_creds_def,
> +		 "Use mounter's credentials for accesses");
> +
>  static void ovl_entry_stack_free(struct ovl_entry *oe)
>  {
>  	unsigned int i;
> @@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  	if (ofs->config.metacopy != ovl_metacopy_def)
>  		seq_printf(m, ",metacopy=%s",
>  			   ofs->config.metacopy ? "on" : "off");
> +	if (ofs->config.override_creds != ovl_override_creds_def)
> +		seq_show_option(m, "override_creds",
> +				ofs->config.override_creds ? "on" : "off");
>  	return 0;
>  }
>  
> @@ -401,6 +409,8 @@ enum {
>  	OPT_XINO_AUTO,
>  	OPT_METACOPY_ON,
>  	OPT_METACOPY_OFF,
> +	OPT_OVERRIDE_CREDS_ON,
> +	OPT_OVERRIDE_CREDS_OFF,
>  	OPT_ERR,
>  };
>  
> @@ -419,6 +429,8 @@ static const match_table_t ovl_tokens = {
>  	{OPT_XINO_AUTO,			"xino=auto"},
>  	{OPT_METACOPY_ON,		"metacopy=on"},
>  	{OPT_METACOPY_OFF,		"metacopy=off"},
> +	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
> +	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -477,6 +489,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
>  	if (!config->redirect_mode)
>  		return -ENOMEM;
> +	config->override_creds = ovl_override_creds_def;
>  
>  	while ((p = ovl_next_opt(&opt)) != NULL) {
>  		int token;
> @@ -557,6 +570,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  			config->metacopy = false;
>  			break;
>  
> +		case OPT_OVERRIDE_CREDS_ON:
> +			config->override_creds = true;
> +			break;
> +
> +		case OPT_OVERRIDE_CREDS_OFF:
> +			config->override_creds = false;
> +			break;
> +
>  		default:
>  			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>  			return -EINVAL;
> @@ -1549,7 +1570,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  		       ovl_dentry_lower(root_dentry), NULL);
>  
>  	sb->s_root = root_dentry;
> -
>  	return 0;
>  
>  out_free_oe:
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7c01327b1852..484d7f76ac9c 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -40,9 +40,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
>  
> +	if (!ofs->config.override_creds)
> +		return NULL;
>  	return override_creds(ofs->creator_cred);
>  }
>  
> +void ovl_revert_creds(const struct cred *old_cred)
> +{
> +	if (old_cred)
> +		revert_creds(old_cred);
> +}
> +
>  struct super_block *ovl_same_sb(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
> @@ -782,7 +790,7 @@ int ovl_nlink_start(struct dentry *dentry)
>  	 * value relative to the upper inode nlink in an upper inode xattr.
>  	 */
>  	err = ovl_set_nlink_upper(dentry);
> -	revert_creds(old_cred);
> +	ovl_revert_creds(old_cred);
>  
>  out:
>  	if (err)
> @@ -800,7 +808,7 @@ void ovl_nlink_end(struct dentry *dentry)
>  
>  		old_cred = ovl_override_creds(dentry->d_sb);
>  		ovl_cleanup_index(dentry);
> -		revert_creds(old_cred);
> +		ovl_revert_creds(old_cred);
>  	}
>  
>  	ovl_inode_unlock(inode);
> -- 
> 2.19.1.930.g4563a0d9d0-goog
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-11-14 22:38 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-06 23:01 [PATCH v8 1/2] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
2018-11-06 23:01 ` [PATCH v8 2/2] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
2018-11-08 15:56   ` Vivek Goyal
2018-11-08 20:01   ` Vivek Goyal
     [not found]     ` <b057da99-adc6-b355-fb57-b314a29f298f@android.com>
2018-11-09  3:05       ` Amir Goldstein
2018-11-09 17:32         ` Mark Salyzyn
2018-11-10  7:51           ` Amir Goldstein
2018-11-12 18:15             ` Mark Salyzyn
2018-11-09 15:20       ` Vivek Goyal
2018-11-14 22:00   ` Vivek Goyal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).