linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v10 0/2] overlayfs override_creds=off
@ 2019-07-24 19:57 Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 1/5] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc

Patch series:

overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh
Add optional __get xattr method paired to __vfs_getxattr
overlayfs: add __get xattr method
overlayfs: internal getxattr operations without sepolicy checking
overlayfs: override_creds=off option bypass creator_cred

The first four patches address fundamental security issues that should
be solved regardless of the override_creds=off feature.

The fifth that adds the feature depends on these other fixes.

By default, all access to the upper, lower and work directories is the
recorded mounter's MAC and DAC credentials.  The incoming accesses are
checked against the caller's credentials.

If the principles of least privilege are applied for sepolicy, the
mounter's credentials might not overlap the credentials of the caller's
when accessing the overlayfs filesystem.  For example, a file that a
lower DAC privileged caller can execute, is MAC denied to the
generally higher DAC privileged mounter, to prevent an attack vector.

We add the option to turn off override_creds in the mount options; all
subsequent operations after mount on the filesystem will be only the
caller's credentials.  The module boolean parameter and mount option
override_creds is also added as a presence check for this "feature",
existence of /sys/module/overlay/parameters/overlay_creds

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
v10:
- Rebase
- Return NULL on CAP_DAC_READ_SEARCH
- Add __get xattr method to solve sepolicy logging issue
- Drop unnecassary sys_admin sepolicy checking for administrative
  driver internal xattr functions.

v6:
- Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
- Do better with the documentation, drop rationalizations.
- pr_warn message adjusted to report consequences.

v5:
- beefed up the caveats in the Documentation
- Is dependent on
  "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
  "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
- Added prwarn when override_creds=off

v4:
- spelling and grammar errors in text

v3:
- Change name from caller_credentials / creator_credentials to the
  boolean override_creds.
- Changed from creator to mounter credentials.
- Updated and fortified the documentation.
- Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS

v2:
- Forward port changed attr to stat, resulting in a build error.
- altered commit message.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v10 1/5] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh
  2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
@ 2019-07-24 19:57 ` Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 2/5] Add optional __get xattr method paired to __vfs_getxattr Mark Salyzyn
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc

Assumption never checked, should fail if the mounter creds are not
sufficient.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v10:
- return NULL rather than ERR_PTR(-EPERM)
- did _not_ add it ovl_can_decode_fh() because of changes since last
  review, suspect needs to be added to ovl_lower_uuid_ok()?

v8 + v9:
- rebase

v7:
- This time for realz

v6:
- rebase

v5:
- dependency of "overlayfs: override_creds=off option bypass creator_cred"
---
 fs/overlayfs/namei.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index e9717c2f7d45..9702f0d5309d 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -161,6 +161,9 @@ struct dentry *ovl_decode_real_fh(struct ovl_fh *fh, struct vfsmount *mnt,
 	if (!uuid_equal(&fh->uuid, &mnt->mnt_sb->s_uuid))
 		return NULL;
 
+	if (!capable(CAP_DAC_READ_SEARCH))
+		return NULL;
+
 	bytes = (fh->len - offsetof(struct ovl_fh, fid));
 	real = exportfs_decode_fh(mnt, (struct fid *)fh->fid,
 				  bytes >> 2, (int)fh->type,
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v10 2/5] Add optional __get xattr method paired to __vfs_getxattr
  2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 1/5] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
@ 2019-07-24 19:57 ` Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 3/5] overlayfs: add __get xattr method Mark Salyzyn
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, Alexander Viro,
	Mark Salyzyn, linux-fsdevel

Add an optional __get xattr method that would be called, if set, only
in __vfs_getxattr instead of the regular get xattr method.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v10 - added to patch series
---
 fs/xattr.c            | 11 ++++++++++-
 include/linux/xattr.h |  7 +++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 90dd78f0eb27..b8f4734e222f 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -306,6 +306,9 @@ __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char *name,
 	handler = xattr_resolve_name(inode, &name);
 	if (IS_ERR(handler))
 		return PTR_ERR(handler);
+	if (unlikely(handler->__get))
+		return handler->__get(handler, dentry, inode, name, value,
+				      size);
 	if (!handler->get)
 		return -EOPNOTSUPP;
 	return handler->get(handler, dentry, inode, name, value, size);
@@ -317,6 +320,7 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 {
 	struct inode *inode = dentry->d_inode;
 	int error;
+	const struct xattr_handler *handler;
 
 	error = xattr_permission(inode, name, MAY_READ);
 	if (error)
@@ -339,7 +343,12 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
 		return ret;
 	}
 nolsm:
-	return __vfs_getxattr(dentry, inode, name, value, size);
+	handler = xattr_resolve_name(inode, &name);
+	if (IS_ERR(handler))
+		return PTR_ERR(handler);
+	if (!handler->get)
+		return -EOPNOTSUPP;
+	return handler->get(handler, dentry, inode, name, value, size);
 }
 EXPORT_SYMBOL_GPL(vfs_getxattr);
 
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index 6dad031be3c2..30f25e1ac571 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -30,10 +30,13 @@ struct xattr_handler {
 	const char *prefix;
 	int flags;      /* fs private flags */
 	bool (*list)(struct dentry *dentry);
-	int (*get)(const struct xattr_handler *, struct dentry *dentry,
+	int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
 		   struct inode *inode, const char *name, void *buffer,
 		   size_t size);
-	int (*set)(const struct xattr_handler *, struct dentry *dentry,
+	int (*__get)(const struct xattr_handler *handler, struct dentry *dentry,
+		     struct inode *inode, const char *name, void *buffer,
+		     size_t size);
+	int (*set)(const struct xattr_handler *handler, struct dentry *dentry,
 		   struct inode *inode, const char *name, const void *buffer,
 		   size_t size, int flags);
 };
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 1/5] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 2/5] Add optional __get xattr method paired to __vfs_getxattr Mark Salyzyn
@ 2019-07-24 19:57 ` Mark Salyzyn
  2019-07-25  5:48   ` Amir Goldstein
  2019-07-24 19:57 ` [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking Mark Salyzyn
  2019-07-24 19:57 ` [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  4 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc

Because of the overlayfs getxattr recursion, the incoming inode fails
to update the selinux sid resulting in avc denials being reported
against a target context of u:object_r:unlabeled:s0.

Solution is to add a _get xattr method that calls the __vfs_getxattr
handler so that the context can be read in, rather than being denied
with an -EACCES when vfs_getxattr handler is called.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v10 - added to patch series
---
 fs/overlayfs/inode.c     | 15 +++++++++++++++
 fs/overlayfs/overlayfs.h |  2 ++
 fs/overlayfs/super.c     | 18 ++++++++++++++++++
 3 files changed, 35 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 7663aeb85fa3..d3b53849615c 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -362,6 +362,21 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 	return err;
 }
 
+int __ovl_xattr_get(struct dentry *dentry, struct inode *inode,
+		    const char *name, void *value, size_t size)
+{
+	ssize_t res;
+	const struct cred *old_cred;
+	struct dentry *realdentry =
+		ovl_i_dentry_upper(inode) ?: ovl_dentry_lower(dentry);
+
+	old_cred = ovl_override_creds(dentry->d_sb);
+	res = __vfs_getxattr(realdentry, d_inode(realdentry), name, value,
+			     size);
+	ovl_revert_creds(old_cred);
+	return res;
+}
+
 int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 		  void *value, size_t size)
 {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 6934bcf030f0..73a02a263fbc 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -357,6 +357,8 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 		  const void *value, size_t size, int flags);
 int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 		  void *value, size_t size);
+int __ovl_xattr_get(struct dentry *dentry, struct inode *inode,
+		    const char *name, void *value, size_t size);
 ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
 struct posix_acl *ovl_get_acl(struct inode *inode, int type);
 int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index b368e2e102fa..82e1130de206 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -859,6 +859,14 @@ ovl_posix_acl_xattr_get(const struct xattr_handler *handler,
 	return ovl_xattr_get(dentry, inode, handler->name, buffer, size);
 }
 
+static int __maybe_unused
+__ovl_posix_acl_xattr_get(const struct xattr_handler *handler,
+			  struct dentry *dentry, struct inode *inode,
+			  const char *name, void *buffer, size_t size)
+{
+	return __ovl_xattr_get(dentry, inode, handler->name, buffer, size);
+}
+
 static int __maybe_unused
 ovl_posix_acl_xattr_set(const struct xattr_handler *handler,
 			struct dentry *dentry, struct inode *inode,
@@ -939,6 +947,13 @@ static int ovl_other_xattr_get(const struct xattr_handler *handler,
 	return ovl_xattr_get(dentry, inode, name, buffer, size);
 }
 
+static int __ovl_other_xattr_get(const struct xattr_handler *handler,
+				 struct dentry *dentry, struct inode *inode,
+				 const char *name, void *buffer, size_t size)
+{
+	return __ovl_xattr_get(dentry, inode, name, buffer, size);
+}
+
 static int ovl_other_xattr_set(const struct xattr_handler *handler,
 			       struct dentry *dentry, struct inode *inode,
 			       const char *name, const void *value,
@@ -952,6 +967,7 @@ ovl_posix_acl_access_xattr_handler = {
 	.name = XATTR_NAME_POSIX_ACL_ACCESS,
 	.flags = ACL_TYPE_ACCESS,
 	.get = ovl_posix_acl_xattr_get,
+	.__get = __ovl_posix_acl_xattr_get,
 	.set = ovl_posix_acl_xattr_set,
 };
 
@@ -960,6 +976,7 @@ ovl_posix_acl_default_xattr_handler = {
 	.name = XATTR_NAME_POSIX_ACL_DEFAULT,
 	.flags = ACL_TYPE_DEFAULT,
 	.get = ovl_posix_acl_xattr_get,
+	.__get = __ovl_posix_acl_xattr_get,
 	.set = ovl_posix_acl_xattr_set,
 };
 
@@ -972,6 +989,7 @@ static const struct xattr_handler ovl_own_xattr_handler = {
 static const struct xattr_handler ovl_other_xattr_handler = {
 	.prefix	= "", /* catch all */
 	.get = ovl_other_xattr_get,
+	.__get = __ovl_other_xattr_get,
 	.set = ovl_other_xattr_set,
 };
 
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking
  2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
                   ` (2 preceding siblings ...)
  2019-07-24 19:57 ` [PATCH v10 3/5] overlayfs: add __get xattr method Mark Salyzyn
@ 2019-07-24 19:57 ` Mark Salyzyn
  2019-07-25 11:00   ` Amir Goldstein
  2019-07-24 19:57 ` [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
  4 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc

Check impure, opaque, origin & meta xattr with no sepolicy audit
(using __vfs_getxattr) since these operations are internal to
overlayfs operations and do not disclose any data.  This became
an issue for credential override off since sys_admin would have
been required by the caller; whereas would have been inherently
present for the creator since it performed the mount.

This is a change in operations since we do not check in the new
ovl_vfs_getxattr function if the credential override is off or
not.  Reasoning is that the sepolicy check is unnecessary overhead,
especially since the check can be expensive.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v10 - added to patch series
---
 fs/overlayfs/namei.c     | 12 +++++++-----
 fs/overlayfs/overlayfs.h |  2 ++
 fs/overlayfs/util.c      | 24 +++++++++++++++---------
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 9702f0d5309d..fb6c0cd7b65f 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -106,10 +106,11 @@ int ovl_check_fh_len(struct ovl_fh *fh, int fh_len)
 
 static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 {
-	int res, err;
+	ssize_t res;
+	int err;
 	struct ovl_fh *fh = NULL;
 
-	res = vfs_getxattr(dentry, name, NULL, 0);
+	res = ovl_vfs_getxattr(dentry, name, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return NULL;
@@ -123,7 +124,7 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 	if (!fh)
 		return ERR_PTR(-ENOMEM);
 
-	res = vfs_getxattr(dentry, name, fh, res);
+	res = ovl_vfs_getxattr(dentry, name, fh, res);
 	if (res < 0)
 		goto fail;
 
@@ -141,10 +142,11 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 	return NULL;
 
 fail:
-	pr_warn_ratelimited("overlayfs: failed to get origin (%i)\n", res);
+	pr_warn_ratelimited("overlayfs: failed to get origin (%zi)\n", res);
 	goto out;
 invalid:
-	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", res, fh);
+	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n",
+			    (int)res, fh);
 	goto out;
 }
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 73a02a263fbc..82574684a9b6 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -205,6 +205,8 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
+ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
+			 size_t size);
 struct super_block *ovl_same_sb(struct super_block *sb);
 int ovl_can_decode_fh(struct super_block *sb);
 struct dentry *ovl_indexdir(struct super_block *sb);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f5678a3f8350..672459c3cff7 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -40,6 +40,12 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 	return override_creds(ofs->creator_cred);
 }
 
+ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
+			 size_t size)
+{
+	return __vfs_getxattr(dentry, d_inode(dentry), name, buf, size);
+}
+
 struct super_block *ovl_same_sb(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
@@ -537,9 +543,9 @@ void ovl_copy_up_end(struct dentry *dentry)
 
 bool ovl_check_origin_xattr(struct dentry *dentry)
 {
-	int res;
+	ssize_t res;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0);
+	res = ovl_vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0);
 
 	/* Zero size value means "copied up but origin unknown" */
 	if (res >= 0)
@@ -550,13 +556,13 @@ bool ovl_check_origin_xattr(struct dentry *dentry)
 
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name)
 {
-	int res;
+	ssize_t res;
 	char val;
 
 	if (!d_is_dir(dentry))
 		return false;
 
-	res = vfs_getxattr(dentry, name, &val, 1);
+	res = ovl_vfs_getxattr(dentry, name, &val, 1);
 	if (res == 1 && val == 'y')
 		return true;
 
@@ -837,13 +843,13 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
 /* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
 int ovl_check_metacopy_xattr(struct dentry *dentry)
 {
-	int res;
+	ssize_t res;
 
 	/* Only regular files can have metacopy xattr */
 	if (!S_ISREG(d_inode(dentry)->i_mode))
 		return 0;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	res = ovl_vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return 0;
@@ -852,7 +858,7 @@ int ovl_check_metacopy_xattr(struct dentry *dentry)
 
 	return 1;
 out:
-	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%zi)\n", res);
 	return res;
 }
 
@@ -878,7 +884,7 @@ ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value,
 	ssize_t res;
 	char *buf = NULL;
 
-	res = vfs_getxattr(dentry, name, NULL, 0);
+	res = ovl_vfs_getxattr(dentry, name, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return -ENODATA;
@@ -890,7 +896,7 @@ ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value,
 		if (!buf)
 			return -ENOMEM;
 
-		res = vfs_getxattr(dentry, name, buf, res);
+		res = ovl_vfs_getxattr(dentry, name, buf, res);
 		if (res < 0)
 			goto fail;
 	}
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred
  2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
                   ` (3 preceding siblings ...)
  2019-07-24 19:57 ` [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking Mark Salyzyn
@ 2019-07-24 19:57 ` Mark Salyzyn
  2019-07-25  6:14   ` Amir Goldstein
  4 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-24 19:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc

By default, all access to the upper, lower and work directories is the
recorded mounter's MAC and DAC credentials.  The incoming accesses are
checked against the caller's credentials.

If the principles of least privilege are applied, the mounter's
credentials might not overlap the credentials of the caller's when
accessing the overlayfs filesystem.  For example, a file that a lower
DAC privileged caller can execute, is MAC denied to the generally
higher DAC privileged mounter, to prevent an attack vector.

We add the option to turn off override_creds in the mount options; all
subsequent operations after mount on the filesystem will be only the
caller's credentials.  The module boolean parameter and mount option
override_creds is also added as a presence check for this "feature",
existence of /sys/module/overlay/parameters/override_creds.

It was not always this way.  Circa 4.6 there was no recorded mounter's
credentials, instead privileged access to upper or work directories
were temporarily increased to perform the operations.  The MAC
(selinux) policies were caller's in all cases.  override_creds=off
partially returns us to this older access model minus the insecure
temporary credential increases.  This is to permit use in a system
with non-overlapping security models for each executable including
the agent that mounts the overlayfs filesystem.  In Android
this is the case since init, which performs the mount operations,
has a minimal MAC set of privileges to reduce any attack surface,
and services that use the content have a different set of MAC
privileges (eg: read, for vendor labelled configuration, execute for
vendor libraries and modules).  The caveats are not a problem in
the Android usage model, however they should be fixed for
completeness and for general use in time.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
---
v10:
- Rebase (and expand because of increased revert_cred usage)

v9:
- Add to the caveats

v8:
- drop pr_warn message after straw poll to remove it.
- added a use case in the commit message

v7:
- change name of internal parameter to ovl_override_creds_def
- report override_creds only if different than default

v6:
- Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
- Do better with the documentation.
- pr_warn message adjusted to report consequences.

v5:
- beefed up the caveats in the Documentation
- Is dependent on
  "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
  "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
- Added prwarn when override_creds=off

v4:
- spelling and grammar errors in text

v3:
- Change name from caller_credentials / creator_credentials to the
  boolean override_creds.
- Changed from creator to mounter credentials.
- Updated and fortified the documentation.
- Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS

v2:
- Forward port changed attr to stat, resulting in a build error.
- altered commit message.

a
---
 Documentation/filesystems/overlayfs.txt | 23 +++++++++++++++++++++++
 fs/overlayfs/copy_up.c                  |  2 +-
 fs/overlayfs/dir.c                      | 11 ++++++-----
 fs/overlayfs/file.c                     | 20 ++++++++++----------
 fs/overlayfs/inode.c                    | 18 +++++++++---------
 fs/overlayfs/namei.c                    |  6 +++---
 fs/overlayfs/overlayfs.h                |  1 +
 fs/overlayfs/ovl_entry.h                |  1 +
 fs/overlayfs/readdir.c                  |  4 ++--
 fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
 fs/overlayfs/util.c                     | 12 ++++++++++--
 11 files changed, 87 insertions(+), 33 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 1da2f1668f08..d48125076602 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -102,6 +102,29 @@ Only the lists of names from directories are merged.  Other content
 such as metadata and extended attributes are reported for the upper
 directory only.  These attributes of the lower directory are hidden.
 
+credentials
+-----------
+
+By default, all access to the upper, lower and work directories is the
+recorded mounter's MAC and DAC credentials.  The incoming accesses are
+checked against the caller's credentials.
+
+In the case where caller MAC or DAC credentials do not overlap, a
+use case available in older versions of the driver, the
+override_creds mount flag can be turned off and help when the use
+pattern has caller with legitimate credentials where the mounter
+does not.  Several unintended side effects will occur though.  The
+caller without certain key capabilities or lower privilege will not
+always be able to delete files or directories, create nodes, or
+search some restricted directories.  The ability to search and read
+a directory entry is spotty as a result of the cache mechanism not
+retesting the credentials because of the assumption, a privileged
+caller can fill cache, then a lower privilege can read the directory
+cache.  The uneven security model where cache, upperdir and workdir
+are opened at privilege, but accessed without creating a form of
+privilege escalation, should only be used with strict understanding
+of the side effects and of the security policies.
+
 whiteouts and opaque directories
 --------------------------------
 
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index b801c6353100..1311ab4aea00 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -886,7 +886,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
 		dput(parent);
 		dput(next);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 702aa63f6774..c4b061c3a6ef 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -563,7 +563,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 		override_cred->fsgid = inode->i_gid;
 		if (!attr->hardlink) {
 			err = security_dentry_create_files_as(dentry,
-					attr->mode, &dentry->d_name, old_cred,
+					attr->mode, &dentry->d_name,
+					old_cred ? old_cred : current_cred(),
 					override_cred);
 			if (err) {
 				put_cred(override_cred);
@@ -579,7 +580,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
 			err = ovl_create_over_whiteout(dentry, inode, attr);
 	}
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return err;
 }
 
@@ -655,7 +656,7 @@ static int ovl_set_link_redirect(struct dentry *dentry)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	err = ovl_set_redirect(dentry, false);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -851,7 +852,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 		err = ovl_remove_upper(dentry, is_dir, &list);
 	else
 		err = ovl_remove_and_whiteout(dentry, &list);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (!err) {
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
@@ -1221,7 +1222,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 out_unlock:
 	unlock_rename(new_upperdir, old_upperdir);
 out_revert_creds:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (update_nlink)
 		ovl_nlink_end(new);
 out_drop_write:
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index e235a635d9ec..39a50fad9f7f 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -32,7 +32,7 @@ static struct file *ovl_open_realfile(const struct file *file,
 	old_cred = ovl_override_creds(inode->i_sb);
 	realfile = open_with_fake_path(&file->f_path, flags, realinode,
 				       current_cred());
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
 		 file, file, ovl_whatisit(inode, realinode), file->f_flags,
@@ -176,7 +176,7 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
 
 	old_cred = ovl_override_creds(inode->i_sb);
 	ret = vfs_llseek(real.file, offset, whence);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	file->f_pos = real.file->f_pos;
 	inode_unlock(inode);
@@ -242,7 +242,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = vfs_iter_read(real.file, iter, &iocb->ki_pos,
 			    ovl_iocb_to_rwf(iocb));
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	ovl_file_accessed(file);
 
@@ -278,7 +278,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 	ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
 			     ovl_iocb_to_rwf(iocb));
 	file_end_write(real.file);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	/* Update size */
 	ovl_copyattr(ovl_inode_real(inode), inode);
@@ -305,7 +305,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	if (file_inode(real.file) == ovl_inode_upper(file_inode(file))) {
 		old_cred = ovl_override_creds(file_inode(file)->i_sb);
 		ret = vfs_fsync_range(real.file, start, end, datasync);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 	}
 
 	fdput(real);
@@ -329,7 +329,7 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = call_mmap(vma->vm_file, vma);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	if (ret) {
 		/* Drop reference count from new vm_file value */
@@ -357,7 +357,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
 
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = vfs_fallocate(real.file, mode, offset, len);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	/* Update size */
 	ovl_copyattr(ovl_inode_real(inode), inode);
@@ -379,7 +379,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = vfs_fadvise(real.file, offset, len, advice);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	fdput(real);
 
@@ -399,7 +399,7 @@ static long ovl_real_ioctl(struct file *file, unsigned int cmd,
 
 	old_cred = ovl_override_creds(file_inode(file)->i_sb);
 	ret = vfs_ioctl(real.file, cmd, arg);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	fdput(real);
 
@@ -589,7 +589,7 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
 						flags);
 		break;
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	/* Update size */
 	ovl_copyattr(ovl_inode_real(inode_out), inode_out);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d3b53849615c..6c11c7af5157 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -61,7 +61,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		inode_lock(upperdentry->d_inode);
 		old_cred = ovl_override_creds(dentry->d_sb);
 		err = notify_change(upperdentry, attr, NULL);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 		if (!err)
 			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
 		inode_unlock(upperdentry->d_inode);
@@ -257,7 +257,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 		stat->nlink = dentry->d_inode->i_nlink;
 
 out:
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -291,7 +291,7 @@ int ovl_permission(struct inode *inode, int mask)
 		mask |= MAY_READ;
 	}
 	err = inode_permission(realinode, mask);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -308,7 +308,7 @@ static const char *ovl_get_link(struct dentry *dentry,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	p = vfs_get_link(ovl_dentry_real(dentry), done);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return p;
 }
 
@@ -351,7 +351,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 		WARN_ON(flags != XATTR_REPLACE);
 		err = vfs_removexattr(realdentry, name);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	/* copy c/mtime */
 	ovl_copyattr(d_inode(realdentry), inode);
@@ -387,7 +387,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_getxattr(realdentry, name, value, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return res;
 }
 
@@ -411,7 +411,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	res = vfs_listxattr(realdentry, list, size);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (res <= 0 || size == 0)
 		return res;
 
@@ -446,7 +446,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
 
 	old_cred = ovl_override_creds(inode->i_sb);
 	acl = get_acl(realinode, type);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return acl;
 }
@@ -484,7 +484,7 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		filemap_write_and_wait(realinode->i_mapping);
 
 	err = realinode->i_op->fiemap(realinode, fieinfo, start, len);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index fb6c0cd7b65f..12627018b00a 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -1079,7 +1079,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out_free_oe;
 	}
 
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (origin_path) {
 		dput(origin_path->dentry);
 		kfree(origin_path);
@@ -1106,7 +1106,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	kfree(upperredirect);
 out:
 	kfree(d.redirect);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	return ERR_PTR(err);
 }
 
@@ -1160,7 +1160,7 @@ bool ovl_lower_positive(struct dentry *dentry)
 			dput(this);
 		}
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return positive;
 }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 82574684a9b6..cdbdb533d3bd 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -205,6 +205,7 @@ int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
+void ovl_revert_creds(const struct cred *oldcred);
 ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
 			 size_t size);
 struct super_block *ovl_same_sb(struct super_block *sb);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 28a2d12a1029..2637c5aadf7f 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -17,6 +17,7 @@ struct ovl_config {
 	bool nfs_export;
 	int xino;
 	bool metacopy;
+	bool override_creds;
 };
 
 struct ovl_sb {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 47a91c9733a5..f31ef39e5afa 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -286,7 +286,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
 		}
 		inode_unlock(dir->d_inode);
 	}
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 	return err;
 }
@@ -918,7 +918,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	err = ovl_dir_read_merged(dentry, list, &root);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 	if (err)
 		return err;
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 82e1130de206..c2ddce5d488c 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -53,6 +53,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
 MODULE_PARM_DESC(xino_auto,
 		 "Auto enable xino feature");
 
+static bool __read_mostly ovl_override_creds_def = true;
+module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
+MODULE_PARM_DESC(ovl_override_creds_def,
+		 "Use mounter's credentials for accesses");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ofs->config.metacopy != ovl_metacopy_def)
 		seq_printf(m, ",metacopy=%s",
 			   ofs->config.metacopy ? "on" : "off");
+	if (ofs->config.override_creds != ovl_override_creds_def)
+		seq_show_option(m, "override_creds",
+				ofs->config.override_creds ? "on" : "off");
 	return 0;
 }
 
@@ -402,6 +410,8 @@ enum {
 	OPT_XINO_AUTO,
 	OPT_METACOPY_ON,
 	OPT_METACOPY_OFF,
+	OPT_OVERRIDE_CREDS_ON,
+	OPT_OVERRIDE_CREDS_OFF,
 	OPT_ERR,
 };
 
@@ -420,6 +430,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_AUTO,			"xino=auto"},
 	{OPT_METACOPY_ON,		"metacopy=on"},
 	{OPT_METACOPY_OFF,		"metacopy=off"},
+	{OPT_OVERRIDE_CREDS_ON,		"override_creds=on"},
+	{OPT_OVERRIDE_CREDS_OFF,	"override_creds=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -478,6 +490,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 	config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
 	if (!config->redirect_mode)
 		return -ENOMEM;
+	config->override_creds = ovl_override_creds_def;
 
 	while ((p = ovl_next_opt(&opt)) != NULL) {
 		int token;
@@ -558,6 +571,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->metacopy = false;
 			break;
 
+		case OPT_OVERRIDE_CREDS_ON:
+			config->override_creds = true;
+			break;
+
+		case OPT_OVERRIDE_CREDS_OFF:
+			config->override_creds = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -1690,7 +1711,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		       ovl_dentry_lower(root_dentry), NULL);
 
 	sb->s_root = root_dentry;
-
 	return 0;
 
 out_free_oe:
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 672459c3cff7..320aad599bcd 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -37,9 +37,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
 
+	if (!ofs->config.override_creds)
+		return NULL;
 	return override_creds(ofs->creator_cred);
 }
 
+void ovl_revert_creds(const struct cred *old_cred)
+{
+	if (old_cred)
+		revert_creds(old_cred);
+}
+
 ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
 			 size_t size)
 {
@@ -797,7 +805,7 @@ int ovl_nlink_start(struct dentry *dentry)
 	 * value relative to the upper inode nlink in an upper inode xattr.
 	 */
 	err = ovl_set_nlink_upper(dentry);
-	revert_creds(old_cred);
+	ovl_revert_creds(old_cred);
 
 out:
 	if (err)
@@ -815,7 +823,7 @@ void ovl_nlink_end(struct dentry *dentry)
 
 		old_cred = ovl_override_creds(dentry->d_sb);
 		ovl_cleanup_index(dentry);
-		revert_creds(old_cred);
+		ovl_revert_creds(old_cred);
 	}
 
 	ovl_inode_unlock(inode);
-- 
2.22.0.657.g960e92d24f-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-24 19:57 ` [PATCH v10 3/5] overlayfs: add __get xattr method Mark Salyzyn
@ 2019-07-25  5:48   ` Amir Goldstein
  2019-07-25 15:03     ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2019-07-25  5:48 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> Because of the overlayfs getxattr recursion, the incoming inode fails
> to update the selinux sid resulting in avc denials being reported
> against a target context of u:object_r:unlabeled:s0.

This description is too brief for me to understand the root problem.
What's wring with the overlayfs getxattr recursion w.r.t the selinux
security model?

Please give an example of your unprivileged mounter use case
to explain.

CC Vivek because I could really never understand all this.

>
> Solution is to add a _get xattr method that calls the __vfs_getxattr
> handler so that the context can be read in, rather than being denied
> with an -EACCES when vfs_getxattr handler is called.
>
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v10 - added to patch series
> ---
>  fs/overlayfs/inode.c     | 15 +++++++++++++++
>  fs/overlayfs/overlayfs.h |  2 ++
>  fs/overlayfs/super.c     | 18 ++++++++++++++++++
>  3 files changed, 35 insertions(+)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 7663aeb85fa3..d3b53849615c 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -362,6 +362,21 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>         return err;
>  }
>
> +int __ovl_xattr_get(struct dentry *dentry, struct inode *inode,
> +                   const char *name, void *value, size_t size)
> +{
> +       ssize_t res;
> +       const struct cred *old_cred;
> +       struct dentry *realdentry =
> +               ovl_i_dentry_upper(inode) ?: ovl_dentry_lower(dentry);
> +
> +       old_cred = ovl_override_creds(dentry->d_sb);
> +       res = __vfs_getxattr(realdentry, d_inode(realdentry), name, value,
> +                            size);
> +       ovl_revert_creds(old_cred);
> +       return res;
> +}
> +
>  int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>                   void *value, size_t size)
>  {
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 6934bcf030f0..73a02a263fbc 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -357,6 +357,8 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>                   const void *value, size_t size, int flags);
>  int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>                   void *value, size_t size);
> +int __ovl_xattr_get(struct dentry *dentry, struct inode *inode,
> +                   const char *name, void *value, size_t size);
>  ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
>  struct posix_acl *ovl_get_acl(struct inode *inode, int type);
>  int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags);
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index b368e2e102fa..82e1130de206 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -859,6 +859,14 @@ ovl_posix_acl_xattr_get(const struct xattr_handler *handler,
>         return ovl_xattr_get(dentry, inode, handler->name, buffer, size);
>  }
>
> +static int __maybe_unused
> +__ovl_posix_acl_xattr_get(const struct xattr_handler *handler,
> +                         struct dentry *dentry, struct inode *inode,
> +                         const char *name, void *buffer, size_t size)
> +{
> +       return __ovl_xattr_get(dentry, inode, handler->name, buffer, size);
> +}
> +
>  static int __maybe_unused
>  ovl_posix_acl_xattr_set(const struct xattr_handler *handler,
>                         struct dentry *dentry, struct inode *inode,
> @@ -939,6 +947,13 @@ static int ovl_other_xattr_get(const struct xattr_handler *handler,
>         return ovl_xattr_get(dentry, inode, name, buffer, size);
>  }
>
> +static int __ovl_other_xattr_get(const struct xattr_handler *handler,
> +                                struct dentry *dentry, struct inode *inode,
> +                                const char *name, void *buffer, size_t size)
> +{
> +       return __ovl_xattr_get(dentry, inode, name, buffer, size);
> +}
> +
>  static int ovl_other_xattr_set(const struct xattr_handler *handler,
>                                struct dentry *dentry, struct inode *inode,
>                                const char *name, const void *value,
> @@ -952,6 +967,7 @@ ovl_posix_acl_access_xattr_handler = {
>         .name = XATTR_NAME_POSIX_ACL_ACCESS,
>         .flags = ACL_TYPE_ACCESS,
>         .get = ovl_posix_acl_xattr_get,
> +       .__get = __ovl_posix_acl_xattr_get,
>         .set = ovl_posix_acl_xattr_set,
>  };
>
> @@ -960,6 +976,7 @@ ovl_posix_acl_default_xattr_handler = {
>         .name = XATTR_NAME_POSIX_ACL_DEFAULT,
>         .flags = ACL_TYPE_DEFAULT,
>         .get = ovl_posix_acl_xattr_get,
> +       .__get = __ovl_posix_acl_xattr_get,
>         .set = ovl_posix_acl_xattr_set,
>  };
>
> @@ -972,6 +989,7 @@ static const struct xattr_handler ovl_own_xattr_handler = {
>  static const struct xattr_handler ovl_other_xattr_handler = {
>         .prefix = "", /* catch all */
>         .get = ovl_other_xattr_get,
> +       .__get = __ovl_other_xattr_get,
>         .set = ovl_other_xattr_set,
>  };
>


Not very professional of me to comment on the proposed solution
without understanding the problem, but my nose says this cannot
be the right solution and if it is, then you better find a much better
name for the API then __get() and document it properly.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred
  2019-07-24 19:57 ` [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
@ 2019-07-25  6:14   ` Amir Goldstein
  2019-07-25 14:38     ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2019-07-25  6:14 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> By default, all access to the upper, lower and work directories is the
> recorded mounter's MAC and DAC credentials.  The incoming accesses are
> checked against the caller's credentials.
>
> If the principles of least privilege are applied, the mounter's
> credentials might not overlap the credentials of the caller's when
> accessing the overlayfs filesystem.  For example, a file that a lower
> DAC privileged caller can execute, is MAC denied to the generally
> higher DAC privileged mounter, to prevent an attack vector.
>
> We add the option to turn off override_creds in the mount options; all
> subsequent operations after mount on the filesystem will be only the
> caller's credentials.  The module boolean parameter and mount option
> override_creds is also added as a presence check for this "feature",
> existence of /sys/module/overlay/parameters/override_creds.
>
> It was not always this way.  Circa 4.6 there was no recorded mounter's
> credentials, instead privileged access to upper or work directories
> were temporarily increased to perform the operations.  The MAC
> (selinux) policies were caller's in all cases.  override_creds=off
> partially returns us to this older access model minus the insecure
> temporary credential increases.  This is to permit use in a system
> with non-overlapping security models for each executable including
> the agent that mounts the overlayfs filesystem.  In Android
> this is the case since init, which performs the mount operations,
> has a minimal MAC set of privileges to reduce any attack surface,
> and services that use the content have a different set of MAC
> privileges (eg: read, for vendor labelled configuration, execute for
> vendor libraries and modules).  The caveats are not a problem in
> the Android usage model, however they should be fixed for
> completeness and for general use in time.
>
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v10:
> - Rebase (and expand because of increased revert_cred usage)
>
> v9:
> - Add to the caveats
>
> v8:
> - drop pr_warn message after straw poll to remove it.
> - added a use case in the commit message
>
> v7:
> - change name of internal parameter to ovl_override_creds_def
> - report override_creds only if different than default
>
> v6:
> - Drop CONFIG_OVERLAY_FS_OVERRIDE_CREDS.
> - Do better with the documentation.
> - pr_warn message adjusted to report consequences.
>
> v5:
> - beefed up the caveats in the Documentation
> - Is dependent on
>   "overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh"
>   "overlayfs: check CAP_MKNOD before issuing vfs_whiteout"
> - Added prwarn when override_creds=off
>
> v4:
> - spelling and grammar errors in text
>
> v3:
> - Change name from caller_credentials / creator_credentials to the
>   boolean override_creds.
> - Changed from creator to mounter credentials.
> - Updated and fortified the documentation.
> - Added CONFIG_OVERLAY_FS_OVERRIDE_CREDS
>
> v2:
> - Forward port changed attr to stat, resulting in a build error.
> - altered commit message.
>
> a
> ---
>  Documentation/filesystems/overlayfs.txt | 23 +++++++++++++++++++++++
>  fs/overlayfs/copy_up.c                  |  2 +-
>  fs/overlayfs/dir.c                      | 11 ++++++-----
>  fs/overlayfs/file.c                     | 20 ++++++++++----------
>  fs/overlayfs/inode.c                    | 18 +++++++++---------
>  fs/overlayfs/namei.c                    |  6 +++---
>  fs/overlayfs/overlayfs.h                |  1 +
>  fs/overlayfs/ovl_entry.h                |  1 +
>  fs/overlayfs/readdir.c                  |  4 ++--
>  fs/overlayfs/super.c                    | 22 +++++++++++++++++++++-
>  fs/overlayfs/util.c                     | 12 ++++++++++--
>  11 files changed, 87 insertions(+), 33 deletions(-)
>
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index 1da2f1668f08..d48125076602 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -102,6 +102,29 @@ Only the lists of names from directories are merged.  Other content
>  such as metadata and extended attributes are reported for the upper
>  directory only.  These attributes of the lower directory are hidden.
>
> +credentials
> +-----------
> +
> +By default, all access to the upper, lower and work directories is the
> +recorded mounter's MAC and DAC credentials.  The incoming accesses are
> +checked against the caller's credentials.
> +
> +In the case where caller MAC or DAC credentials do not overlap, a
> +use case available in older versions of the driver, the
> +override_creds mount flag can be turned off and help when the use
> +pattern has caller with legitimate credentials where the mounter
> +does not.  Several unintended side effects will occur though.  The
> +caller without certain key capabilities or lower privilege will not
> +always be able to delete files or directories, create nodes, or
> +search some restricted directories.  The ability to search and read
> +a directory entry is spotty as a result of the cache mechanism not
> +retesting the credentials because of the assumption, a privileged
> +caller can fill cache, then a lower privilege can read the directory
> +cache.  The uneven security model where cache, upperdir and workdir
> +are opened at privilege, but accessed without creating a form of
> +privilege escalation, should only be used with strict understanding
> +of the side effects and of the security policies.
> +
>  whiteouts and opaque directories
>  --------------------------------
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index b801c6353100..1311ab4aea00 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -886,7 +886,7 @@ int ovl_copy_up_flags(struct dentry *dentry, int flags)
>                 dput(parent);
>                 dput(next);
>         }
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 702aa63f6774..c4b061c3a6ef 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -563,7 +563,8 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>                 override_cred->fsgid = inode->i_gid;
>                 if (!attr->hardlink) {
>                         err = security_dentry_create_files_as(dentry,
> -                                       attr->mode, &dentry->d_name, old_cred,
> +                                       attr->mode, &dentry->d_name,
> +                                       old_cred ? old_cred : current_cred(),
>                                         override_cred);
>                         if (err) {
>                                 put_cred(override_cred);
> @@ -579,7 +580,7 @@ static int ovl_create_or_link(struct dentry *dentry, struct inode *inode,
>                         err = ovl_create_over_whiteout(dentry, inode, attr);
>         }
>  out_revert_creds:
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         return err;
>  }
>
> @@ -655,7 +656,7 @@ static int ovl_set_link_redirect(struct dentry *dentry)
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         err = ovl_set_redirect(dentry, false);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> @@ -851,7 +852,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
>                 err = ovl_remove_upper(dentry, is_dir, &list);
>         else
>                 err = ovl_remove_and_whiteout(dentry, &list);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         if (!err) {
>                 if (is_dir)
>                         clear_nlink(dentry->d_inode);
> @@ -1221,7 +1222,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
>  out_unlock:
>         unlock_rename(new_upperdir, old_upperdir);
>  out_revert_creds:
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         if (update_nlink)
>                 ovl_nlink_end(new);
>  out_drop_write:
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index e235a635d9ec..39a50fad9f7f 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -32,7 +32,7 @@ static struct file *ovl_open_realfile(const struct file *file,
>         old_cred = ovl_override_creds(inode->i_sb);
>         realfile = open_with_fake_path(&file->f_path, flags, realinode,
>                                        current_cred());
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
>                  file, file, ovl_whatisit(inode, realinode), file->f_flags,
> @@ -176,7 +176,7 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
>
>         old_cred = ovl_override_creds(inode->i_sb);
>         ret = vfs_llseek(real.file, offset, whence);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         file->f_pos = real.file->f_pos;
>         inode_unlock(inode);
> @@ -242,7 +242,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         ret = vfs_iter_read(real.file, iter, &iocb->ki_pos,
>                             ovl_iocb_to_rwf(iocb));
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         ovl_file_accessed(file);
>
> @@ -278,7 +278,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>         ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
>                              ovl_iocb_to_rwf(iocb));
>         file_end_write(real.file);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         /* Update size */
>         ovl_copyattr(ovl_inode_real(inode), inode);
> @@ -305,7 +305,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
>         if (file_inode(real.file) == ovl_inode_upper(file_inode(file))) {
>                 old_cred = ovl_override_creds(file_inode(file)->i_sb);
>                 ret = vfs_fsync_range(real.file, start, end, datasync);
> -               revert_creds(old_cred);
> +               ovl_revert_creds(old_cred);
>         }
>
>         fdput(real);
> @@ -329,7 +329,7 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
>
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         ret = call_mmap(vma->vm_file, vma);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         if (ret) {
>                 /* Drop reference count from new vm_file value */
> @@ -357,7 +357,7 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
>
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         ret = vfs_fallocate(real.file, mode, offset, len);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         /* Update size */
>         ovl_copyattr(ovl_inode_real(inode), inode);
> @@ -379,7 +379,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
>
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         ret = vfs_fadvise(real.file, offset, len, advice);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         fdput(real);
>
> @@ -399,7 +399,7 @@ static long ovl_real_ioctl(struct file *file, unsigned int cmd,
>
>         old_cred = ovl_override_creds(file_inode(file)->i_sb);
>         ret = vfs_ioctl(real.file, cmd, arg);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         fdput(real);
>
> @@ -589,7 +589,7 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
>                                                 flags);
>                 break;
>         }
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         /* Update size */
>         ovl_copyattr(ovl_inode_real(inode_out), inode_out);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index d3b53849615c..6c11c7af5157 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -61,7 +61,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>                 inode_lock(upperdentry->d_inode);
>                 old_cred = ovl_override_creds(dentry->d_sb);
>                 err = notify_change(upperdentry, attr, NULL);
> -               revert_creds(old_cred);
> +               ovl_revert_creds(old_cred);
>                 if (!err)
>                         ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
>                 inode_unlock(upperdentry->d_inode);
> @@ -257,7 +257,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
>                 stat->nlink = dentry->d_inode->i_nlink;
>
>  out:
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> @@ -291,7 +291,7 @@ int ovl_permission(struct inode *inode, int mask)
>                 mask |= MAY_READ;
>         }
>         err = inode_permission(realinode, mask);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> @@ -308,7 +308,7 @@ static const char *ovl_get_link(struct dentry *dentry,
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         p = vfs_get_link(ovl_dentry_real(dentry), done);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         return p;
>  }
>
> @@ -351,7 +351,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
>                 WARN_ON(flags != XATTR_REPLACE);
>                 err = vfs_removexattr(realdentry, name);
>         }
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         /* copy c/mtime */
>         ovl_copyattr(d_inode(realdentry), inode);
> @@ -387,7 +387,7 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         res = vfs_getxattr(realdentry, name, value, size);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         return res;
>  }
>
> @@ -411,7 +411,7 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         res = vfs_listxattr(realdentry, list, size);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         if (res <= 0 || size == 0)
>                 return res;
>
> @@ -446,7 +446,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>
>         old_cred = ovl_override_creds(inode->i_sb);
>         acl = get_acl(realinode, type);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return acl;
>  }
> @@ -484,7 +484,7 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>                 filemap_write_and_wait(realinode->i_mapping);
>
>         err = realinode->i_op->fiemap(realinode, fieinfo, start, len);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index fb6c0cd7b65f..12627018b00a 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -1079,7 +1079,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>                         goto out_free_oe;
>         }
>
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         if (origin_path) {
>                 dput(origin_path->dentry);
>                 kfree(origin_path);
> @@ -1106,7 +1106,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>         kfree(upperredirect);
>  out:
>         kfree(d.redirect);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         return ERR_PTR(err);
>  }
>
> @@ -1160,7 +1160,7 @@ bool ovl_lower_positive(struct dentry *dentry)
>                         dput(this);
>                 }
>         }
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return positive;
>  }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 82574684a9b6..cdbdb533d3bd 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -205,6 +205,7 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +void ovl_revert_creds(const struct cred *oldcred);
>  ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
>                          size_t size);
>  struct super_block *ovl_same_sb(struct super_block *sb);
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 28a2d12a1029..2637c5aadf7f 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -17,6 +17,7 @@ struct ovl_config {
>         bool nfs_export;
>         int xino;
>         bool metacopy;
> +       bool override_creds;
>  };
>
>  struct ovl_sb {
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 47a91c9733a5..f31ef39e5afa 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -286,7 +286,7 @@ static int ovl_check_whiteouts(struct dentry *dir, struct ovl_readdir_data *rdd)
>                 }
>                 inode_unlock(dir->d_inode);
>         }
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>
>         return err;
>  }
> @@ -918,7 +918,7 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>
>         old_cred = ovl_override_creds(dentry->d_sb);
>         err = ovl_dir_read_merged(dentry, list, &root);
> -       revert_creds(old_cred);
> +       ovl_revert_creds(old_cred);
>         if (err)
>                 return err;
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 82e1130de206..c2ddce5d488c 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -53,6 +53,11 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
>  MODULE_PARM_DESC(xino_auto,
>                  "Auto enable xino feature");
>
> +static bool __read_mostly ovl_override_creds_def = true;
> +module_param_named(override_creds, ovl_override_creds_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_override_creds_def,
> +                "Use mounter's credentials for accesses");
> +
>  static void ovl_entry_stack_free(struct ovl_entry *oe)
>  {
>         unsigned int i;
> @@ -362,6 +367,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>         if (ofs->config.metacopy != ovl_metacopy_def)
>                 seq_printf(m, ",metacopy=%s",
>                            ofs->config.metacopy ? "on" : "off");
> +       if (ofs->config.override_creds != ovl_override_creds_def)
> +               seq_show_option(m, "override_creds",
> +                               ofs->config.override_creds ? "on" : "off");
>         return 0;
>  }
>
> @@ -402,6 +410,8 @@ enum {
>         OPT_XINO_AUTO,
>         OPT_METACOPY_ON,
>         OPT_METACOPY_OFF,
> +       OPT_OVERRIDE_CREDS_ON,
> +       OPT_OVERRIDE_CREDS_OFF,
>         OPT_ERR,
>  };
>
> @@ -420,6 +430,8 @@ static const match_table_t ovl_tokens = {
>         {OPT_XINO_AUTO,                 "xino=auto"},
>         {OPT_METACOPY_ON,               "metacopy=on"},
>         {OPT_METACOPY_OFF,              "metacopy=off"},
> +       {OPT_OVERRIDE_CREDS_ON,         "override_creds=on"},
> +       {OPT_OVERRIDE_CREDS_OFF,        "override_creds=off"},
>         {OPT_ERR,                       NULL}
>  };
>
> @@ -478,6 +490,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>         config->redirect_mode = kstrdup(ovl_redirect_mode_def(), GFP_KERNEL);
>         if (!config->redirect_mode)
>                 return -ENOMEM;
> +       config->override_creds = ovl_override_creds_def;
>
>         while ((p = ovl_next_opt(&opt)) != NULL) {
>                 int token;
> @@ -558,6 +571,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>                         config->metacopy = false;
>                         break;
>
> +               case OPT_OVERRIDE_CREDS_ON:
> +                       config->override_creds = true;
> +                       break;
> +
> +               case OPT_OVERRIDE_CREDS_OFF:
> +                       config->override_creds = false;
> +                       break;
> +
>                 default:
>                         pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>                         return -EINVAL;
> @@ -1690,7 +1711,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>                        ovl_dentry_lower(root_dentry), NULL);
>
>         sb->s_root = root_dentry;
> -
>         return 0;
>
>  out_free_oe:
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 672459c3cff7..320aad599bcd 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -37,9 +37,17 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>  {
>         struct ovl_fs *ofs = sb->s_fs_info;
>
> +       if (!ofs->config.override_creds)
> +               return NULL;
>         return override_creds(ofs->creator_cred);
>  }
>
> +void ovl_revert_creds(const struct cred *old_cred)
> +{
> +       if (old_cred)
> +               revert_creds(old_cred);
> +}
> +

Mark,

Not sure if you have seen my "shutdown" patches:
https://lore.kernel.org/linux-fsdevel/20190715133839.9878-4-amir73il@gmail.com/

I am fine with this patch, but would like to request that you add @sb arg
to the ovl_revert_creds() helper, so it is more useful for other things in the
future that scope the underlying layers access (like shutdown).

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking
  2019-07-24 19:57 ` [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking Mark Salyzyn
@ 2019-07-25 11:00   ` Amir Goldstein
  2019-07-25 14:37     ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2019-07-25 11:00 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> Check impure, opaque, origin & meta xattr with no sepolicy audit
> (using __vfs_getxattr) since these operations are internal to
> overlayfs operations and do not disclose any data.  This became
> an issue for credential override off since sys_admin would have
> been required by the caller; whereas would have been inherently
> present for the creator since it performed the mount.
>
> This is a change in operations since we do not check in the new
> ovl_vfs_getxattr function if the credential override is off or
> not.  Reasoning is that the sepolicy check is unnecessary overhead,
> especially since the check can be expensive.

I don't know that this reasoning suffice to skip the sepolicy checks
for overlayfs private xattrs.
Can't sepolicy be defined to allow get access to trusted.overlay.*?

>
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Stephen Smalley <sds@tycho.nsa.gov>
> Cc: linux-unionfs@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kernel-team@android.com
> ---
> v10 - added to patch series
> ---
>  fs/overlayfs/namei.c     | 12 +++++++-----
>  fs/overlayfs/overlayfs.h |  2 ++
>  fs/overlayfs/util.c      | 24 +++++++++++++++---------
>  3 files changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 9702f0d5309d..fb6c0cd7b65f 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -106,10 +106,11 @@ int ovl_check_fh_len(struct ovl_fh *fh, int fh_len)
>
>  static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
>  {
> -       int res, err;
> +       ssize_t res;
> +       int err;
>         struct ovl_fh *fh = NULL;
>
> -       res = vfs_getxattr(dentry, name, NULL, 0);
> +       res = ovl_vfs_getxattr(dentry, name, NULL, 0);
>         if (res < 0) {
>                 if (res == -ENODATA || res == -EOPNOTSUPP)
>                         return NULL;
> @@ -123,7 +124,7 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
>         if (!fh)
>                 return ERR_PTR(-ENOMEM);
>
> -       res = vfs_getxattr(dentry, name, fh, res);
> +       res = ovl_vfs_getxattr(dentry, name, fh, res);
>         if (res < 0)
>                 goto fail;
>
> @@ -141,10 +142,11 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
>         return NULL;
>
>  fail:
> -       pr_warn_ratelimited("overlayfs: failed to get origin (%i)\n", res);
> +       pr_warn_ratelimited("overlayfs: failed to get origin (%zi)\n", res);
>         goto out;
>  invalid:
> -       pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", res, fh);
> +       pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n",
> +                           (int)res, fh);
>         goto out;
>  }
>
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 73a02a263fbc..82574684a9b6 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -205,6 +205,8 @@ int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
>  const struct cred *ovl_override_creds(struct super_block *sb);
> +ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
> +                        size_t size);
>  struct super_block *ovl_same_sb(struct super_block *sb);
>  int ovl_can_decode_fh(struct super_block *sb);
>  struct dentry *ovl_indexdir(struct super_block *sb);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index f5678a3f8350..672459c3cff7 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -40,6 +40,12 @@ const struct cred *ovl_override_creds(struct super_block *sb)
>         return override_creds(ofs->creator_cred);
>  }
>
> +ssize_t ovl_vfs_getxattr(struct dentry *dentry, const char *name, void *buf,
> +                        size_t size)
> +{
> +       return __vfs_getxattr(dentry, d_inode(dentry), name, buf, size);
> +}
> +

When introducing a new ovl_ => vfs_ wrapper, please follow the
ovl_do_XXX helpers
convention in overlayfs.h.

Note that those wrappers do not generally bypass security checks and
you have not
convinced me yet that skipping security checks on the overlayfs
private xattr get
is the right thing to do.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking
  2019-07-25 11:00   ` Amir Goldstein
@ 2019-07-25 14:37     ` Mark Salyzyn
  2019-07-25 15:51       ` Amir Goldstein
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-25 14:37 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

Thanks for the review.

On 7/25/19 4:00 AM, Amir Goldstein wrote:
> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> Check impure, opaque, origin & meta xattr with no sepolicy audit
>> (using __vfs_getxattr) since these operations are internal to
>> overlayfs operations and do not disclose any data.  This became
>> an issue for credential override off since sys_admin would have
>> been required by the caller; whereas would have been inherently
>> present for the creator since it performed the mount.
>>
>> This is a change in operations since we do not check in the new
>> ovl_vfs_getxattr function if the credential override is off or
>> not.  Reasoning is that the sepolicy check is unnecessary overhead,
>> especially since the check can be expensive.
> I don't know that this reasoning suffice to skip the sepolicy checks
> for overlayfs private xattrs.
> Can't sepolicy be defined to allow get access to trusted.overlay.*?

Because for override credentials off, _everyone_ would need it (at least 
on Android, the sole user AFAIK, and only on userdebug builds, not user 
builds), and if everyone is special, and possibly including the random 
applications we add from the play store, then no one is ...

For the override credentials on, the sepolicy would be required to add 
to init or other mounters so that callers can actually use overlayfs. 
Without the sepolicy for init, overlayfs will not function. the xattr 
are in the backing storage and the details are not exported outside of 
the driver. This would represent an imbalance since none of the callers 
would require the sepolicy adjustment for the ;normal' case, but for 
override credentials off as stated above, _everyone_ would require it.

Not against adding the sepolicy in Android, it is how we roll with only 
opening up credentials on an as-need basis. We could deny it on user 
(customer) builds and that closes a door that gains security. However 
our people are starting to resist userdebug being different from user so 
it may be a door I can not shut. Again felt like an imbalance for a 
trusted driver read only operation.

Sincerely -- Mark Salyzyn


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred
  2019-07-25  6:14   ` Amir Goldstein
@ 2019-07-25 14:38     ` Mark Salyzyn
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-25 14:38 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On 7/24/19 11:14 PM, Amir Goldstein wrote:
>> +void ovl_revert_creds(const struct cred *old_cred)
>> +{
>> +       if (old_cred)
>> +               revert_creds(old_cred);
>> +}
>> +
> Mark,
>
> Not sure if you have seen my "shutdown" patches:
> https://lore.kernel.org/linux-fsdevel/20190715133839.9878-4-amir73il@gmail.com/

Good to know!

>
> I am fine with this patch, but would like to request that you add @sb arg
> to the ovl_revert_creds() helper, so it is more useful for other things in the
> future that scope the underlying layers access (like shutdown).

Will respin and retest.

-- Mark


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-25  5:48   ` Amir Goldstein
@ 2019-07-25 15:03     ` Mark Salyzyn
  2019-07-25 15:43       ` Amir Goldstein
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-25 15:03 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On 7/24/19 10:48 PM, Amir Goldstein wrote:
> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> Because of the overlayfs getxattr recursion, the incoming inode fails
>> to update the selinux sid resulting in avc denials being reported
>> against a target context of u:object_r:unlabeled:s0.
> This description is too brief for me to understand the root problem.
> What's wring with the overlayfs getxattr recursion w.r.t the selinux
> security model?

__vfs_getxattr (the way the security layer acquires the target sid 
without recursing back to security to check the access permissions) 
calls get xattr method, which in overlayfs calls vfs_getxattr on the 
lower layer (which then recurses back to security to check permissions) 
and reports back -EACCES if there was a denial (which is OK) and _no_ 
sid copied to caller's inode security data, bubbles back to the security 
layer caller, which reports an invalid avc: message for 
u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for 
the lower filesystem target). The blocked access is 100% valid, it is 
supposed to be blocked. This does however result in a cosmetic issue 
that makes it impossible to use audit2allow to construct a rule that 
would be usable to fix the access problem.

This problem would exist for any (in tree or out-of-tree) union 
filesystems that need to reflect the __vfs_getxattr call into a 
__vfs_getxattr call to the underlying filesystem.

>
> Please give an example of your unprivileged mounter use case
> to explain.

The mounter merely does not have access to the targets in one of the 
referenced filesystems (for override creds = on)

In Android would be init, it does not have access to a subset of 
u:object_r:*_exec:s0 and u::objects_r:vendor*:s0 labels. Based on a need 
to access basis.

This gets worse if we add override creds = off, because the multitude of 
callers could be denied access, and rightfully so, and we would have no 
clue how to construct rules to grant them permissions using standard 
tools like audit2allow.

I will figure out how to explain this in the commit message ... but do 
tell me if I did not 'connect' you to the underlying problem so I can 
make it clear to _everyone_.

>
> CC Vivek because I could really never understand all this.
>
>> Solution is to add a _get xattr method that calls the __vfs_getxattr
>> handler so that the context can be read in, rather than being denied
>> with an -EACCES when vfs_getxattr handler is called.
>> . . .
>> @@ -972,6 +989,7 @@ static const struct xattr_handler ovl_own_xattr_handler = {
>>   static const struct xattr_handler ovl_other_xattr_handler = {
>>          .prefix = "", /* catch all */
>>          .get = ovl_other_xattr_get,
>> +       .__get = __ovl_other_xattr_get,
>>          .set = ovl_other_xattr_set,
>>   };
>>
>
> Not very professional of me to comment on the proposed solution
> without understanding the problem, but my nose says this cannot
> be the right solution and if it is, then you better find a much better
> name for the API then __get() and document it properly.

Yes __get (instead of the existing get which checks sepolicy) was my 
idea. get_wo_security was a close alternative.

We worked through 5 "solutions", this one privately appeared to please 
the security folks. In fact the solution was their suggestion because 
they noticed that __vfs_getxattr was meant to bypass sepolicy checking, 
yet when nested through overlayfs, it called vfs_getxattr for the lower 
filesystems and blocked necessary content (sid actually) from the upper 
call in order to properly log the denial. I had originally copied just 
the sid to the upper caller by adding layering violations that creeped 
them out ;-/


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-25 15:03     ` Mark Salyzyn
@ 2019-07-25 15:43       ` Amir Goldstein
  2019-07-25 16:22         ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2019-07-25 15:43 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> On 7/24/19 10:48 PM, Amir Goldstein wrote:
> > On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
> >> Because of the overlayfs getxattr recursion, the incoming inode fails
> >> to update the selinux sid resulting in avc denials being reported
> >> against a target context of u:object_r:unlabeled:s0.
> > This description is too brief for me to understand the root problem.
> > What's wring with the overlayfs getxattr recursion w.r.t the selinux
> > security model?
>
> __vfs_getxattr (the way the security layer acquires the target sid
> without recursing back to security to check the access permissions)
> calls get xattr method, which in overlayfs calls vfs_getxattr on the
> lower layer (which then recurses back to security to check permissions)
> and reports back -EACCES if there was a denial (which is OK) and _no_
> sid copied to caller's inode security data, bubbles back to the security
> layer caller, which reports an invalid avc: message for
> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for
> the lower filesystem target). The blocked access is 100% valid, it is
> supposed to be blocked. This does however result in a cosmetic issue
> that makes it impossible to use audit2allow to construct a rule that
> would be usable to fix the access problem.
>

Ahhh you are talking about getting the security.selinux.* xattrs?
I was under the impression (Vivek please correct me if I wrong)
that overlayfs objects cannot have individual security labels and
the only way to label overlayfs objects is by mount options on the
entire mount? Or is this just for lower layer objects?

Anyway, the API I would go for is adding a @flags argument to
get() which can take XATTR_NOSECURITY akin to
FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking
  2019-07-25 14:37     ` Mark Salyzyn
@ 2019-07-25 15:51       ` Amir Goldstein
  0 siblings, 0 replies; 19+ messages in thread
From: Amir Goldstein @ 2019-07-25 15:51 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Thu, Jul 25, 2019 at 5:37 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> Thanks for the review.
>
> On 7/25/19 4:00 AM, Amir Goldstein wrote:
> > On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
> >> Check impure, opaque, origin & meta xattr with no sepolicy audit
> >> (using __vfs_getxattr) since these operations are internal to
> >> overlayfs operations and do not disclose any data.  This became
> >> an issue for credential override off since sys_admin would have
> >> been required by the caller; whereas would have been inherently
> >> present for the creator since it performed the mount.
> >>
> >> This is a change in operations since we do not check in the new
> >> ovl_vfs_getxattr function if the credential override is off or
> >> not.  Reasoning is that the sepolicy check is unnecessary overhead,
> >> especially since the check can be expensive.
> > I don't know that this reasoning suffice to skip the sepolicy checks
> > for overlayfs private xattrs.
> > Can't sepolicy be defined to allow get access to trusted.overlay.*?
>
> Because for override credentials off, _everyone_ would need it (at least
> on Android, the sole user AFAIK, and only on userdebug builds, not user
> builds), and if everyone is special, and possibly including the random
> applications we add from the play store, then no one is ...
>

OK. I am convinced.

One weak argument in favor of the patch:
ecryptfs also uses __vfs_getxattr for private xattrs.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-25 15:43       ` Amir Goldstein
@ 2019-07-25 16:22         ` Mark Salyzyn
  2019-07-26  5:04           ` Amir Goldstein
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-25 16:22 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On 7/25/19 8:43 AM, Amir Goldstein wrote:
> On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> On 7/24/19 10:48 PM, Amir Goldstein wrote:
>>> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>>>> Because of the overlayfs getxattr recursion, the incoming inode fails
>>>> to update the selinux sid resulting in avc denials being reported
>>>> against a target context of u:object_r:unlabeled:s0.
>>> This description is too brief for me to understand the root problem.
>>> What's wring with the overlayfs getxattr recursion w.r.t the selinux
>>> security model?
>> __vfs_getxattr (the way the security layer acquires the target sid
>> without recursing back to security to check the access permissions)
>> calls get xattr method, which in overlayfs calls vfs_getxattr on the
>> lower layer (which then recurses back to security to check permissions)
>> and reports back -EACCES if there was a denial (which is OK) and _no_
>> sid copied to caller's inode security data, bubbles back to the security
>> layer caller, which reports an invalid avc: message for
>> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for
>> the lower filesystem target). The blocked access is 100% valid, it is
>> supposed to be blocked. This does however result in a cosmetic issue
>> that makes it impossible to use audit2allow to construct a rule that
>> would be usable to fix the access problem.
>>
> Ahhh you are talking about getting the security.selinux.* xattrs?
> I was under the impression (Vivek please correct me if I wrong)
> that overlayfs objects cannot have individual security labels and

They can, and we _need_ them for Android's use cases, upper and lower 
filesystems.

Some (most?) union filesystems (like Android's sdcardfs) set sepolicy 
from the mount options, we did not need this adjustment there of course.

> the only way to label overlayfs objects is by mount options on the
> entire mount? Or is this just for lower layer objects?
>
> Anyway, the API I would go for is adding a @flags argument to
> get() which can take XATTR_NOSECURITY akin to
> FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.

I do like it better (with the following 7 stages of grief below), best 
for the future.

The change in this handler's API will affect all filesystem drivers 
(well, my change affects the ABI, so it is not as-if I saved the world 
from a module recompile) touching all filesystem sources with an even 
larger audience of stakeholders. Larger audience of stakeholders, the 
harder to get the change in ;-/. This is also concerning since I would 
like this change to go to stable 4.4, 4.9, 4.14 and 4.19 where this 
regression got introduced. I can either craft specific stable patches or 
just let it go and deal with them in the android-common distributions 
rather than seeking stable merged down. ABI/API breaks are a problem for 
stable anyway ...

-- Mark


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-25 16:22         ` Mark Salyzyn
@ 2019-07-26  5:04           ` Amir Goldstein
  2019-07-26 18:30             ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2019-07-26  5:04 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On Thu, Jul 25, 2019 at 7:22 PM Mark Salyzyn <salyzyn@android.com> wrote:
>
> On 7/25/19 8:43 AM, Amir Goldstein wrote:
> > On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> wrote:
> >> On 7/24/19 10:48 PM, Amir Goldstein wrote:
> >>> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
> >>>> Because of the overlayfs getxattr recursion, the incoming inode fails
> >>>> to update the selinux sid resulting in avc denials being reported
> >>>> against a target context of u:object_r:unlabeled:s0.
> >>> This description is too brief for me to understand the root problem.
> >>> What's wring with the overlayfs getxattr recursion w.r.t the selinux
> >>> security model?
> >> __vfs_getxattr (the way the security layer acquires the target sid
> >> without recursing back to security to check the access permissions)
> >> calls get xattr method, which in overlayfs calls vfs_getxattr on the
> >> lower layer (which then recurses back to security to check permissions)
> >> and reports back -EACCES if there was a denial (which is OK) and _no_
> >> sid copied to caller's inode security data, bubbles back to the security
> >> layer caller, which reports an invalid avc: message for
> >> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for
> >> the lower filesystem target). The blocked access is 100% valid, it is
> >> supposed to be blocked. This does however result in a cosmetic issue
> >> that makes it impossible to use audit2allow to construct a rule that
> >> would be usable to fix the access problem.
> >>
> > Ahhh you are talking about getting the security.selinux.* xattrs?
> > I was under the impression (Vivek please correct me if I wrong)
> > that overlayfs objects cannot have individual security labels and
>
> They can, and we _need_ them for Android's use cases, upper and lower
> filesystems.
>
> Some (most?) union filesystems (like Android's sdcardfs) set sepolicy
> from the mount options, we did not need this adjustment there of course.
>
> > the only way to label overlayfs objects is by mount options on the
> > entire mount? Or is this just for lower layer objects?
> >
> > Anyway, the API I would go for is adding a @flags argument to
> > get() which can take XATTR_NOSECURITY akin to
> > FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.
>
> I do like it better (with the following 7 stages of grief below), best
> for the future.
>
> The change in this handler's API will affect all filesystem drivers
> (well, my change affects the ABI, so it is not as-if I saved the world
> from a module recompile) touching all filesystem sources with an even
> larger audience of stakeholders. Larger audience of stakeholders, the
> harder to get the change in ;-/. This is also concerning since I would
> like this change to go to stable 4.4, 4.9, 4.14 and 4.19 where this
> regression got introduced. I can either craft specific stable patches or
> just let it go and deal with them in the android-common distributions
> rather than seeking stable merged down. ABI/API breaks are a problem for
> stable anyway ...
>

Use the memalloc_nofs_save/restore design pattern will avoid all that
grief.
As a matter of fact, this issue could and should be handled inside security
subsystem without bothering any other subsystem.
LSM have per task context right? That context could carry the recursion
flags to know that the getxattr call is made by the security subsystem itself.
The problem is not limited to union filesystems.
In general its a stacking issue. ecryptfs is also a stacking fs, out-of-tree
shiftfs as well. But it doesn't end there.
A filesystem on top of a loop device inside another filesystem could
also maybe result in security hook recursion (not sure if in practice).

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-26  5:04           ` Amir Goldstein
@ 2019-07-26 18:30             ` Mark Salyzyn
  2019-07-30 15:55               ` Stephen Smalley
  0 siblings, 1 reply; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-26 18:30 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc

On 7/25/19 10:04 PM, Amir Goldstein wrote:
> On Thu, Jul 25, 2019 at 7:22 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> On 7/25/19 8:43 AM, Amir Goldstein wrote:
>>> On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> wrote:
>>>> On 7/24/19 10:48 PM, Amir Goldstein wrote:
>>>>> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn <salyzyn@android.com> wrote:
>>>>>> Because of the overlayfs getxattr recursion, the incoming inode fails
>>>>>> to update the selinux sid resulting in avc denials being reported
>>>>>> against a target context of u:object_r:unlabeled:s0.
>>>>> This description is too brief for me to understand the root problem.
>>>>> What's wring with the overlayfs getxattr recursion w.r.t the selinux
>>>>> security model?
>>>> __vfs_getxattr (the way the security layer acquires the target sid
>>>> without recursing back to security to check the access permissions)
>>>> calls get xattr method, which in overlayfs calls vfs_getxattr on the
>>>> lower layer (which then recurses back to security to check permissions)
>>>> and reports back -EACCES if there was a denial (which is OK) and _no_
>>>> sid copied to caller's inode security data, bubbles back to the security
>>>> layer caller, which reports an invalid avc: message for
>>>> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for
>>>> the lower filesystem target). The blocked access is 100% valid, it is
>>>> supposed to be blocked. This does however result in a cosmetic issue
>>>> that makes it impossible to use audit2allow to construct a rule that
>>>> would be usable to fix the access problem.
>>>>
>>> Ahhh you are talking about getting the security.selinux.* xattrs?
>>> I was under the impression (Vivek please correct me if I wrong)
>>> that overlayfs objects cannot have individual security labels and
>> They can, and we _need_ them for Android's use cases, upper and lower
>> filesystems.
>>
>> Some (most?) union filesystems (like Android's sdcardfs) set sepolicy
>> from the mount options, we did not need this adjustment there of course.
>>
>>> the only way to label overlayfs objects is by mount options on the
>>> entire mount? Or is this just for lower layer objects?
>>>
>>> Anyway, the API I would go for is adding a @flags argument to
>>> get() which can take XATTR_NOSECURITY akin to
>>> FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.
>> I do like it better (with the following 7 stages of grief below), best
>> for the future.
>>
>> The change in this handler's API will affect all filesystem drivers
>> (well, my change affects the ABI, so it is not as-if I saved the world
>> from a module recompile) touching all filesystem sources with an even
>> larger audience of stakeholders. Larger audience of stakeholders, the
>> harder to get the change in ;-/. This is also concerning since I would
>> like this change to go to stable 4.4, 4.9, 4.14 and 4.19 where this
>> regression got introduced. I can either craft specific stable patches or
>> just let it go and deal with them in the android-common distributions
>> rather than seeking stable merged down. ABI/API breaks are a problem for
>> stable anyway ...
>>
> Use the memalloc_nofs_save/restore design pattern will avoid all that
> grief.
> As a matter of fact, this issue could and should be handled inside security
> subsystem without bothering any other subsystem.
> LSM have per task context right? That context could carry the recursion
> flags to know that the getxattr call is made by the security subsystem itself.
> The problem is not limited to union filesystems.
> In general its a stacking issue. ecryptfs is also a stacking fs, out-of-tree
> shiftfs as well. But it doesn't end there.
> A filesystem on top of a loop device inside another filesystem could
> also maybe result in security hook recursion (not sure if in practice).
>
> Thanks,
> Amir.

Good point, back to Stephen Smalley?

There are four __vfs_getxattr calls inside security, not sure I see any 
natural way to determine the recursion in security/selinux I can 
beg/borrow/steal from; but I get the strange feeling that it is better 
to detect recursion in __vfs_getxattr in this manner, and switch out 
checking in vfs_getxattr since it is localized to just fs/xattr.c. 
selinux might not be the only user of __vfs_getxattr nature ...

I have implemented and tested the solution where we add a flag to the 
.get method, it works. I would be tempted to submit that instead in case 
someone in the future can imagine using that flag argument to solve 
other problem(s) (if you build it, they will come).

<flips coin>

Will add a new per-process flag that __vfs_getxattr and vfs_getxattr 
plays with and see how it works and what it looks like.

Sincerely -- Mark Salyzyn


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-26 18:30             ` Mark Salyzyn
@ 2019-07-30 15:55               ` Stephen Smalley
  2019-07-30 16:54                 ` Mark Salyzyn
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Smalley @ 2019-07-30 15:55 UTC (permalink / raw)
  To: Mark Salyzyn, Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, overlayfs,
	linux-doc

On 7/26/19 2:30 PM, Mark Salyzyn wrote:
> On 7/25/19 10:04 PM, Amir Goldstein wrote:
>> On Thu, Jul 25, 2019 at 7:22 PM Mark Salyzyn <salyzyn@android.com> wrote:
>>> On 7/25/19 8:43 AM, Amir Goldstein wrote:
>>>> On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> 
>>>> wrote:
>>>>> On 7/24/19 10:48 PM, Amir Goldstein wrote:
>>>>>> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn 
>>>>>> <salyzyn@android.com> wrote:
>>>>>>> Because of the overlayfs getxattr recursion, the incoming inode 
>>>>>>> fails
>>>>>>> to update the selinux sid resulting in avc denials being reported
>>>>>>> against a target context of u:object_r:unlabeled:s0.
>>>>>> This description is too brief for me to understand the root problem.
>>>>>> What's wring with the overlayfs getxattr recursion w.r.t the selinux
>>>>>> security model?
>>>>> __vfs_getxattr (the way the security layer acquires the target sid
>>>>> without recursing back to security to check the access permissions)
>>>>> calls get xattr method, which in overlayfs calls vfs_getxattr on the
>>>>> lower layer (which then recurses back to security to check 
>>>>> permissions)
>>>>> and reports back -EACCES if there was a denial (which is OK) and _no_
>>>>> sid copied to caller's inode security data, bubbles back to the 
>>>>> security
>>>>> layer caller, which reports an invalid avc: message for
>>>>> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid for
>>>>> the lower filesystem target). The blocked access is 100% valid, it is
>>>>> supposed to be blocked. This does however result in a cosmetic issue
>>>>> that makes it impossible to use audit2allow to construct a rule that
>>>>> would be usable to fix the access problem.
>>>>>
>>>> Ahhh you are talking about getting the security.selinux.* xattrs?
>>>> I was under the impression (Vivek please correct me if I wrong)
>>>> that overlayfs objects cannot have individual security labels and
>>> They can, and we _need_ them for Android's use cases, upper and lower
>>> filesystems.
>>>
>>> Some (most?) union filesystems (like Android's sdcardfs) set sepolicy
>>> from the mount options, we did not need this adjustment there of course.
>>>
>>>> the only way to label overlayfs objects is by mount options on the
>>>> entire mount? Or is this just for lower layer objects?
>>>>
>>>> Anyway, the API I would go for is adding a @flags argument to
>>>> get() which can take XATTR_NOSECURITY akin to
>>>> FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.
>>> I do like it better (with the following 7 stages of grief below), best
>>> for the future.
>>>
>>> The change in this handler's API will affect all filesystem drivers
>>> (well, my change affects the ABI, so it is not as-if I saved the world
>>> from a module recompile) touching all filesystem sources with an even
>>> larger audience of stakeholders. Larger audience of stakeholders, the
>>> harder to get the change in ;-/. This is also concerning since I would
>>> like this change to go to stable 4.4, 4.9, 4.14 and 4.19 where this
>>> regression got introduced. I can either craft specific stable patches or
>>> just let it go and deal with them in the android-common distributions
>>> rather than seeking stable merged down. ABI/API breaks are a problem for
>>> stable anyway ...
>>>
>> Use the memalloc_nofs_save/restore design pattern will avoid all that
>> grief.
>> As a matter of fact, this issue could and should be handled inside 
>> security
>> subsystem without bothering any other subsystem.
>> LSM have per task context right? That context could carry the recursion
>> flags to know that the getxattr call is made by the security subsystem 
>> itself.
>> The problem is not limited to union filesystems.
>> In general its a stacking issue. ecryptfs is also a stacking fs, 
>> out-of-tree
>> shiftfs as well. But it doesn't end there.
>> A filesystem on top of a loop device inside another filesystem could
>> also maybe result in security hook recursion (not sure if in practice).
>>
>> Thanks,
>> Amir.
> 
> Good point, back to Stephen Smalley?
> 
> There are four __vfs_getxattr calls inside security, not sure I see any 
> natural way to determine the recursion in security/selinux I can 
> beg/borrow/steal from; but I get the strange feeling that it is better 
> to detect recursion in __vfs_getxattr in this manner, and switch out 
> checking in vfs_getxattr since it is localized to just fs/xattr.c. 
> selinux might not be the only user of __vfs_getxattr nature ...
> 
> I have implemented and tested the solution where we add a flag to the 
> .get method, it works. I would be tempted to submit that instead in case 
> someone in the future can imagine using that flag argument to solve 
> other problem(s) (if you build it, they will come).
> 
> <flips coin>
> 
> Will add a new per-process flag that __vfs_getxattr and vfs_getxattr 
> plays with and see how it works and what it looks like.

As you say, SELinux is not the only user of __vfs_getxattr; in addition 
to the other security modules, there is the integrity/evm subsystem and 
ecryptfs.  Further, __vfs_getxattr does not merely skip 
LSM/SELinux-related processing; it also skips xattr_permission().  As 
such, I don't believe this is something that can be solved entirely 
within the security subsystem.

Not excited about a process flag to implicitly disable LSM/SELinux and 
other security-related processing on a code path; potential for abuse is 
high.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v10 3/5] overlayfs: add __get xattr method
  2019-07-30 15:55               ` Stephen Smalley
@ 2019-07-30 16:54                 ` Mark Salyzyn
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Salyzyn @ 2019-07-30 16:54 UTC (permalink / raw)
  To: Stephen Smalley, Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, overlayfs,
	linux-doc

On 7/30/19 8:55 AM, Stephen Smalley wrote:
> On 7/26/19 2:30 PM, Mark Salyzyn wrote:
>> On 7/25/19 10:04 PM, Amir Goldstein wrote:
>>> On Thu, Jul 25, 2019 at 7:22 PM Mark Salyzyn <salyzyn@android.com> 
>>> wrote:
>>>> On 7/25/19 8:43 AM, Amir Goldstein wrote:
>>>>> On Thu, Jul 25, 2019 at 6:03 PM Mark Salyzyn <salyzyn@android.com> 
>>>>> wrote:
>>>>>> On 7/24/19 10:48 PM, Amir Goldstein wrote:
>>>>>>> On Wed, Jul 24, 2019 at 10:57 PM Mark Salyzyn 
>>>>>>> <salyzyn@android.com> wrote:
>>>>>>>> Because of the overlayfs getxattr recursion, the incoming inode 
>>>>>>>> fails
>>>>>>>> to update the selinux sid resulting in avc denials being reported
>>>>>>>> against a target context of u:object_r:unlabeled:s0.
>>>>>>> This description is too brief for me to understand the root 
>>>>>>> problem.
>>>>>>> What's wring with the overlayfs getxattr recursion w.r.t the 
>>>>>>> selinux
>>>>>>> security model?
>>>>>> __vfs_getxattr (the way the security layer acquires the target sid
>>>>>> without recursing back to security to check the access permissions)
>>>>>> calls get xattr method, which in overlayfs calls vfs_getxattr on the
>>>>>> lower layer (which then recurses back to security to check 
>>>>>> permissions)
>>>>>> and reports back -EACCES if there was a denial (which is OK) and 
>>>>>> _no_
>>>>>> sid copied to caller's inode security data, bubbles back to the 
>>>>>> security
>>>>>> layer caller, which reports an invalid avc: message for
>>>>>> u:object_r:unlabeled:s0 (the uninitialized sid instead of the sid 
>>>>>> for
>>>>>> the lower filesystem target). The blocked access is 100% valid, 
>>>>>> it is
>>>>>> supposed to be blocked. This does however result in a cosmetic issue
>>>>>> that makes it impossible to use audit2allow to construct a rule that
>>>>>> would be usable to fix the access problem.
>>>>>>
>>>>> Ahhh you are talking about getting the security.selinux.* xattrs?
>>>>> I was under the impression (Vivek please correct me if I wrong)
>>>>> that overlayfs objects cannot have individual security labels and
>>>> They can, and we _need_ them for Android's use cases, upper and lower
>>>> filesystems.
>>>>
>>>> Some (most?) union filesystems (like Android's sdcardfs) set sepolicy
>>>> from the mount options, we did not need this adjustment there of 
>>>> course.
>>>>
>>>>> the only way to label overlayfs objects is by mount options on the
>>>>> entire mount? Or is this just for lower layer objects?
>>>>>
>>>>> Anyway, the API I would go for is adding a @flags argument to
>>>>> get() which can take XATTR_NOSECURITY akin to
>>>>> FMODE_NONOTIFY, GFP_NOFS, meant to avoid recursions.
>>>> I do like it better (with the following 7 stages of grief below), best
>>>> for the future.
>>>>
>>>> The change in this handler's API will affect all filesystem drivers
>>>> (well, my change affects the ABI, so it is not as-if I saved the world
>>>> from a module recompile) touching all filesystem sources with an even
>>>> larger audience of stakeholders. Larger audience of stakeholders, the
>>>> harder to get the change in ;-/. This is also concerning since I would
>>>> like this change to go to stable 4.4, 4.9, 4.14 and 4.19 where this
>>>> regression got introduced. I can either craft specific stable 
>>>> patches or
>>>> just let it go and deal with them in the android-common distributions
>>>> rather than seeking stable merged down. ABI/API breaks are a 
>>>> problem for
>>>> stable anyway ...
>>>>
>>> Use the memalloc_nofs_save/restore design pattern will avoid all that
>>> grief.
>>> As a matter of fact, this issue could and should be handled inside 
>>> security
>>> subsystem without bothering any other subsystem.
>>> LSM have per task context right? That context could carry the recursion
>>> flags to know that the getxattr call is made by the security 
>>> subsystem itself.
>>> The problem is not limited to union filesystems.
>>> In general its a stacking issue. ecryptfs is also a stacking fs, 
>>> out-of-tree
>>> shiftfs as well. But it doesn't end there.
>>> A filesystem on top of a loop device inside another filesystem could
>>> also maybe result in security hook recursion (not sure if in practice).
>>>
>>> Thanks,
>>> Amir.
>>
>> Good point, back to Stephen Smalley?
>>
>> There are four __vfs_getxattr calls inside security, not sure I see 
>> any natural way to determine the recursion in security/selinux I can 
>> beg/borrow/steal from; but I get the strange feeling that it is 
>> better to detect recursion in __vfs_getxattr in this manner, and 
>> switch out checking in vfs_getxattr since it is localized to just 
>> fs/xattr.c. selinux might not be the only user of __vfs_getxattr 
>> nature ...
>>
>> I have implemented and tested the solution where we add a flag to the 
>> .get method, it works. I would be tempted to submit that instead in 
>> case someone in the future can imagine using that flag argument to 
>> solve other problem(s) (if you build it, they will come).
>>
>> <flips coin>
>>
>> Will add a new per-process flag that __vfs_getxattr and vfs_getxattr 
>> plays with and see how it works and what it looks like.
>
> As you say, SELinux is not the only user of __vfs_getxattr; in 
> addition to the other security modules, there is the integrity/evm 
> subsystem and ecryptfs.  Further, __vfs_getxattr does not merely skip 
> LSM/SELinux-related processing; it also skips xattr_permission().  As 
> such, I don't believe this is something that can be solved entirely 
> within the security subsystem.
>
> Not excited about a process flag to implicitly disable LSM/SELinux and 
> other security-related processing on a code path; potential for abuse 
> is high.

So you will not like my solution in "[PATCH v11 2/5] fs: __vfs_getxattr 
nesting paradigm"sent out this morning; so adding the flag option and 
widespread touching of _all_ the filesystem xattr.c/acl.c/inode.c/etc 
files to the calls is probably the easiest to stomach with the lowest 
attack surface.

Any other ideas (with less impact to tons of API/ABI/filesystems) that 
we have not thought about before I spin a v12 patch set?

-- Mark


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-07-30 16:54 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-24 19:57 [PATCH v10 0/2] overlayfs override_creds=off Mark Salyzyn
2019-07-24 19:57 ` [PATCH v10 1/5] overlayfs: check CAP_DAC_READ_SEARCH before issuing exportfs_decode_fh Mark Salyzyn
2019-07-24 19:57 ` [PATCH v10 2/5] Add optional __get xattr method paired to __vfs_getxattr Mark Salyzyn
2019-07-24 19:57 ` [PATCH v10 3/5] overlayfs: add __get xattr method Mark Salyzyn
2019-07-25  5:48   ` Amir Goldstein
2019-07-25 15:03     ` Mark Salyzyn
2019-07-25 15:43       ` Amir Goldstein
2019-07-25 16:22         ` Mark Salyzyn
2019-07-26  5:04           ` Amir Goldstein
2019-07-26 18:30             ` Mark Salyzyn
2019-07-30 15:55               ` Stephen Smalley
2019-07-30 16:54                 ` Mark Salyzyn
2019-07-24 19:57 ` [PATCH v10 4/5] overlayfs: internal getxattr operations without sepolicy checking Mark Salyzyn
2019-07-25 11:00   ` Amir Goldstein
2019-07-25 14:37     ` Mark Salyzyn
2019-07-25 15:51       ` Amir Goldstein
2019-07-24 19:57 ` [PATCH v10 5/5] overlayfs: override_creds=off option bypass creator_cred Mark Salyzyn
2019-07-25  6:14   ` Amir Goldstein
2019-07-25 14:38     ` Mark Salyzyn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).