ecryptfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] fs: tweak and switch more fses to private mounts
@ 2021-04-14 12:37 Christian Brauner
  2021-04-14 12:37 ` [PATCH 1/7] namespace: fix clone_private_mount() kernel doc Christian Brauner
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Hey,

Since [1] we support creating private mounts from a given path's
vfsmount. This makes them very suitable for any filesystem or
filesystem functionality that piggybacks on paths of another filesystem.
Overlayfs, cachefiles, and ecryptfs are three prime examples.

Since private mounts aren't attached in the filesystem they aren't
affected by mount property changes after the respective fs makes use of
them. This seems a rather desirable property as the underlying path
can't e.g. suddenly go from read-write to read-only and in general it
means that the fs is always in full control of the underlying mount
after the user has allowed it to be used (apart from operations that
affect the superblock of course).

Besides that it also makes things simpler for a variety of other vfs
features. One concrete example is fanotify. When the path->mnt of the
path that is used as a cache has been marked with FAN_MARK_MOUNT the
semantics get tricky as it isn't clear whether the watchers of path->mnt
should get notified about fsnotify events when files are created by
ecryptfs via path->mnt. Using a private mount lets us handle this case
too and aligns the behavior of stacks created by overlayfs.

Thanks!
Christian

[1]: c771d683a62e ("vfs: introduce clone_private_mount()")

Christian Brauner (7):
  namespace: fix clone_private_mount() kernel doc
  namespace: add kernel doc for mnt_clone_internal()
  namespace: move unbindable check out of clone_private_mount()
  cachefiles: switch to using a private mount
  cachefiles: extend ro check to private mount
  ecryptfs: switch to using a private mount
  ecryptfs: extend ro check to private mount

 fs/cachefiles/bind.c          | 41 +++++++++++++++++++++++++----------
 fs/ecryptfs/dentry.c          |  6 ++++-
 fs/ecryptfs/ecryptfs_kernel.h |  9 ++++++++
 fs/ecryptfs/inode.c           |  5 ++++-
 fs/ecryptfs/main.c            | 31 +++++++++++++++++++++-----
 fs/namespace.c                | 36 ++++++++++++++++++++++++------
 fs/overlayfs/super.c          | 13 +++++++++--
 7 files changed, 113 insertions(+), 28 deletions(-)


base-commit: e49d033bddf5b565044e2abe4241353959bc9120
-- 
2.27.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/7] namespace: fix clone_private_mount() kernel doc
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-14 12:37 ` [PATCH 2/7] namespace: add kernel doc for mnt_clone_internal() Christian Brauner
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Extend the kernel documentation for clone_private_mount(). Add some more
detailed info about its usage and convert it into proper kernel doc.

Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/namespace.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 56bb5a5fdc0d..02f415061efe 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1939,12 +1939,21 @@ void drop_collected_mounts(struct vfsmount *mnt)
 
 /**
  * clone_private_mount - create a private clone of a path
+ * @path: path from which the mnt to clone will be taken
  *
- * This creates a new vfsmount, which will be the clone of @path.  The new will
- * not be attached anywhere in the namespace and will be private (i.e. changes
- * to the originating mount won't be propagated into this).
+ * This creates a new vfsmount, which will be a clone of @path's vfsmount.
  *
- * Release with mntput().
+ * In contrast to mnt_clone_internal() the new mount will not be marked
+ * MNT_INTERNAL but will have MNT_NS_INTERNAL attached as its mount namespace
+ * making it suitable for long-term mounts since mntput()ing it will always hit
+ * the fastpath as long as kern_unmount() hasn't been called.
+ *
+ * Since the mount is not reachable anwyhere mount properties and propagation
+ * properties remain stable, i.e. cannot change.
+ *
+ * Useable with mntget()/mntput() but needs to be released with kern_unmount().
+ *
+ * Return: A clone of @path's vfsmount on success, an error pointer on failure.
  */
 struct vfsmount *clone_private_mount(const struct path *path)
 {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/7] namespace: add kernel doc for mnt_clone_internal()
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
  2021-04-14 12:37 ` [PATCH 1/7] namespace: fix clone_private_mount() kernel doc Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-14 12:37 ` [PATCH 3/7] namespace: move unbindable check out of clone_private_mount() Christian Brauner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Document mnt_clone_internal().

Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/namespace.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 02f415061efe..7ffefa8b3980 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1271,6 +1271,22 @@ bool path_is_mountpoint(const struct path *path)
 }
 EXPORT_SYMBOL(path_is_mountpoint);
 
+/**
+ * mnt_clone_internal - create a private clone of a path
+ * @path: path from which the mnt to clone will be taken
+ *
+ * This creates a new vfsmount, which will be a clone of @path's vfsmount.
+ *
+ * In contrast to clone_private_mount() the new mount will be marked
+ * MNT_INTERNAL and will note have any mount namespace attached making it
+ * suitable for short-lived internal mounts since mntput()ing it will always
+ * hit the slowpath taking the mount lock.
+ *
+ * Since the mount is not reachable anwyhere mount properties and propagation
+ * properties remain stable, i.e. cannot change.
+ *
+ * Return: A clone of @path's vfsmount on success, an error pointer on failure.
+ */
 struct vfsmount *mnt_clone_internal(const struct path *path)
 {
 	struct mount *p;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/7] namespace: move unbindable check out of clone_private_mount()
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
  2021-04-14 12:37 ` [PATCH 1/7] namespace: fix clone_private_mount() kernel doc Christian Brauner
  2021-04-14 12:37 ` [PATCH 2/7] namespace: add kernel doc for mnt_clone_internal() Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-14 12:37 ` [PATCH 4/7] cachefiles: switch to using a private mount Christian Brauner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

We're about to switch all filesystems that stack on top or otherwise use
a struct path of another filesystem to use clone_private_mount() in the
following commits. Most of these filesystems like ecryptfs and
cachefiles don't need the MS_UNBDINDABLE check that overlayfs currently
wants. So move the check out of clone_private_mount() and into overlayfs
itself.

Note that overlayfs can probably be switched to not rely on the
MS_UNBDINDABLE check too but for now keep it.

[1]: df820f8de4e4 ("ovl: make private mounts longterm")
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/namespace.c       |  3 ---
 fs/overlayfs/super.c | 13 +++++++++++--
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 7ffefa8b3980..f6efe1272b9d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1976,9 +1976,6 @@ struct vfsmount *clone_private_mount(const struct path *path)
 	struct mount *old_mnt = real_mount(path->mnt);
 	struct mount *new_mnt;
 
-	if (IS_MNT_UNBINDABLE(old_mnt))
-		return ERR_PTR(-EINVAL);
-
 	new_mnt = clone_mnt(old_mnt, path->dentry, CL_PRIVATE);
 	if (IS_ERR(new_mnt))
 		return ERR_CAST(new_mnt);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index fdd72f1a9c5e..c942bb1073f6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
 #include <linux/exportfs.h>
+#include "../pnode.h"
 #include "overlayfs.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
@@ -1175,6 +1176,14 @@ static int ovl_report_in_use(struct ovl_fs *ofs, const char *name)
 	}
 }
 
+static inline struct vfsmount *ovl_clone_private_mount(const struct path *path)
+{
+	if (IS_MNT_UNBINDABLE(real_mount(path->mnt)))
+		return ERR_PTR(-EINVAL);
+
+	return clone_private_mount(path);
+}
+
 static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs,
 			 struct ovl_layer *upper_layer, struct path *upperpath)
 {
@@ -1201,7 +1210,7 @@ static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs,
 	if (err)
 		goto out;
 
-	upper_mnt = clone_private_mount(upperpath);
+	upper_mnt = ovl_clone_private_mount(upperpath);
 	err = PTR_ERR(upper_mnt);
 	if (IS_ERR(upper_mnt)) {
 		pr_err("failed to clone upperpath\n");
@@ -1700,7 +1709,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
 			}
 		}
 
-		mnt = clone_private_mount(&stack[i]);
+		mnt = ovl_clone_private_mount(&stack[i]);
 		err = PTR_ERR(mnt);
 		if (IS_ERR(mnt)) {
 			pr_err("failed to clone lowerpath\n");
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/7] cachefiles: switch to using a private mount
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
                   ` (2 preceding siblings ...)
  2021-04-14 12:37 ` [PATCH 3/7] namespace: move unbindable check out of clone_private_mount() Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-14 12:37 ` [PATCH 5/7] cachefiles: extend ro check to " Christian Brauner
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Since [1] we support creating private mounts from a given path's
vfsmount. This makes them very suitable for any filesystem or
filesystem functionality that piggybacks on paths of another filesystem.
Overlayfs and ecryptfs (which I'll port next) are just two such
examples.

Without trying to imply to many similarities cachefiles have one thing
in common with stacking filesystems namely that they also stack on top
of existing paths. These paths are then used as caches for a netfs.
Since private mounts aren't attached in the filesystem the aren't
affected by mount property changes after cachefiles makes use of them.
This seems a rather desirable property as the underlying path can't e.g.
suddenly go from read-write to read-only and in general it means that
cachefiles is always in full control of the underlying mount after the
user has allowed it to be used as a cache (apart from operations that
affect the superblock of course).

Besides that - and probably irrelevant from the perspective of a
cachefiles developer - it also makes things simpler for a variety of
other vfs features. One concrete example is fanotify. When the path->mnt
of the path that is used as a cache has been marked with FAN_MARK_MOUNT
the semantics get tricky as it isn't clear whether the watchers of
path->mnt should get notified about fsnotify events when files are
created by cachefilesd via path->mnt. Using a private mount let's us
elegantly handle this case too and aligns the behavior of stacks created
by overlayfs.

Reading through the codebase cachefiles currently takes path->mnt and
stashes it in cache->mnt. Everytime a cache object needs to be created,
looked-up, or in some other form interacted with cachefiles will create
a custom path comprised of cache->mnt and the relevant dentry it is
interested in:

struct path cachefiles_path = {
        .mnt = cache->mnt,
        .dentry = dentry,
};

So cachefiles already passes the cache->mnt through everywhere so
supporting private mounts with cachefiles is pretty simply. Instead of
recording path->mnt in cache->mnt we simply record a new private mount
we created as a copy of path->mnt via clone_private_mount() in
cache->mnt. The rest is cleanly handled by cachefiles already.

I have tested this patch with afs:

systemctl stop cachefilesd
sudo mount --bind /var/cache/fscache /var/cache/fscache
systemctl start cachefilesd

sudo apt install kafs-client
systemctl start afs.mount
ls -al /afs
ls -al /afs/grand.central.org/software/openafs/1.9.0
md5sum /afs/grand.central.org/software/openafs/1.9.0/openafs-1.9.0-doc.tar.bz2

cat /proc/fs/fscache/stats | grep [1-9]
  Cookies: idx=148 dat=35 spc=0
  Objects: alc=41 nal=0 avl=41 ded=0
  Pages  : mrk=934 unc=0
  Acquire: n=183 nul=0 noc=0 ok=183 nbf=0 oom=0
  Lookups: n=41 neg=41 pos=0 crt=41 tmo=0
  Retrvls: n=19 ok=0 wt=1 nod=19 nbf=0 int=0 oom=0
  Retrvls: ops=19 owt=0 abt=0
  Stores : n=934 ok=934 agn=0 nbf=0 oom=0
  Stores : ops=62 run=996 pgs=934 rxd=934 olm=0
  Ops    : pend=0 run=81 enq=996 can=0 rej=0
  Ops    : ini=953 dfr=0 rel=953 gc=0

umount /afs/grand.central.org
md5sum /afs/grand.central.org/software/openafs/1.9.0/openafs-1.9.0-doc.tar.bz2

cat /proc/fs/fscache/stats | grep [1-9]
  Cookies: idx=152 dat=60 spc=0
  Objects: alc=70 nal=0 avl=70 ded=39
  ChkAux : non=0 ok=25 upd=0 obs=0
  Pages  : mrk=1868 unc=934
  Acquire: n=212 nul=0 noc=0 ok=212 nbf=0 oom=0
  Lookups: n=70 neg=41 pos=29 crt=41 tmo=0
  Relinqs: n=39 nul=0 wcr=0 rtr=0
  Retrvls: n=38 ok=19 wt=2 nod=19 nbf=0 int=0 oom=0
  Retrvls: ops=38 owt=0 abt=0
  Stores : n=934 ok=934 agn=0 nbf=0 oom=0
  Stores : ops=62 run=996 pgs=934 rxd=934 olm=0
  Ops    : pend=0 run=100 enq=996 can=0 rej=0
  Ops    : ini=972 dfr=0 rel=972 gc=0

[1]: c771d683a62e ("vfs: introduce clone_private_mount()")
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cachefs@redhat.com
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/cachefiles/bind.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 38bb7764b454..7ef572d698f0 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -81,7 +81,7 @@ int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args)
 static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 {
 	struct cachefiles_object *fsdef;
-	struct path path;
+	struct path path, cache_path;
 	struct kstatfs stats;
 	struct dentry *graveyard, *cachedir, *root;
 	const struct cred *saved_cred;
@@ -115,16 +115,23 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	if (ret < 0)
 		goto error_open_root;
 
-	cache->mnt = path.mnt;
-	root = path.dentry;
-
-	ret = -EINVAL;
 	if (mnt_user_ns(path.mnt) != &init_user_ns) {
+		ret = -EINVAL;
+		cache->mnt = NULL;
 		pr_warn("File cache on idmapped mounts not supported");
 		goto error_unsupported;
 	}
 
+	cache->mnt = clone_private_mount(&path);
+	if (IS_ERR(cache->mnt)) {
+		ret = PTR_ERR(cache->mnt);
+		cache->mnt = NULL;
+		pr_warn("Failed to create private mount for file cache\n");
+		goto error_unsupported;
+	}
+
 	/* check parameters */
+	root = path.dentry;
 	ret = -EOPNOTSUPP;
 	if (d_is_negative(root) ||
 	    !d_backing_inode(root)->i_op->lookup ||
@@ -144,8 +151,10 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	if (ret < 0)
 		goto error_unsupported;
 
+	cache_path.dentry = path.dentry;
+	cache_path.mnt = cache->mnt;
 	/* get the cache size and blocksize */
-	ret = vfs_statfs(&path, &stats);
+	ret = vfs_statfs(&cache_path, &stats);
 	if (ret < 0)
 		goto error_unsupported;
 
@@ -229,7 +238,12 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 
 	/* done */
 	set_bit(CACHEFILES_READY, &cache->flags);
-	dput(root);
+
+	/*
+	 * We've created a private mount and we've stashed our "cache" and
+	 * "graveyard" dentries so we don't need the path anymore.
+	 */
+	path_put(&path);
 
 	pr_info("File cache on %s registered\n", cache->cache.identifier);
 
@@ -242,11 +256,11 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	dput(cache->graveyard);
 	cache->graveyard = NULL;
 error_unsupported:
-	mntput(cache->mnt);
+	path_put(&path);
+	kern_unmount(cache->mnt);
 	cache->mnt = NULL;
 	dput(fsdef->dentry);
 	fsdef->dentry = NULL;
-	dput(root);
 error_open_root:
 	kmem_cache_free(cachefiles_object_jar, fsdef);
 error_root_object:
@@ -270,7 +284,7 @@ void cachefiles_daemon_unbind(struct cachefiles_cache *cache)
 	}
 
 	dput(cache->graveyard);
-	mntput(cache->mnt);
+	kern_unmount(cache->mnt);
 
 	kfree(cache->rootdirname);
 	kfree(cache->secctx);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/7] cachefiles: extend ro check to private mount
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
                   ` (3 preceding siblings ...)
  2021-04-14 12:37 ` [PATCH 4/7] cachefiles: switch to using a private mount Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-14 12:37 ` [PATCH 6/7] ecryptfs: switch to using a " Christian Brauner
  2021-04-14 12:37 ` [PATCH 7/7] ecryptfs: extend ro check to " Christian Brauner
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

So far cachefiles only verified that the superblock wasn't read-only but
didn't check whether the mount was. This made sense when we did not use
a private mount because the read-only state could change at any point.

Now that we have a private mount and mount properties can't change
behind our back extend the read-only check to include the vfsmount.

The __mnt_is_readonly() helper will check both the mount and the
superblock.  Note that before we checked root->d_sb and now we check
mnt->mnt_sb but since we have a matching <vfsmount, dentry> pair here
this is only syntactical change, not a semantic one.

Here's how this works:

mount -o ro --bind /var/cache/fscache/ /var/cache/fscache/

systemctl start cachefilesd
  Job for cachefilesd.service failed because the control process exited with error code.
  See "systemctl status cachefilesd.service" and "journalctl -xe" for details.

dmesg | grep CacheFiles
  [    2.922514] CacheFiles: Loaded
  [  272.206907] CacheFiles: Failed to register: -30

errno 30
  EROFS 30 Read-only file system

Cc: David Howells <dhowells@redhat.com>
Cc: linux-cachefs@redhat.com
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/cachefiles/bind.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c
index 7ef572d698f0..8cf283de4e14 100644
--- a/fs/cachefiles/bind.c
+++ b/fs/cachefiles/bind.c
@@ -141,8 +141,13 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	    !root->d_sb->s_op->sync_fs)
 		goto error_unsupported;
 
+	/*
+	 * Verify our mount and superblock aren't read-only.
+	 * Note, while our private mount is guaranteed to not change anymore
+	 * the superblock may still go read-only later.
+	 */
 	ret = -EROFS;
-	if (sb_rdonly(root->d_sb))
+	if (__mnt_is_readonly(cache->mnt))
 		goto error_unsupported;
 
 	/* determine the security of the on-disk cache as this governs
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/7] ecryptfs: switch to using a private mount
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
                   ` (4 preceding siblings ...)
  2021-04-14 12:37 ` [PATCH 5/7] cachefiles: extend ro check to " Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  2021-04-19  5:01   ` Tyler Hicks
  2021-04-14 12:37 ` [PATCH 7/7] ecryptfs: extend ro check to " Christian Brauner
  6 siblings, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

Since [1] we support creating private mounts from a given path's
vfsmount. This makes them very suitable for any filesystem or
filesystem functionality that piggybacks on paths of another filesystem.
Overlayfs, cachefiles, and ecryptfs are three prime examples.

Since private mounts aren't attached in the filesystem they aren't
affected by mount property changes after ecryptfs makes use of them.
This seems a rather desirable property as the underlying path can't e.g.
suddenly go from read-write to read-only and in general it means that
ecryptfs is always in full control of the underlying mount after the
user has allowed it to be used (apart from operations that affect the
superblock of course).

Besides that it also makes things simpler for a variety of other vfs
features. One concrete example is fanotify. When the path->mnt of the
path that is used as a cache has been marked with FAN_MARK_MOUNT the
semantics get tricky as it isn't clear whether the watchers of path->mnt
should get notified about fsnotify events when files are created by
ecryptfs via path->mnt. Using a private mount let's us elegantly
handle this case too and aligns the behavior of stacks created by
overlayfs and cachefiles.

This change comes with a proper simplification in how ecryptfs currently
handles the lower_path it stashes as private information in its
dentries. Currently it always does:

        ecryptfs_set_dentry_private(dentry, dentry_info);
        dentry_info->lower_path.mnt = mntget(path->mnt);
        dentry_info->lower_path.dentry = lower_dentry;

and then during .d_relase() in ecryptfs_d_release():

        path_put(&p->lower_path);

which is odd since afaict path->mnt is guaranteed to be the mnt stashed
during ecryptfs_mount():

        ecryptfs_set_dentry_private(s->s_root, root_info);
        root_info->lower_path = path;

So that mntget() seems somewhat pointless but there might be reasons
that I'm missing in how the interpose logic for ecryptfs works.

While switching to a long-term private mount via clone_private_mount()
let's get rid of the gratuitous mntget() and mntput()/path_put().
Instead, stash away the private mount in ecryptfs' s_fs_info and call
kern_unmount() in .kill_sb() so we only take the mntput() hit once.

I've added a WARN_ON_ONCE() into ecryptfs_lookup_interpose() triggering
if the stashed private mount and the path's mount don't match. I think
that would be a proper bug even without that clone_private_mount()
change in this patch.

[1]: c771d683a62e ("vfs: introduce clone_private_mount()")
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Tyler Hicks <code@tyhicks.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: ecryptfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/ecryptfs/dentry.c          |  6 +++++-
 fs/ecryptfs/ecryptfs_kernel.h |  9 +++++++++
 fs/ecryptfs/inode.c           |  5 ++++-
 fs/ecryptfs/main.c            | 29 ++++++++++++++++++++++++-----
 4 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
index 44606f079efb..e5edafa165d4 100644
--- a/fs/ecryptfs/dentry.c
+++ b/fs/ecryptfs/dentry.c
@@ -67,7 +67,11 @@ static void ecryptfs_d_release(struct dentry *dentry)
 {
 	struct ecryptfs_dentry_info *p = dentry->d_fsdata;
 	if (p) {
-		path_put(&p->lower_path);
+		/*
+		 * p->lower_path.mnt is a private mount which will be released
+		 * when the superblock shuts down so we only need to dput here.
+		 */
+		dput(p->lower_path.dentry);
 		call_rcu(&p->rcu, ecryptfs_dentry_free_rcu);
 	}
 }
diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
index e6ac78c62ca4..f89d0f7bb3fe 100644
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -352,6 +352,7 @@ struct ecryptfs_mount_crypt_stat {
 struct ecryptfs_sb_info {
 	struct super_block *wsi_sb;
 	struct ecryptfs_mount_crypt_stat mount_crypt_stat;
+	struct vfsmount *mnt;
 };
 
 /* file private data. */
@@ -496,6 +497,14 @@ ecryptfs_set_superblock_lower(struct super_block *sb,
 	((struct ecryptfs_sb_info *)sb->s_fs_info)->wsi_sb = lower_sb;
 }
 
+static inline void
+ecryptfs_set_superblock_lower_mnt(struct super_block *sb,
+				  struct vfsmount *mnt)
+{
+	struct ecryptfs_sb_info *sbi = sb->s_fs_info;
+	sbi->mnt = mnt;
+}
+
 static inline struct ecryptfs_dentry_info *
 ecryptfs_dentry_to_private(struct dentry *dentry)
 {
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 18e9285fbb4c..204df4bf476d 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -324,6 +324,7 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
 				     struct dentry *lower_dentry)
 {
 	struct path *path = ecryptfs_dentry_to_lower_path(dentry->d_parent);
+	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(dentry->d_sb);
 	struct inode *inode, *lower_inode;
 	struct ecryptfs_dentry_info *dentry_info;
 	int rc = 0;
@@ -339,7 +340,9 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
 	BUG_ON(!d_count(lower_dentry));
 
 	ecryptfs_set_dentry_private(dentry, dentry_info);
-	dentry_info->lower_path.mnt = mntget(path->mnt);
+	/* Warn if we somehow ended up with an unexpected path. */
+	WARN_ON_ONCE(path->mnt != sb_info->mnt);
+	dentry_info->lower_path.mnt = path->mnt;
 	dentry_info->lower_path.dentry = lower_dentry;
 
 	/*
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index cdf40a54a35d..3ba2c0f349a3 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -476,6 +476,7 @@ static struct file_system_type ecryptfs_fs_type;
 static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags,
 			const char *dev_name, void *raw_data)
 {
+	struct vfsmount *mnt;
 	struct super_block *s;
 	struct ecryptfs_sb_info *sbi;
 	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
@@ -537,6 +538,16 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 		goto out_free;
 	}
 
+	mnt = clone_private_mount(&path);
+	if (IS_ERR(mnt)) {
+		rc = PTR_ERR(mnt);
+		pr_warn("Failed to create private mount for ecryptfs\n");
+		goto out_free;
+	}
+
+	/* Record our long-term lower mount. */
+	ecryptfs_set_superblock_lower_mnt(s, mnt);
+
 	if (check_ruid && !uid_eq(d_inode(path.dentry)->i_uid, current_uid())) {
 		rc = -EPERM;
 		printk(KERN_ERR "Mount of device (uid: %d) not owned by "
@@ -590,9 +601,15 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 	if (!root_info)
 		goto out_free;
 
+	/* Use our private mount from now on. */
+	root_info->lower_path.mnt = mnt;
+	root_info->lower_path.dentry = dget(path.dentry);
+
+	/* We created a private clone of this mount above so drop the path. */
+	path_put(&path);
+
 	/* ->kill_sb() will take care of root_info */
 	ecryptfs_set_dentry_private(s->s_root, root_info);
-	root_info->lower_path = path;
 
 	s->s_flags |= SB_ACTIVE;
 	return dget(s->s_root);
@@ -619,11 +636,13 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 static void ecryptfs_kill_block_super(struct super_block *sb)
 {
 	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(sb);
+
 	kill_anon_super(sb);
-	if (!sb_info)
-		return;
-	ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
-	kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
+	if (sb_info) {
+		kern_unmount(sb_info->mnt);
+		ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
+		kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
+	}
 }
 
 static struct file_system_type ecryptfs_fs_type = {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/7] ecryptfs: extend ro check to private mount
  2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
                   ` (5 preceding siblings ...)
  2021-04-14 12:37 ` [PATCH 6/7] ecryptfs: switch to using a " Christian Brauner
@ 2021-04-14 12:37 ` Christian Brauner
  6 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2021-04-14 12:37 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Amir Goldstein, Christoph Hellwig, Tyler Hicks, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

From: Christian Brauner <christian.brauner@ubuntu.com>

So far ecryptfs only verified that the superblock wasn't read-only but
didn't check whether the mount was. This made sense when we did not use
a private mount because the read-only state could change at any point.

Now that we have a private mount and mount properties can't change
behind our back extend the read-only check to include the vfsmount.

The __mnt_is_readonly() helper will check both the mount and the
superblock.  Note that before we checked root->d_sb and now we check
mnt->mnt_sb but since we have a matching <vfsmount, dentry> pair here
this is only syntactical change, not a semantic one.

Overlayfs and cachefiles have been changed to check this as well.

Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Tyler Hicks <code@tyhicks.com>
Cc: ecryptfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 fs/ecryptfs/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 3ba2c0f349a3..4e5aeec91e95 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -571,7 +571,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 	 *   1) The lower mount is ro
 	 *   2) The ecryptfs_encrypted_view mount option is specified
 	 */
-	if (sb_rdonly(path.dentry->d_sb) || mount_crypt_stat->flags & ECRYPTFS_ENCRYPTED_VIEW_ENABLED)
+	if (__mnt_is_readonly(mnt) || mount_crypt_stat->flags & ECRYPTFS_ENCRYPTED_VIEW_ENABLED)
 		s->s_flags |= SB_RDONLY;
 
 	s->s_maxbytes = path.dentry->d_sb->s_maxbytes;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 6/7] ecryptfs: switch to using a private mount
  2021-04-14 12:37 ` [PATCH 6/7] ecryptfs: switch to using a " Christian Brauner
@ 2021-04-19  5:01   ` Tyler Hicks
  2021-04-22 20:46     ` Tyler Hicks
  0 siblings, 1 reply; 10+ messages in thread
From: Tyler Hicks @ 2021-04-19  5:01 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-fsdevel, Amir Goldstein, Christoph Hellwig, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

On 2021-04-14 14:37:50, Christian Brauner wrote:
> From: Christian Brauner <christian.brauner@ubuntu.com>
> 
> Since [1] we support creating private mounts from a given path's
> vfsmount. This makes them very suitable for any filesystem or
> filesystem functionality that piggybacks on paths of another filesystem.
> Overlayfs, cachefiles, and ecryptfs are three prime examples.
> 
> Since private mounts aren't attached in the filesystem they aren't
> affected by mount property changes after ecryptfs makes use of them.
> This seems a rather desirable property as the underlying path can't e.g.
> suddenly go from read-write to read-only and in general it means that
> ecryptfs is always in full control of the underlying mount after the
> user has allowed it to be used (apart from operations that affect the
> superblock of course).
> 
> Besides that it also makes things simpler for a variety of other vfs
> features. One concrete example is fanotify. When the path->mnt of the
> path that is used as a cache has been marked with FAN_MARK_MOUNT the
> semantics get tricky as it isn't clear whether the watchers of path->mnt
> should get notified about fsnotify events when files are created by
> ecryptfs via path->mnt. Using a private mount let's us elegantly
> handle this case too and aligns the behavior of stacks created by
> overlayfs and cachefiles.
> 
> This change comes with a proper simplification in how ecryptfs currently
> handles the lower_path it stashes as private information in its
> dentries. Currently it always does:
> 
>         ecryptfs_set_dentry_private(dentry, dentry_info);
>         dentry_info->lower_path.mnt = mntget(path->mnt);
>         dentry_info->lower_path.dentry = lower_dentry;
> 
> and then during .d_relase() in ecryptfs_d_release():
> 
>         path_put(&p->lower_path);
> 
> which is odd since afaict path->mnt is guaranteed to be the mnt stashed
> during ecryptfs_mount():
> 
>         ecryptfs_set_dentry_private(s->s_root, root_info);
>         root_info->lower_path = path;
> 
> So that mntget() seems somewhat pointless but there might be reasons
> that I'm missing in how the interpose logic for ecryptfs works.
> 
> While switching to a long-term private mount via clone_private_mount()
> let's get rid of the gratuitous mntget() and mntput()/path_put().
> Instead, stash away the private mount in ecryptfs' s_fs_info and call
> kern_unmount() in .kill_sb() so we only take the mntput() hit once.
> 
> I've added a WARN_ON_ONCE() into ecryptfs_lookup_interpose() triggering
> if the stashed private mount and the path's mount don't match. I think
> that would be a proper bug even without that clone_private_mount()
> change in this patch.
> 
> [1]: c771d683a62e ("vfs: introduce clone_private_mount()")
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Tyler Hicks <code@tyhicks.com>
> Cc: Miklos Szeredi <mszeredi@redhat.com>
> Cc: ecryptfs@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

This patch and the following one both look technically correct to me. I
do want to spend a little time manually testing these changes, though.
I'm hoping to have that done by end of day Tuesday.

Tyler

> ---
>  fs/ecryptfs/dentry.c          |  6 +++++-
>  fs/ecryptfs/ecryptfs_kernel.h |  9 +++++++++
>  fs/ecryptfs/inode.c           |  5 ++++-
>  fs/ecryptfs/main.c            | 29 ++++++++++++++++++++++++-----
>  4 files changed, 42 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
> index 44606f079efb..e5edafa165d4 100644
> --- a/fs/ecryptfs/dentry.c
> +++ b/fs/ecryptfs/dentry.c
> @@ -67,7 +67,11 @@ static void ecryptfs_d_release(struct dentry *dentry)
>  {
>  	struct ecryptfs_dentry_info *p = dentry->d_fsdata;
>  	if (p) {
> -		path_put(&p->lower_path);
> +		/*
> +		 * p->lower_path.mnt is a private mount which will be released
> +		 * when the superblock shuts down so we only need to dput here.
> +		 */
> +		dput(p->lower_path.dentry);
>  		call_rcu(&p->rcu, ecryptfs_dentry_free_rcu);
>  	}
>  }
> diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
> index e6ac78c62ca4..f89d0f7bb3fe 100644
> --- a/fs/ecryptfs/ecryptfs_kernel.h
> +++ b/fs/ecryptfs/ecryptfs_kernel.h
> @@ -352,6 +352,7 @@ struct ecryptfs_mount_crypt_stat {
>  struct ecryptfs_sb_info {
>  	struct super_block *wsi_sb;
>  	struct ecryptfs_mount_crypt_stat mount_crypt_stat;
> +	struct vfsmount *mnt;
>  };
>  
>  /* file private data. */
> @@ -496,6 +497,14 @@ ecryptfs_set_superblock_lower(struct super_block *sb,
>  	((struct ecryptfs_sb_info *)sb->s_fs_info)->wsi_sb = lower_sb;
>  }
>  
> +static inline void
> +ecryptfs_set_superblock_lower_mnt(struct super_block *sb,
> +				  struct vfsmount *mnt)
> +{
> +	struct ecryptfs_sb_info *sbi = sb->s_fs_info;
> +	sbi->mnt = mnt;
> +}
> +
>  static inline struct ecryptfs_dentry_info *
>  ecryptfs_dentry_to_private(struct dentry *dentry)
>  {
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 18e9285fbb4c..204df4bf476d 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -324,6 +324,7 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
>  				     struct dentry *lower_dentry)
>  {
>  	struct path *path = ecryptfs_dentry_to_lower_path(dentry->d_parent);
> +	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(dentry->d_sb);
>  	struct inode *inode, *lower_inode;
>  	struct ecryptfs_dentry_info *dentry_info;
>  	int rc = 0;
> @@ -339,7 +340,9 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
>  	BUG_ON(!d_count(lower_dentry));
>  
>  	ecryptfs_set_dentry_private(dentry, dentry_info);
> -	dentry_info->lower_path.mnt = mntget(path->mnt);
> +	/* Warn if we somehow ended up with an unexpected path. */
> +	WARN_ON_ONCE(path->mnt != sb_info->mnt);
> +	dentry_info->lower_path.mnt = path->mnt;
>  	dentry_info->lower_path.dentry = lower_dentry;
>  
>  	/*
> diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> index cdf40a54a35d..3ba2c0f349a3 100644
> --- a/fs/ecryptfs/main.c
> +++ b/fs/ecryptfs/main.c
> @@ -476,6 +476,7 @@ static struct file_system_type ecryptfs_fs_type;
>  static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags,
>  			const char *dev_name, void *raw_data)
>  {
> +	struct vfsmount *mnt;
>  	struct super_block *s;
>  	struct ecryptfs_sb_info *sbi;
>  	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
> @@ -537,6 +538,16 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
>  		goto out_free;
>  	}
>  
> +	mnt = clone_private_mount(&path);
> +	if (IS_ERR(mnt)) {
> +		rc = PTR_ERR(mnt);
> +		pr_warn("Failed to create private mount for ecryptfs\n");
> +		goto out_free;
> +	}
> +
> +	/* Record our long-term lower mount. */
> +	ecryptfs_set_superblock_lower_mnt(s, mnt);
> +
>  	if (check_ruid && !uid_eq(d_inode(path.dentry)->i_uid, current_uid())) {
>  		rc = -EPERM;
>  		printk(KERN_ERR "Mount of device (uid: %d) not owned by "
> @@ -590,9 +601,15 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
>  	if (!root_info)
>  		goto out_free;
>  
> +	/* Use our private mount from now on. */
> +	root_info->lower_path.mnt = mnt;
> +	root_info->lower_path.dentry = dget(path.dentry);
> +
> +	/* We created a private clone of this mount above so drop the path. */
> +	path_put(&path);
> +
>  	/* ->kill_sb() will take care of root_info */
>  	ecryptfs_set_dentry_private(s->s_root, root_info);
> -	root_info->lower_path = path;
>  
>  	s->s_flags |= SB_ACTIVE;
>  	return dget(s->s_root);
> @@ -619,11 +636,13 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
>  static void ecryptfs_kill_block_super(struct super_block *sb)
>  {
>  	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(sb);
> +
>  	kill_anon_super(sb);
> -	if (!sb_info)
> -		return;
> -	ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
> -	kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
> +	if (sb_info) {
> +		kern_unmount(sb_info->mnt);
> +		ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
> +		kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
> +	}
>  }
>  
>  static struct file_system_type ecryptfs_fs_type = {
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 6/7] ecryptfs: switch to using a private mount
  2021-04-19  5:01   ` Tyler Hicks
@ 2021-04-22 20:46     ` Tyler Hicks
  0 siblings, 0 replies; 10+ messages in thread
From: Tyler Hicks @ 2021-04-22 20:46 UTC (permalink / raw)
  To: Christian Brauner, John Johansen
  Cc: linux-fsdevel, Amir Goldstein, Christoph Hellwig, David Howells,
	Miklos Szeredi, Al Viro, ecryptfs, linux-cachefs,
	Christian Brauner

On 2021-04-19 00:01:28, Tyler Hicks wrote:
> On 2021-04-14 14:37:50, Christian Brauner wrote:
> > From: Christian Brauner <christian.brauner@ubuntu.com>
> > 
> > Since [1] we support creating private mounts from a given path's
> > vfsmount. This makes them very suitable for any filesystem or
> > filesystem functionality that piggybacks on paths of another filesystem.
> > Overlayfs, cachefiles, and ecryptfs are three prime examples.
> > 
> > Since private mounts aren't attached in the filesystem they aren't
> > affected by mount property changes after ecryptfs makes use of them.
> > This seems a rather desirable property as the underlying path can't e.g.
> > suddenly go from read-write to read-only and in general it means that
> > ecryptfs is always in full control of the underlying mount after the
> > user has allowed it to be used (apart from operations that affect the
> > superblock of course).
> > 
> > Besides that it also makes things simpler for a variety of other vfs
> > features. One concrete example is fanotify. When the path->mnt of the
> > path that is used as a cache has been marked with FAN_MARK_MOUNT the
> > semantics get tricky as it isn't clear whether the watchers of path->mnt
> > should get notified about fsnotify events when files are created by
> > ecryptfs via path->mnt. Using a private mount let's us elegantly
> > handle this case too and aligns the behavior of stacks created by
> > overlayfs and cachefiles.
> > 
> > This change comes with a proper simplification in how ecryptfs currently
> > handles the lower_path it stashes as private information in its
> > dentries. Currently it always does:
> > 
> >         ecryptfs_set_dentry_private(dentry, dentry_info);
> >         dentry_info->lower_path.mnt = mntget(path->mnt);
> >         dentry_info->lower_path.dentry = lower_dentry;
> > 
> > and then during .d_relase() in ecryptfs_d_release():
> > 
> >         path_put(&p->lower_path);
> > 
> > which is odd since afaict path->mnt is guaranteed to be the mnt stashed
> > during ecryptfs_mount():
> > 
> >         ecryptfs_set_dentry_private(s->s_root, root_info);
> >         root_info->lower_path = path;
> > 
> > So that mntget() seems somewhat pointless but there might be reasons
> > that I'm missing in how the interpose logic for ecryptfs works.
> > 
> > While switching to a long-term private mount via clone_private_mount()
> > let's get rid of the gratuitous mntget() and mntput()/path_put().
> > Instead, stash away the private mount in ecryptfs' s_fs_info and call
> > kern_unmount() in .kill_sb() so we only take the mntput() hit once.
> > 
> > I've added a WARN_ON_ONCE() into ecryptfs_lookup_interpose() triggering
> > if the stashed private mount and the path's mount don't match. I think
> > that would be a proper bug even without that clone_private_mount()
> > change in this patch.
> > 
> > [1]: c771d683a62e ("vfs: introduce clone_private_mount()")
> > Cc: Amir Goldstein <amir73il@gmail.com>
> > Cc: Tyler Hicks <code@tyhicks.com>
> > Cc: Miklos Szeredi <mszeredi@redhat.com>
> > Cc: ecryptfs@vger.kernel.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> 
> This patch and the following one both look technically correct to me. I
> do want to spend a little time manually testing these changes, though.
> I'm hoping to have that done by end of day Tuesday.

Hey Christian - I've finished testing these eCryptfs changes. I really
like some of the new properties:

 - As you mentioned, the ability for the lower mount to go read-only and
   not affect the upper mount is nice. I verified this with:
    $ echo foo > /upper/foo
    $ mount -o remount,bind,ro /lower
    $ echo bar > /upper/bar
 - I like that the lower filesystem can be entirely unmounted after the
   eCryptfs mount has been set up. This could be useful if someone
   wanted to prevent tampering with the lower filesystem while the
   eCryptfs mount is active.

Unfortunately, I think there may be temporary a blocker here. eCryptfs'
main user base has historically been Ubuntu users since Ubuntu embraced
it for encrypted home directories years ago. While encrypted home
directories are no longer supported out-of-the-box in new Ubuntu
releases, I suspect that's still the distro that most eCryptfs users use
today.

We know that AppArmor is the default LSM in use in Ubuntu. However,
AppArmor does not know how to handle private mounts. This was most
recently discussed in the unprivileged overlay mounts thread:

 https://lore.kernel.org/linux-security-module/9b8236eb-b3c4-6e0f-edb8-833172c7c2c7@canonical.com/

When an AppArmor confined process is interacting with an eCryptfs mount,
I see disconnected path denials from AppArmor for the lower paths when
your private mount patches are applied:

 audit: type=1400 audit(1619121587.568:50): apparmor="DENIED" operation="open" info="Failed name lookup - disconnected path" error=-13 profile="privatemnt" name="foo" pid=5992 comm="bash" requested_mask="wr" denied_mask="wr" fsuid=1000 ouid=1000

I'd rather wait for AppArmor to better handle private mounts because I
think existing users will definitely see a negative impact from these
changes and I don't think that the positive user-facing impacts outweigh
the negative.

I don't see any related changes in the AppArmor tree but I've cc'ed John
in case he's made any progress here.

Tyler

> Tyler
> 
> > ---
> >  fs/ecryptfs/dentry.c          |  6 +++++-
> >  fs/ecryptfs/ecryptfs_kernel.h |  9 +++++++++
> >  fs/ecryptfs/inode.c           |  5 ++++-
> >  fs/ecryptfs/main.c            | 29 ++++++++++++++++++++++++-----
> >  4 files changed, 42 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
> > index 44606f079efb..e5edafa165d4 100644
> > --- a/fs/ecryptfs/dentry.c
> > +++ b/fs/ecryptfs/dentry.c
> > @@ -67,7 +67,11 @@ static void ecryptfs_d_release(struct dentry *dentry)
> >  {
> >  	struct ecryptfs_dentry_info *p = dentry->d_fsdata;
> >  	if (p) {
> > -		path_put(&p->lower_path);
> > +		/*
> > +		 * p->lower_path.mnt is a private mount which will be released
> > +		 * when the superblock shuts down so we only need to dput here.
> > +		 */
> > +		dput(p->lower_path.dentry);
> >  		call_rcu(&p->rcu, ecryptfs_dentry_free_rcu);
> >  	}
> >  }
> > diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
> > index e6ac78c62ca4..f89d0f7bb3fe 100644
> > --- a/fs/ecryptfs/ecryptfs_kernel.h
> > +++ b/fs/ecryptfs/ecryptfs_kernel.h
> > @@ -352,6 +352,7 @@ struct ecryptfs_mount_crypt_stat {
> >  struct ecryptfs_sb_info {
> >  	struct super_block *wsi_sb;
> >  	struct ecryptfs_mount_crypt_stat mount_crypt_stat;
> > +	struct vfsmount *mnt;
> >  };
> >  
> >  /* file private data. */
> > @@ -496,6 +497,14 @@ ecryptfs_set_superblock_lower(struct super_block *sb,
> >  	((struct ecryptfs_sb_info *)sb->s_fs_info)->wsi_sb = lower_sb;
> >  }
> >  
> > +static inline void
> > +ecryptfs_set_superblock_lower_mnt(struct super_block *sb,
> > +				  struct vfsmount *mnt)
> > +{
> > +	struct ecryptfs_sb_info *sbi = sb->s_fs_info;
> > +	sbi->mnt = mnt;
> > +}
> > +
> >  static inline struct ecryptfs_dentry_info *
> >  ecryptfs_dentry_to_private(struct dentry *dentry)
> >  {
> > diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> > index 18e9285fbb4c..204df4bf476d 100644
> > --- a/fs/ecryptfs/inode.c
> > +++ b/fs/ecryptfs/inode.c
> > @@ -324,6 +324,7 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
> >  				     struct dentry *lower_dentry)
> >  {
> >  	struct path *path = ecryptfs_dentry_to_lower_path(dentry->d_parent);
> > +	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(dentry->d_sb);
> >  	struct inode *inode, *lower_inode;
> >  	struct ecryptfs_dentry_info *dentry_info;
> >  	int rc = 0;
> > @@ -339,7 +340,9 @@ static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
> >  	BUG_ON(!d_count(lower_dentry));
> >  
> >  	ecryptfs_set_dentry_private(dentry, dentry_info);
> > -	dentry_info->lower_path.mnt = mntget(path->mnt);
> > +	/* Warn if we somehow ended up with an unexpected path. */
> > +	WARN_ON_ONCE(path->mnt != sb_info->mnt);
> > +	dentry_info->lower_path.mnt = path->mnt;
> >  	dentry_info->lower_path.dentry = lower_dentry;
> >  
> >  	/*
> > diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> > index cdf40a54a35d..3ba2c0f349a3 100644
> > --- a/fs/ecryptfs/main.c
> > +++ b/fs/ecryptfs/main.c
> > @@ -476,6 +476,7 @@ static struct file_system_type ecryptfs_fs_type;
> >  static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags,
> >  			const char *dev_name, void *raw_data)
> >  {
> > +	struct vfsmount *mnt;
> >  	struct super_block *s;
> >  	struct ecryptfs_sb_info *sbi;
> >  	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
> > @@ -537,6 +538,16 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
> >  		goto out_free;
> >  	}
> >  
> > +	mnt = clone_private_mount(&path);
> > +	if (IS_ERR(mnt)) {
> > +		rc = PTR_ERR(mnt);
> > +		pr_warn("Failed to create private mount for ecryptfs\n");
> > +		goto out_free;
> > +	}
> > +
> > +	/* Record our long-term lower mount. */
> > +	ecryptfs_set_superblock_lower_mnt(s, mnt);
> > +
> >  	if (check_ruid && !uid_eq(d_inode(path.dentry)->i_uid, current_uid())) {
> >  		rc = -EPERM;
> >  		printk(KERN_ERR "Mount of device (uid: %d) not owned by "
> > @@ -590,9 +601,15 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
> >  	if (!root_info)
> >  		goto out_free;
> >  
> > +	/* Use our private mount from now on. */
> > +	root_info->lower_path.mnt = mnt;
> > +	root_info->lower_path.dentry = dget(path.dentry);
> > +
> > +	/* We created a private clone of this mount above so drop the path. */
> > +	path_put(&path);
> > +
> >  	/* ->kill_sb() will take care of root_info */
> >  	ecryptfs_set_dentry_private(s->s_root, root_info);
> > -	root_info->lower_path = path;
> >  
> >  	s->s_flags |= SB_ACTIVE;
> >  	return dget(s->s_root);
> > @@ -619,11 +636,13 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
> >  static void ecryptfs_kill_block_super(struct super_block *sb)
> >  {
> >  	struct ecryptfs_sb_info *sb_info = ecryptfs_superblock_to_private(sb);
> > +
> >  	kill_anon_super(sb);
> > -	if (!sb_info)
> > -		return;
> > -	ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
> > -	kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
> > +	if (sb_info) {
> > +		kern_unmount(sb_info->mnt);
> > +		ecryptfs_destroy_mount_crypt_stat(&sb_info->mount_crypt_stat);
> > +		kmem_cache_free(ecryptfs_sb_info_cache, sb_info);
> > +	}
> >  }
> >  
> >  static struct file_system_type ecryptfs_fs_type = {
> > -- 
> > 2.27.0
> > 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-04-22 20:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-14 12:37 [PATCH 0/7] fs: tweak and switch more fses to private mounts Christian Brauner
2021-04-14 12:37 ` [PATCH 1/7] namespace: fix clone_private_mount() kernel doc Christian Brauner
2021-04-14 12:37 ` [PATCH 2/7] namespace: add kernel doc for mnt_clone_internal() Christian Brauner
2021-04-14 12:37 ` [PATCH 3/7] namespace: move unbindable check out of clone_private_mount() Christian Brauner
2021-04-14 12:37 ` [PATCH 4/7] cachefiles: switch to using a private mount Christian Brauner
2021-04-14 12:37 ` [PATCH 5/7] cachefiles: extend ro check to " Christian Brauner
2021-04-14 12:37 ` [PATCH 6/7] ecryptfs: switch to using a " Christian Brauner
2021-04-19  5:01   ` Tyler Hicks
2021-04-22 20:46     ` Tyler Hicks
2021-04-14 12:37 ` [PATCH 7/7] ecryptfs: extend ro check to " Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).