linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-unionfs@vger.kernel.org, linux-cachefs@redhat.com,
	dhowells@redhat.com, Christoph Hellwig <hch@infradead.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Jeff Layton <jlayton@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	torvalds@linux-foundation.org, linux-unionfs@vger.kernel.org,
	linux-cachefs@redhat.com, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 3/5] cachefiles: Split removal-prevention from S_KERNEL_FILE and extend effects
Date: Mon, 31 Jan 2022 15:13:52 +0000	[thread overview]
Message-ID: <164364203196.1476539.12557909603374068848.stgit@warthog.procyon.org.uk> (raw)
In-Reply-To: <164364196407.1476539.8450117784231043601.stgit@warthog.procyon.org.uk>

Split removal-prevention from the S_KERNEL_FILE flag as it's really a
separate job and it should affect unlnk and rename too, not just rmdir[1].
This new flag is called I_NO_REMOVE and moved to inode->i_state.

If this is set, the file or directory may not be removed, renamed or
unlinked.  This can then be used by cachefiles to prevent userspace
removing or renaming files and directories that the are being used.  It
could also be used by overlayfs to prevent fiddling in its work
directories.

The directory layout in its cache is very important to cachefiles as the
names are how it indexes the contents.  Removing objects can cause
cachefilesd to malfunction as it may find it can't reach bits of the
structure that it previously created and still has dentry pointers to.

This also closes a race between cachefilesd trying to cull an empty
directory and cachefiles trying to create something in it.

Amir Goldstein suggested that the check in vfs_rmdir() should be moved to
may_delete()[1], but it really needs to be done whilst the inode lock is
held.

I_NO_REMOVE should only be test/set/cleared under the inode lock without
the lock being dropped between that and the VFS operations that might be
affected by it lest races occur.

Note also that I_NO_REMOVE will prevent vfs_unlink() vfs_rmdir() and
vfs_rename() from operating on a file.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Amir Goldstein <amir73il@gmail.com>
cc: Miklos Szeredi <miklos@szeredi.hu>
cc: linux-unionfs@vger.kernel.org
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/CAOQ4uxjEcvffv=rNXS-r+NLz+=6yk4abRuX_AMq9v-M4nf_PtA@mail.gmail.com [1]
---

 fs/cachefiles/namei.c |   12 ++++++++++--
 fs/namei.c            |    8 +++++---
 include/linux/fs.h    |    4 ++++
 3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index f256c8aff7bb..8930c767d93a 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -20,8 +20,10 @@ static bool __cachefiles_mark_inode_in_use(struct cachefiles_object *object,
 	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
+	spin_lock(&inode->i_lock);
 	if (!(inode->i_flags & S_KERNEL_FILE)) {
 		inode->i_flags |= S_KERNEL_FILE;
+		inode->i_state |= I_NO_REMOVE;
 		trace_cachefiles_mark_active(object, inode);
 		can_use = true;
 	} else {
@@ -29,6 +31,7 @@ static bool __cachefiles_mark_inode_in_use(struct cachefiles_object *object,
 		pr_notice("cachefiles: Inode already in use: %pd (B=%lx)\n",
 			  dentry, inode->i_ino);
 	}
+	spin_unlock(&inode->i_lock);
 
 	return can_use;
 }
@@ -53,7 +56,10 @@ static void __cachefiles_unmark_inode_in_use(struct cachefiles_object *object,
 {
 	struct inode *inode = d_backing_inode(dentry);
 
+	spin_lock(&inode->i_lock);
 	inode->i_flags &= ~S_KERNEL_FILE;
+	inode->i_state &= ~I_NO_REMOVE;
+	spin_unlock(&inode->i_lock);
 	trace_cachefiles_mark_inactive(object, inode);
 }
 
@@ -392,8 +398,10 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 		};
 		trace_cachefiles_rename(object, d_inode(rep)->i_ino, why);
 		ret = cachefiles_inject_read_error();
-		if (ret == 0)
+		if (ret == 0) {
+			__cachefiles_unmark_inode_in_use(object, rep);
 			ret = vfs_rename(&rd);
+		}
 		if (ret != 0)
 			trace_cachefiles_vfs_error(object, d_inode(dir), ret,
 						   cachefiles_trace_rename_error);
@@ -402,7 +410,6 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
 					    "Rename failed with error %d", ret);
 	}
 
-	__cachefiles_unmark_inode_in_use(object, rep);
 	unlock_rename(cache->graveyard, dir);
 	dput(grave);
 	_leave(" = 0");
@@ -426,6 +433,7 @@ int cachefiles_delete_object(struct cachefiles_object *object,
 	dget(dentry);
 
 	inode_lock_nested(d_backing_inode(fan), I_MUTEX_PARENT);
+	cachefiles_unmark_inode_in_use(object, object->file);
 	ret = cachefiles_unlink(volume->cache, object, fan, dentry, why);
 	inode_unlock(d_backing_inode(fan));
 	dput(dentry);
diff --git a/fs/namei.c b/fs/namei.c
index 3f1829b3ab5b..ea17377794d3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4009,7 +4009,7 @@ int vfs_rmdir(struct user_namespace *mnt_userns, struct inode *dir,
 
 	error = -EBUSY;
 	if (is_local_mountpoint(dentry) ||
-	    (dentry->d_inode->i_flags & S_KERNEL_FILE))
+	    (dentry->d_inode->i_flags & I_NO_REMOVE))
 		goto out;
 
 	error = security_inode_rmdir(dir, dentry);
@@ -4139,7 +4139,8 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
 	inode_lock(target);
 	if (IS_SWAPFILE(target))
 		error = -EPERM;
-	else if (is_local_mountpoint(dentry))
+	else if (is_local_mountpoint(dentry) ||
+		 (dentry->d_inode->i_flags & I_NO_REMOVE))
 		error = -EBUSY;
 	else {
 		error = security_inode_unlink(dir, dentry);
@@ -4653,7 +4654,8 @@ int vfs_rename(struct renamedata *rd)
 		goto out;
 
 	error = -EBUSY;
-	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
+	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry) ||
+	    (old_dentry->d_inode->i_flags & I_NO_REMOVE))
 		goto out;
 
 	if (max_links && new_dir != old_dir) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f461883d66a8..a273d5cde731 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2332,6 +2332,9 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
  * I_SYNC_QUEUED	Inode is queued in b_io or b_more_io writeback lists.
  *			Used to detect that mark_inode_dirty() should not move
  * 			inode between dirty lists.
+ * I_NO_REMOVE		Unlink, rmdir and rename are not allowed to remove the
+ *			object or any of its hard links.
+ *
  *
  * I_PINNING_FSCACHE_WB	Inode is pinning an fscache object for writeback.
  *
@@ -2358,6 +2361,7 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
 #define I_DONTCACHE		(1 << 16)
 #define I_SYNC_QUEUED		(1 << 17)
 #define I_PINNING_FSCACHE_WB	(1 << 18)
+#define I_NO_REMOVE		(1 << 19)
 
 #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
 #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)



  parent reply	other threads:[~2022-01-31 15:14 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-31 15:12 [RFC][PATCH 0/5] vfs, overlayfs, cachefiles: Combine I_OVL_INUSE and S_KERNEL_FILE and split out no-remove David Howells
2022-01-31 15:12 ` [PATCH 1/5] vfs, overlayfs, cachefiles: Turn I_OVL_INUSE into something generic David Howells
2022-01-31 15:29   ` Amir Goldstein
2022-01-31 15:32   ` David Howells
2022-01-31 16:20     ` Amir Goldstein
2022-01-31 15:13 ` [PATCH 2/5] vfs: Add tracepoints for inode_excl_inuse_trylock/unlock David Howells
2022-01-31 15:32   ` Amir Goldstein
2022-01-31 15:13 ` David Howells [this message]
2022-01-31 15:14 ` [PATCH 4/5] cachefiles: Use I_EXCL_INUSE instead of S_KERNEL_FILE David Howells
2022-01-31 15:14 ` [PATCH 5/5] cachefiles: Remove the now-unused mark-inode-in-use tracepoints David Howells
2022-01-31 16:28 ` [RFC][PATCH 0/5] vfs, overlayfs, cachefiles: Combine I_OVL_INUSE and S_KERNEL_FILE and split out no-remove Amir Goldstein
2022-02-01 10:44 ` [PATCH 2/5] vfs: Add tracepoints for inode_excl_inuse_trylock/unlock David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=164364203196.1476539.12557909603374068848.stgit@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=hch@infradead.org \
    --cc=jlayton@kernel.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).