linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] vfs: don't unnecessarily clone write access for writable fds
@ 2020-06-11  1:49 Eric Biggers
  2020-06-11  2:30 ` Eric Biggers
  0 siblings, 1 reply; 2+ messages in thread
From: Eric Biggers @ 2020-06-11  1:49 UTC (permalink / raw)
  To: linux-fsdevel, Alexander Viro; +Cc: Daeho Jeong, linux-kernel

From: Eric Biggers <ebiggers@google.com>

There's no need for mnt_want_write_file() to clone a write reference to
the mount when the file is already open for writing, provided that
mnt_drop_write_file() is changed to conditionally drop the reference.

We seem to have ended up in the current situation because
mnt_want_write_file() used to be paired with mnt_drop_write(), due to
mnt_drop_write_file() not having been added yet.  So originally
mnt_want_write_file() did have to always take a reference.

But later mnt_drop_write_file() was added, and all callers of
mnt_want_write_file() were paired with it.  This makes the compatibility
between mnt_want_write_file() and mnt_drop_write() no longer necessary.

Therefore, make __mnt_want_write_file() and __mnt_drop_write_file() be
no-ops on files already open for writing.  This removes the only caller
of mnt_clone_write(), so remove that too.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/namespace.c        | 43 ++++++++++---------------------------------
 include/linux/mount.h |  1 -
 2 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 7cd64240916573..7e78c7ae4ab34d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -359,51 +359,27 @@ int mnt_want_write(struct vfsmount *m)
 }
 EXPORT_SYMBOL_GPL(mnt_want_write);
 
-/**
- * mnt_clone_write - get write access to a mount
- * @mnt: the mount on which to take a write
- *
- * This is effectively like mnt_want_write, except
- * it must only be used to take an extra write reference
- * on a mountpoint that we already know has a write reference
- * on it. This allows some optimisation.
- *
- * After finished, mnt_drop_write must be called as usual to
- * drop the reference.
- */
-int mnt_clone_write(struct vfsmount *mnt)
-{
-	/* superblock may be r/o */
-	if (__mnt_is_readonly(mnt))
-		return -EROFS;
-	preempt_disable();
-	mnt_inc_writers(real_mount(mnt));
-	preempt_enable();
-	return 0;
-}
-EXPORT_SYMBOL_GPL(mnt_clone_write);
-
 /**
  * __mnt_want_write_file - get write access to a file's mount
  * @file: the file who's mount on which to take a write
  *
- * This is like __mnt_want_write, but it takes a file and can
- * do some optimisations if the file is open for write already
+ * This is like __mnt_want_write, but it does nothing if the file is already
+ * open for writing.  This must be paired with __mnt_drop_write_file.
  */
 int __mnt_want_write_file(struct file *file)
 {
-	if (!(file->f_mode & FMODE_WRITER))
-		return __mnt_want_write(file->f_path.mnt);
-	else
-		return mnt_clone_write(file->f_path.mnt);
+	if (file->f_mode & FMODE_WRITER)
+		return 0;
+	return __mnt_want_write(file->f_path.mnt);
 }
 
 /**
  * mnt_want_write_file - get write access to a file's mount
  * @file: the file who's mount on which to take a write
  *
- * This is like mnt_want_write, but it takes a file and can
- * do some optimisations if the file is open for write already
+ * This is like mnt_want_write, but it skips getting write access to the mount
+ * if the file is already open for writing.  The freeze protection is still
+ * done.  This must be paired with mnt_drop_write_file.
  */
 int mnt_want_write_file(struct file *file)
 {
@@ -449,7 +425,8 @@ EXPORT_SYMBOL_GPL(mnt_drop_write);
 
 void __mnt_drop_write_file(struct file *file)
 {
-	__mnt_drop_write(file->f_path.mnt);
+	if (!(file->f_mode & FMODE_WRITER))
+		__mnt_drop_write(file->f_path.mnt);
 }
 
 void mnt_drop_write_file(struct file *file)
diff --git a/include/linux/mount.h b/include/linux/mount.h
index de657bd211fa64..29d216f927c28c 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -78,7 +78,6 @@ struct path;
 
 extern int mnt_want_write(struct vfsmount *mnt);
 extern int mnt_want_write_file(struct file *file);
-extern int mnt_clone_write(struct vfsmount *mnt);
 extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mnt_drop_write_file(struct file *file);
 extern void mntput(struct vfsmount *mnt);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] vfs: don't unnecessarily clone write access for writable fds
  2020-06-11  1:49 [PATCH] vfs: don't unnecessarily clone write access for writable fds Eric Biggers
@ 2020-06-11  2:30 ` Eric Biggers
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Biggers @ 2020-06-11  2:30 UTC (permalink / raw)
  To: linux-fsdevel, Alexander Viro; +Cc: Daeho Jeong, linux-kernel

On Wed, Jun 10, 2020 at 06:49:45PM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> There's no need for mnt_want_write_file() to clone a write reference to
> the mount when the file is already open for writing, provided that
> mnt_drop_write_file() is changed to conditionally drop the reference.
> 
> We seem to have ended up in the current situation because
> mnt_want_write_file() used to be paired with mnt_drop_write(), due to
> mnt_drop_write_file() not having been added yet.  So originally
> mnt_want_write_file() did have to always take a reference.
> 
> But later mnt_drop_write_file() was added, and all callers of
> mnt_want_write_file() were paired with it.  This makes the compatibility
> between mnt_want_write_file() and mnt_drop_write() no longer necessary.
> 
> Therefore, make __mnt_want_write_file() and __mnt_drop_write_file() be
> no-ops on files already open for writing.  This removes the only caller
> of mnt_clone_write(), so remove that too.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/namespace.c        | 43 ++++++++++---------------------------------
>  include/linux/mount.h |  1 -
>  2 files changed, 10 insertions(+), 34 deletions(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 7cd64240916573..7e78c7ae4ab34d 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -359,51 +359,27 @@ int mnt_want_write(struct vfsmount *m)
>  }
>  EXPORT_SYMBOL_GPL(mnt_want_write);
>  
> -/**
> - * mnt_clone_write - get write access to a mount
> - * @mnt: the mount on which to take a write
> - *
> - * This is effectively like mnt_want_write, except
> - * it must only be used to take an extra write reference
> - * on a mountpoint that we already know has a write reference
> - * on it. This allows some optimisation.
> - *
> - * After finished, mnt_drop_write must be called as usual to
> - * drop the reference.
> - */
> -int mnt_clone_write(struct vfsmount *mnt)
> -{
> -	/* superblock may be r/o */
> -	if (__mnt_is_readonly(mnt))
> -		return -EROFS;
> -	preempt_disable();
> -	mnt_inc_writers(real_mount(mnt));
> -	preempt_enable();
> -	return 0;
> -}
> -EXPORT_SYMBOL_GPL(mnt_clone_write);

Sorry, I think I missed something -- the __mnt_is_readonly() check should be
kept because there are cases where SB_RDONLY can be set when there are writable
file descriptors.  For example, ext4 with errors=remount-ro.

Interestingly though, sys_write() skips that check because it uses
file_start_write() which only does the freeze protection.

- Eric

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-06-11  2:30 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-11  1:49 [PATCH] vfs: don't unnecessarily clone write access for writable fds Eric Biggers
2020-06-11  2:30 ` Eric Biggers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).