All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] ovl: lazy copy up of data on first write
@ 2019-01-18 13:45 Amir Goldstein
  2019-01-18 14:43 ` Vivek Goyal
  0 siblings, 1 reply; 7+ messages in thread
From: Amir Goldstein @ 2019-01-18 13:45 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Vivek Goyal, linux-unionfs

When metacopy feature is enabled, copy up only metadata when
opening a file O_RDWR and defer copy up of data until the first
write operation.

On read operations of lazy copy up file, the lower file is opened
O_RDONLY on every vfs call until the first write operation.

XXX: mmap() of a lazy opened file will map the meta inode, so this
still needs to be fixed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Hi Miklos,

Here's a draft for lazy copy up of data.
Sending it out for your initial feedback.

When opening file for write, we always cache a real file with upper
inode, but that inode may be a meta copy.

This leaves ovl_mmap() doing the wrong thing (mapping the meta copy).

One way do deal with this would be to implement a_ops as you suggested
and copy up data on page write fault.

Do you have another idea how to make use of this RFC patch without
implementing a_aops? Or maybe there is no reason to bother?

Perhaps we could install ovl_a_aop only for files open for write
as first stage? I think that would be easy enough for me to do.
I still don't have the full picture in my head for dealing with
LOWER->LOWER_SHARED transition.

This RFC patch BTW passes unionmount and the overlay/quick xfstests
with metacopy=on/off.

The one test that does fail is the metacopy test overlay/060 because
it fails the expectation to find a data copied up file after open for
write. Changing the test as follows makes it pass:

        # Trigger data copy up and check absence of metacopy xattr.
        mount_overlay $_lowerdir
-       $XFS_IO_PROG -c "open -a $SCRATCH_MNT/$_target"
+       $XFS_IO_PROG -c "falloc 0 $_size" $SCRATCH_MNT/$_target
        echo "check properties of data copied up file"
        check_file_size_contents $SCRATCH_MNT/$_target $_size "$_data"

Thoughts?

Thanks,
Amir.


 fs/overlayfs/copy_up.c   |  5 +---
 fs/overlayfs/file.c      | 63 +++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h | 10 ++++++-
 fs/overlayfs/util.c      |  4 +--
 4 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 48119b23375b..df495f14bce0 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -731,10 +731,7 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
 	if (!S_ISREG(mode))
 		return false;
 
-	if (flags && ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC)))
-		return false;
-
-	return true;
+	return !ovl_open_flags_need_data_copy_up(flags);
 }
 
 /* Copy up data of an inode which was copied up metadata only in the past. */
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 84dd957efa24..e15b16e386f4 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -29,10 +29,15 @@ static struct file *ovl_open_realfile(const struct file *file,
 	struct inode *inode = file_inode(file);
 	struct file *realfile;
 	const struct cred *old_cred;
+	unsigned int flags = file->f_flags | O_NOATIME;
+
+	/* Open lower file O_RDONLY if O_RDWR file data wasn't copied up yet */
+	if (S_ISREG(inode->i_mode) && realinode != ovl_inode_upper(inode))
+		flags &= ~(O_RDWR | O_WRONLY);
 
 	old_cred = ovl_override_creds(inode->i_sb);
-	realfile = open_with_fake_path(&file->f_path, file->f_flags | O_NOATIME,
-				       realinode, current_cred());
+	realfile = open_with_fake_path(&file->f_path, flags, realinode,
+				       current_cred());
 	revert_creds(old_cred);
 
 	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
@@ -80,21 +85,41 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
 	return 0;
 }
 
+/*
+ * If file was opened for write with lazy copy of data, the first operation
+ * to call this function with @rdonly == false will trigger copy up of data.
+ * fdatasync() and fadvise(POSIX_FADV_DONTNEED) are "rdonly" ops in the sense
+ * that they do not dirty pages. If pages are already dirty then file data
+ * has already been copied up and those operations will get the upper file
+ * and write its inode's dirty pages.
+ */
 static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
-			       bool allow_meta)
+			       bool allow_meta, bool rdonly)
 {
+	struct dentry *dentry = file_dentry(file);
 	struct inode *inode = file_inode(file);
 	struct inode *realinode;
+	int err;
 
 	real->flags = 0;
 	real->file = file->private_data;
 
+	if (!rdonly) {
+		/* Maybe copy up data on first write op */
+		err = ovl_open_maybe_copy_up(dentry, file->f_flags);
+		if (err)
+			return err;
+	}
+
 	if (allow_meta)
 		realinode = ovl_inode_real(inode);
 	else
 		realinode = ovl_inode_realdata(inode);
 
-	/* Has it been copied up since we'd opened it? */
+	/*
+	 * Has it been copied up since we'd opened it O_RDONLY?
+	 * or not yet data copied up since we'd opened it O_RDWR?
+	 */
 	if (unlikely(file_inode(real->file) != realinode)) {
 		real->flags = FDPUT_FPUT;
 		real->file = ovl_open_realfile(file, realinode);
@@ -111,23 +136,39 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
 
 static int ovl_real_fdget(const struct file *file, struct fd *real)
 {
-	return ovl_real_fdget_meta(file, real, false);
+	return ovl_real_fdget_meta(file, real, false, false);
+}
+
+static int ovl_real_fdget_rdonly(const struct file *file, struct fd *real)
+{
+	return ovl_real_fdget_meta(file, real, false, true);
 }
 
 static int ovl_open(struct inode *inode, struct file *file)
 {
 	struct dentry *dentry = file_dentry(file);
 	struct file *realfile;
+	int copyup_flags = file->f_flags;
 	int err;
 
-	err = ovl_open_maybe_copy_up(dentry, file->f_flags);
+	/* (O_RDWR | O_WRONLY) signals that we want meta copy up */
+	if ((copyup_flags & O_ACCMODE) == O_RDWR)
+		copyup_flags |= O_WRONLY;
+
+	/* Allow to copy up meta and defer copy up data to first writable op */
+	err = ovl_open_maybe_copy_up(dentry, copyup_flags);
 	if (err)
 		return err;
 
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file, ovl_inode_realdata(inode));
+	/*
+	 * For lazy copy up of data, we cache the upper meta file in
+	 * anticipation for copy up of data on first write. Opening O_RDWR
+	 * and only reading is not going to be efficient in that case.
+	 */
+	realfile = ovl_open_realfile(file, ovl_inode_real(inode));
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
@@ -201,7 +242,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	if (!iov_iter_count(iter))
 		return 0;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget_rdonly(file, &real);
 	if (ret)
 		return ret;
 
@@ -263,7 +304,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget_meta(file, &real, !datasync);
+	ret = ovl_real_fdget_meta(file, &real, !datasync, true);
 	if (ret)
 		return ret;
 
@@ -339,7 +380,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 	const struct cred *old_cred;
 	int ret;
 
-	ret = ovl_real_fdget(file, &real);
+	ret = ovl_real_fdget_rdonly(file, &real);
 	if (ret)
 		return ret;
 
@@ -447,7 +488,7 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
 	if (ret)
 		return ret;
 
-	ret = ovl_real_fdget(file_in, &real_in);
+	ret = ovl_real_fdget_rdonly(file_in, &real_in);
 	if (ret) {
 		fdput(real_out);
 		return ret;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 5e45cb3630a0..d3a7d8936f21 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -195,14 +195,22 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
-static inline bool ovl_open_flags_need_copy_up(int flags)
+static inline bool ovl_open_flags_need_data_copy_up(int flags)
 {
 	if (!flags)
 		return false;
 
+	/* Either O_RDWR or O_WRONLY will trigger data copy up */
 	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
 }
 
+static inline bool ovl_open_flags_need_copy_up(int flags)
+{
+	/* O_RDWR|O_WRONLY together will trigger meta copy up */
+	return (flags & O_ACCMODE) == (O_RDWR | O_WRONLY) ||
+		ovl_open_flags_need_data_copy_up(flags);
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 7c01327b1852..df6d13185f40 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -367,7 +367,7 @@ void ovl_set_upperdata(struct inode *inode)
 /* Caller should hold ovl_inode->lock */
 bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
 {
-	if (!ovl_open_flags_need_copy_up(flags))
+	if (!ovl_open_flags_need_data_copy_up(flags))
 		return false;
 
 	return !ovl_test_flag(OVL_UPPERDATA, d_inode(dentry));
@@ -375,7 +375,7 @@ bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
 
 bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags)
 {
-	if (!ovl_open_flags_need_copy_up(flags))
+	if (!ovl_open_flags_need_data_copy_up(flags))
 		return false;
 
 	return !ovl_has_upperdata(d_inode(dentry));
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 13:45 [RFC][PATCH] ovl: lazy copy up of data on first write Amir Goldstein
@ 2019-01-18 14:43 ` Vivek Goyal
  2019-01-18 15:33   ` Amir Goldstein
  0 siblings, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2019-01-18 14:43 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, linux-unionfs

On Fri, Jan 18, 2019 at 03:45:19PM +0200, Amir Goldstein wrote:
> When metacopy feature is enabled, copy up only metadata when
> opening a file O_RDWR and defer copy up of data until the first
> write operation.
> 

Amir,

What's the primary use case of lazy data copy up. Are there users who
open file O_RDWR but never write to it. Or we want to transfer latency
of copy up from open to write.

Should this behavior be an opt in with another mount option (and not
be enabled automatically with metacopy=on).

Thanks
Vivek

> On read operations of lazy copy up file, the lower file is opened
> O_RDONLY on every vfs call until the first write operation.
> 
> XXX: mmap() of a lazy opened file will map the meta inode, so this
> still needs to be fixed.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Hi Miklos,
> 
> Here's a draft for lazy copy up of data.
> Sending it out for your initial feedback.
> 
> When opening file for write, we always cache a real file with upper
> inode, but that inode may be a meta copy.
> 
> This leaves ovl_mmap() doing the wrong thing (mapping the meta copy).
> 
> One way do deal with this would be to implement a_ops as you suggested
> and copy up data on page write fault.
> 
> Do you have another idea how to make use of this RFC patch without
> implementing a_aops? Or maybe there is no reason to bother?
> 
> Perhaps we could install ovl_a_aop only for files open for write
> as first stage? I think that would be easy enough for me to do.
> I still don't have the full picture in my head for dealing with
> LOWER->LOWER_SHARED transition.
> 
> This RFC patch BTW passes unionmount and the overlay/quick xfstests
> with metacopy=on/off.
> 
> The one test that does fail is the metacopy test overlay/060 because
> it fails the expectation to find a data copied up file after open for
> write. Changing the test as follows makes it pass:
> 
>         # Trigger data copy up and check absence of metacopy xattr.
>         mount_overlay $_lowerdir
> -       $XFS_IO_PROG -c "open -a $SCRATCH_MNT/$_target"
> +       $XFS_IO_PROG -c "falloc 0 $_size" $SCRATCH_MNT/$_target
>         echo "check properties of data copied up file"
>         check_file_size_contents $SCRATCH_MNT/$_target $_size "$_data"
> 
> Thoughts?
> 
> Thanks,
> Amir.
> 
> 
>  fs/overlayfs/copy_up.c   |  5 +---
>  fs/overlayfs/file.c      | 63 +++++++++++++++++++++++++++++++++-------
>  fs/overlayfs/overlayfs.h | 10 ++++++-
>  fs/overlayfs/util.c      |  4 +--
>  4 files changed, 64 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 48119b23375b..df495f14bce0 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -731,10 +731,7 @@ static bool ovl_need_meta_copy_up(struct dentry *dentry, umode_t mode,
>  	if (!S_ISREG(mode))
>  		return false;
>  
> -	if (flags && ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC)))
> -		return false;
> -
> -	return true;
> +	return !ovl_open_flags_need_data_copy_up(flags);
>  }
>  
>  /* Copy up data of an inode which was copied up metadata only in the past. */
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 84dd957efa24..e15b16e386f4 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -29,10 +29,15 @@ static struct file *ovl_open_realfile(const struct file *file,
>  	struct inode *inode = file_inode(file);
>  	struct file *realfile;
>  	const struct cred *old_cred;
> +	unsigned int flags = file->f_flags | O_NOATIME;
> +
> +	/* Open lower file O_RDONLY if O_RDWR file data wasn't copied up yet */
> +	if (S_ISREG(inode->i_mode) && realinode != ovl_inode_upper(inode))
> +		flags &= ~(O_RDWR | O_WRONLY);
>  
>  	old_cred = ovl_override_creds(inode->i_sb);
> -	realfile = open_with_fake_path(&file->f_path, file->f_flags | O_NOATIME,
> -				       realinode, current_cred());
> +	realfile = open_with_fake_path(&file->f_path, flags, realinode,
> +				       current_cred());
>  	revert_creds(old_cred);
>  
>  	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
> @@ -80,21 +85,41 @@ static int ovl_change_flags(struct file *file, unsigned int flags)
>  	return 0;
>  }
>  
> +/*
> + * If file was opened for write with lazy copy of data, the first operation
> + * to call this function with @rdonly == false will trigger copy up of data.
> + * fdatasync() and fadvise(POSIX_FADV_DONTNEED) are "rdonly" ops in the sense
> + * that they do not dirty pages. If pages are already dirty then file data
> + * has already been copied up and those operations will get the upper file
> + * and write its inode's dirty pages.
> + */
>  static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
> -			       bool allow_meta)
> +			       bool allow_meta, bool rdonly)
>  {
> +	struct dentry *dentry = file_dentry(file);
>  	struct inode *inode = file_inode(file);
>  	struct inode *realinode;
> +	int err;
>  
>  	real->flags = 0;
>  	real->file = file->private_data;
>  
> +	if (!rdonly) {
> +		/* Maybe copy up data on first write op */
> +		err = ovl_open_maybe_copy_up(dentry, file->f_flags);
> +		if (err)
> +			return err;
> +	}
> +
>  	if (allow_meta)
>  		realinode = ovl_inode_real(inode);
>  	else
>  		realinode = ovl_inode_realdata(inode);
>  
> -	/* Has it been copied up since we'd opened it? */
> +	/*
> +	 * Has it been copied up since we'd opened it O_RDONLY?
> +	 * or not yet data copied up since we'd opened it O_RDWR?
> +	 */
>  	if (unlikely(file_inode(real->file) != realinode)) {
>  		real->flags = FDPUT_FPUT;
>  		real->file = ovl_open_realfile(file, realinode);
> @@ -111,23 +136,39 @@ static int ovl_real_fdget_meta(const struct file *file, struct fd *real,
>  
>  static int ovl_real_fdget(const struct file *file, struct fd *real)
>  {
> -	return ovl_real_fdget_meta(file, real, false);
> +	return ovl_real_fdget_meta(file, real, false, false);
> +}
> +
> +static int ovl_real_fdget_rdonly(const struct file *file, struct fd *real)
> +{
> +	return ovl_real_fdget_meta(file, real, false, true);
>  }
>  
>  static int ovl_open(struct inode *inode, struct file *file)
>  {
>  	struct dentry *dentry = file_dentry(file);
>  	struct file *realfile;
> +	int copyup_flags = file->f_flags;
>  	int err;
>  
> -	err = ovl_open_maybe_copy_up(dentry, file->f_flags);
> +	/* (O_RDWR | O_WRONLY) signals that we want meta copy up */
> +	if ((copyup_flags & O_ACCMODE) == O_RDWR)
> +		copyup_flags |= O_WRONLY;
> +
> +	/* Allow to copy up meta and defer copy up data to first writable op */
> +	err = ovl_open_maybe_copy_up(dentry, copyup_flags);
>  	if (err)
>  		return err;
>  
>  	/* No longer need these flags, so don't pass them on to underlying fs */
>  	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>  
> -	realfile = ovl_open_realfile(file, ovl_inode_realdata(inode));
> +	/*
> +	 * For lazy copy up of data, we cache the upper meta file in
> +	 * anticipation for copy up of data on first write. Opening O_RDWR
> +	 * and only reading is not going to be efficient in that case.
> +	 */
> +	realfile = ovl_open_realfile(file, ovl_inode_real(inode));
>  	if (IS_ERR(realfile))
>  		return PTR_ERR(realfile);
>  
> @@ -201,7 +242,7 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>  	if (!iov_iter_count(iter))
>  		return 0;
>  
> -	ret = ovl_real_fdget(file, &real);
> +	ret = ovl_real_fdget_rdonly(file, &real);
>  	if (ret)
>  		return ret;
>  
> @@ -263,7 +304,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
>  	const struct cred *old_cred;
>  	int ret;
>  
> -	ret = ovl_real_fdget_meta(file, &real, !datasync);
> +	ret = ovl_real_fdget_meta(file, &real, !datasync, true);
>  	if (ret)
>  		return ret;
>  
> @@ -339,7 +380,7 @@ static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
>  	const struct cred *old_cred;
>  	int ret;
>  
> -	ret = ovl_real_fdget(file, &real);
> +	ret = ovl_real_fdget_rdonly(file, &real);
>  	if (ret)
>  		return ret;
>  
> @@ -447,7 +488,7 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
>  	if (ret)
>  		return ret;
>  
> -	ret = ovl_real_fdget(file_in, &real_in);
> +	ret = ovl_real_fdget_rdonly(file_in, &real_in);
>  	if (ret) {
>  		fdput(real_out);
>  		return ret;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 5e45cb3630a0..d3a7d8936f21 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -195,14 +195,22 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
>  	return ret;
>  }
>  
> -static inline bool ovl_open_flags_need_copy_up(int flags)
> +static inline bool ovl_open_flags_need_data_copy_up(int flags)
>  {
>  	if (!flags)
>  		return false;
>  
> +	/* Either O_RDWR or O_WRONLY will trigger data copy up */
>  	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
>  }
>  
> +static inline bool ovl_open_flags_need_copy_up(int flags)
> +{
> +	/* O_RDWR|O_WRONLY together will trigger meta copy up */
> +	return (flags & O_ACCMODE) == (O_RDWR | O_WRONLY) ||
> +		ovl_open_flags_need_data_copy_up(flags);
> +}
> +
>  /* util.c */
>  int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 7c01327b1852..df6d13185f40 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -367,7 +367,7 @@ void ovl_set_upperdata(struct inode *inode)
>  /* Caller should hold ovl_inode->lock */
>  bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
>  {
> -	if (!ovl_open_flags_need_copy_up(flags))
> +	if (!ovl_open_flags_need_data_copy_up(flags))
>  		return false;
>  
>  	return !ovl_test_flag(OVL_UPPERDATA, d_inode(dentry));
> @@ -375,7 +375,7 @@ bool ovl_dentry_needs_data_copy_up_locked(struct dentry *dentry, int flags)
>  
>  bool ovl_dentry_needs_data_copy_up(struct dentry *dentry, int flags)
>  {
> -	if (!ovl_open_flags_need_copy_up(flags))
> +	if (!ovl_open_flags_need_data_copy_up(flags))
>  		return false;
>  
>  	return !ovl_has_upperdata(d_inode(dentry));
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 14:43 ` Vivek Goyal
@ 2019-01-18 15:33   ` Amir Goldstein
  2019-01-18 18:34     ` Vivek Goyal
  2019-01-18 19:52     ` Vivek Goyal
  0 siblings, 2 replies; 7+ messages in thread
From: Amir Goldstein @ 2019-01-18 15:33 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs, Chengguang Xu

On Fri, Jan 18, 2019 at 4:43 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Fri, Jan 18, 2019 at 03:45:19PM +0200, Amir Goldstein wrote:
> > When metacopy feature is enabled, copy up only metadata when
> > opening a file O_RDWR and defer copy up of data until the first
> > write operation.
> >
>
> Amir,
>
> What's the primary use case of lazy data copy up. Are there users who
> open file O_RDWR but never write to it.

I don't know of all the cases, but AFAIK, MSOffice apps (over network share
and maybe also LibreOffice) open the document file O_RDWR just to check
if editing is allowed, but but they re-open the file read-only, because the
document file is never written to. A temp file is written and moved over it.
So copy up of an MSOffice document file data is never beneficial.

> Or we want to transfer latency of copy up from open to write.

Indeed. Chengguang reported that in their use case, the latency at open
time is an issue.

>
> Should this behavior be an opt in with another mount option (and not
> be enabled automatically with metacopy=on).
>

I can't think of one good reason for user to opt-in for copy up data on open.
Or on a use case where latency on open is desired over latency on write.
Can you?

There is no contract with metacopy=on that states when a file is
copied up as meta and when it is copied up as data.
If there was a contract, it would probably state that only data modification
operations SHOULD copy up data and open is not a data modification
operation (it was just the last opportunity to do anything before stacked
file ops).

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 15:33   ` Amir Goldstein
@ 2019-01-18 18:34     ` Vivek Goyal
  2019-01-18 19:57       ` Amir Goldstein
  2019-01-18 19:52     ` Vivek Goyal
  1 sibling, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2019-01-18 18:34 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, overlayfs, Chengguang Xu

On Fri, Jan 18, 2019 at 05:33:36PM +0200, Amir Goldstein wrote:
> On Fri, Jan 18, 2019 at 4:43 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > On Fri, Jan 18, 2019 at 03:45:19PM +0200, Amir Goldstein wrote:
> > > When metacopy feature is enabled, copy up only metadata when
> > > opening a file O_RDWR and defer copy up of data until the first
> > > write operation.
> > >
> >
> > Amir,
> >
> > What's the primary use case of lazy data copy up. Are there users who
> > open file O_RDWR but never write to it.
> 
> I don't know of all the cases, but AFAIK, MSOffice apps (over network share
> and maybe also LibreOffice) open the document file O_RDWR just to check
> if editing is allowed, but but they re-open the file read-only, because the
> document file is never written to. A temp file is written and moved over it.
> So copy up of an MSOffice document file data is never beneficial.
> 
> > Or we want to transfer latency of copy up from open to write.
> 
> Indeed. Chengguang reported that in their use case, the latency at open
> time is an issue.
> 
> >
> > Should this behavior be an opt in with another mount option (and not
> > be enabled automatically with metacopy=on).
> >
> 
> I can't think of one good reason for user to opt-in for copy up data on open.
> Or on a use case where latency on open is desired over latency on write.
> Can you?

I can't. I am wondering why are we changing default. Why is it a better
default to do copy up on first WRITE and not during open time. In his
emails, Chengguang, just said their applications will benefit from it
without giving further details. So if their application is not
writing, they could easily solve this problem by opening file
O_RDONLY as well?

Or is it the case that they don't know in advance whether file will
actually be written or not. So they open O_RDWR and after
that it could be written or not written?

Thanks
Vivek

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 15:33   ` Amir Goldstein
  2019-01-18 18:34     ` Vivek Goyal
@ 2019-01-18 19:52     ` Vivek Goyal
  2019-01-18 20:12       ` Amir Goldstein
  1 sibling, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2019-01-18 19:52 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, overlayfs, Chengguang Xu

On Fri, Jan 18, 2019 at 05:33:36PM +0200, Amir Goldstein wrote:
> On Fri, Jan 18, 2019 at 4:43 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > On Fri, Jan 18, 2019 at 03:45:19PM +0200, Amir Goldstein wrote:
> > > When metacopy feature is enabled, copy up only metadata when
> > > opening a file O_RDWR and defer copy up of data until the first
> > > write operation.
> > >
> >
> > Amir,
> >
> > What's the primary use case of lazy data copy up. Are there users who
> > open file O_RDWR but never write to it.
> 
> I don't know of all the cases, but AFAIK, MSOffice apps (over network share
> and maybe also LibreOffice) open the document file O_RDWR just to check
> if editing is allowed, but but they re-open the file read-only, because the
> document file is never written to. A temp file is written and moved over it.
> So copy up of an MSOffice document file data is never beneficial.
> 
> > Or we want to transfer latency of copy up from open to write.
> 
> Indeed. Chengguang reported that in their use case, the latency at open
> time is an issue.
> 
> >
> > Should this behavior be an opt in with another mount option (and not
> > be enabled automatically with metacopy=on).
> >
> 
> I can't think of one good reason for user to opt-in for copy up data on open.
> Or on a use case where latency on open is desired over latency on write.
> Can you?

IIUC, now if I open a file O_RDWR and only issue reads to file and never
issue writes, then for every read, I will open a lower file, finish read
and close fd. This is slower path. What's the performance penalty? If
this is significant, then it might make sense to opt-in for lazy copy
up behavior (instead of relying that application will open file O_RDONLY
if they never issue writes to it).

Thanks
Vivek

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 18:34     ` Vivek Goyal
@ 2019-01-18 19:57       ` Amir Goldstein
  0 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2019-01-18 19:57 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs, Chengguang Xu

On Fri, Jan 18, 2019 at 8:34 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Fri, Jan 18, 2019 at 05:33:36PM +0200, Amir Goldstein wrote:
> > On Fri, Jan 18, 2019 at 4:43 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > On Fri, Jan 18, 2019 at 03:45:19PM +0200, Amir Goldstein wrote:
> > > > When metacopy feature is enabled, copy up only metadata when
> > > > opening a file O_RDWR and defer copy up of data until the first
> > > > write operation.
> > > >
> > >
> > > Amir,
> > >
> > > What's the primary use case of lazy data copy up. Are there users who
> > > open file O_RDWR but never write to it.
> >
> > I don't know of all the cases, but AFAIK, MSOffice apps (over network share
> > and maybe also LibreOffice) open the document file O_RDWR just to check
> > if editing is allowed, but but they re-open the file read-only, because the
> > document file is never written to. A temp file is written and moved over it.
> > So copy up of an MSOffice document file data is never beneficial.
> >
> > > Or we want to transfer latency of copy up from open to write.
> >
> > Indeed. Chengguang reported that in their use case, the latency at open
> > time is an issue.
> >
> > >
> > > Should this behavior be an opt in with another mount option (and not
> > > be enabled automatically with metacopy=on).
> > >
> >
> > I can't think of one good reason for user to opt-in for copy up data on open.
> > Or on a use case where latency on open is desired over latency on write.
> > Can you?
>
> I can't. I am wondering why are we changing default. Why is it a better
> default to do copy up on first WRITE and not during open time. In his

Oh there are very good reasons to prefer latency on WRITE over latency
on open.

One reason is that traditionally, open is not an operation where long latency
are expected, so apps may behave badly with long latency on open.

Another reason, if application is latency sensitive, it can use aio,
but there is no async open, or is there??

> emails, Chengguang, just said their applications will benefit from it
> without giving further details. So if their application is not
> writing, they could easily solve this problem by opening file
> O_RDONLY as well?

My understanding is that latency on open() is an issue and latency on
first write is not an issue, which is something I am familiar with from
other apps, but I will let Chengguang elaborate.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] ovl: lazy copy up of data on first write
  2019-01-18 19:52     ` Vivek Goyal
@ 2019-01-18 20:12       ` Amir Goldstein
  0 siblings, 0 replies; 7+ messages in thread
From: Amir Goldstein @ 2019-01-18 20:12 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: Miklos Szeredi, overlayfs, Chengguang Xu

> > >
> > > Should this behavior be an opt in with another mount option (and not
> > > be enabled automatically with metacopy=on).
> > >
> >
> > I can't think of one good reason for user to opt-in for copy up data on open.
> > Or on a use case where latency on open is desired over latency on write.
> > Can you?
>
> IIUC, now if I open a file O_RDWR and only issue reads to file and never
> issue writes, then for every read, I will open a lower file, finish read
> and close fd. This is slower path. What's the performance penalty? If

That is correct, but as you rightfully noted, that is probably a corner case
and not the common behavior.
It is not something that we cannot fix if we find that it is important.
We can choose to cache the lower file and we can also choose to cache
both upper and lower real files.

> this is significant, then it might make sense to opt-in for lazy copy
> up behavior (instead of relying that application will open file O_RDONLY
> if they never issue writes to it).
>

Maybe. My instinct is that this case where complication the user with
more configurations is not worth it and we can find a good enough
solution for all cases.

Let's see what Miklos has to say.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-01-18 20:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-18 13:45 [RFC][PATCH] ovl: lazy copy up of data on first write Amir Goldstein
2019-01-18 14:43 ` Vivek Goyal
2019-01-18 15:33   ` Amir Goldstein
2019-01-18 18:34     ` Vivek Goyal
2019-01-18 19:57       ` Amir Goldstein
2019-01-18 19:52     ` Vivek Goyal
2019-01-18 20:12       ` Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.