* [PATCH v1 0/5] Enable holes on copy_file_range()
@ 2018-05-08 21:24 Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-08 21:24 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, smfrench, linux-unionfs, david
If copy_file_range() performs a copy using splice, it converts holes
to zeros. This series changes that behavior to create holes in the
destination whenever possible.
Even though copy_file_range() and clone_file_range() do not work across
different mounted filesystems, we should still be able to splice files
that do not belong to the same super_block.
Changes since v0:
- [Amir] Carved out do_copy_file_range() which can be used by overlayfs
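For readers who want to see the hole-preserving idea outside the kernel, the following is a minimal userspace sketch and is not part of this series. It assumes Linux, an lseek() that supports SEEK_DATA/SEEK_HOLE, and a newly created, empty destination file; copy_sparse() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy bytes [0, len) from fd_in to fd_out and leave
 * holes in the source as holes in the destination. Assumes fd_out refers
 * to a new, empty file.
 */
static int copy_sparse(int fd_in, int fd_out, off_t len)
{
	static char buf[65536];
	off_t pos = 0;

	while (pos < len) {
		/* Find the next data segment at or after pos. */
		off_t data = lseek(fd_in, pos, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)	/* nothing but a hole up to EOF */
				break;
			return -1;
		}
		if (data >= len)
			break;

		/* Find where that data segment ends. */
		off_t hole = lseek(fd_in, data, SEEK_HOLE);
		if (hole < 0)
			return -1;
		if (hole > len)
			hole = len;

		/* Copy only the data; the skipped range is never written. */
		for (pos = data; pos < hole; ) {
			size_t want = (size_t)(hole - pos);
			if (want > sizeof(buf))
				want = sizeof(buf);
			ssize_t n = pread(fd_in, buf, want, pos);
			if (n <= 0)
				return -1;	/* I/O error or unexpected EOF */
			ssize_t w = pwrite(fd_out, buf, (size_t)n, pos);
			if (w <= 0)
				return -1;
			pos += w;
		}
	}

	/* Extend the destination so that a trailing hole is preserved too. */
	return ftruncate(fd_out, len);
}
```

On a filesystem without real SEEK_DATA support, lseek() typically reports the whole file as data, so the sketch degrades to a plain copy rather than failing.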
--
Goldwyn
* [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range
2018-05-08 21:24 [PATCH v1 0/5] Enable holes on copy_file_range() Goldwyn Rodrigues
@ 2018-05-08 21:24 ` Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same Goldwyn Rodrigues
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-08 21:24 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
Preparatory patch to carve out do_copy_file_range() from
vfs_copy_file_range().
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/read_write.c | 67 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 38 insertions(+), 29 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index c4eabbfc90df..7d9dfb62ba7d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1541,6 +1541,42 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
}
#endif
+ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ size_t len, unsigned int flags,
+ unsigned int splice_flags)
+{
+ ssize_t ret = 0;
+
+ if (flags != 0)
+ return -EINVAL;
+
+ if (len == 0)
+ return 0;
+
+ /*
+ * Try cloning first, this is supported by more file systems, and
+ * more efficient if both clone and copy are supported (e.g. NFS).
+ */
+ if (file_in->f_op->clone_file_range) {
+ ret = file_in->f_op->clone_file_range(file_in, pos_in,
+ file_out, pos_out, len);
+ if (ret == 0)
+ return len;
+ }
+
+ if (file_out->f_op->copy_file_range) {
+ ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
+ pos_out, len, flags);
+ if (ret != -EOPNOTSUPP)
+ return ret;
+ }
+
+ ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+ len > MAX_RW_COUNT ? MAX_RW_COUNT : len, splice_flags);
+ return ret;
+}
+
/*
* copy_file_range() differs from regular file read and write in that it
* specifically allows return partial success. When it does so is up to
@@ -1554,9 +1590,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
struct inode *inode_out = file_inode(file_out);
ssize_t ret;
- if (flags != 0)
- return -EINVAL;
-
if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
return -EISDIR;
if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
@@ -1579,35 +1612,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (inode_in->i_sb != inode_out->i_sb)
return -EXDEV;
- if (len == 0)
- return 0;
-
file_start_write(file_out);
- /*
- * Try cloning first, this is supported by more file systems, and
- * more efficient if both clone and copy are supported (e.g. NFS).
- */
- if (file_in->f_op->clone_file_range) {
- ret = file_in->f_op->clone_file_range(file_in, pos_in,
- file_out, pos_out, len);
- if (ret == 0) {
- ret = len;
- goto done;
- }
- }
-
- if (file_out->f_op->copy_file_range) {
- ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
- pos_out, len, flags);
- if (ret != -EOPNOTSUPP)
- goto done;
- }
-
- ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
- len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+ ret = do_copy_file_range(file_in, pos_in,
+ file_out, pos_out, len, flags, 0);
-done:
if (ret > 0) {
fsnotify_access(file_in);
add_rchar(current, ret);
--
2.16.3
* [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same
2018-05-08 21:24 [PATCH v1 0/5] Enable holes on copy_file_range() Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
@ 2018-05-08 21:24 ` Goldwyn Rodrigues
2018-05-08 21:57 ` Florian Weimer
2018-05-09 5:44 ` Amir Goldstein
2018-05-08 21:24 ` [PATCH 3/4] copy_file_range: splice with holes Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data() Goldwyn Rodrigues
3 siblings, 2 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-08 21:24 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
While performing copy_file_range(), if superblocks of file_in and
file_out don't match, instead of returning -EXDEV, perform
splice for a faster copy.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/read_write.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 7d9dfb62ba7d..2c9e7a5ea806 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1546,6 +1546,8 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
size_t len, unsigned int flags,
unsigned int splice_flags)
{
+ struct inode *inode_in = file_inode(file_in);
+ struct inode *inode_out = file_inode(file_out);
ssize_t ret = 0;
if (flags != 0)
@@ -1554,6 +1556,9 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
if (len == 0)
return 0;
+ if (inode_in->i_sb != inode_out->i_sb)
+ goto splice;
+
/*
* Try cloning first, this is supported by more file systems, and
* more efficient if both clone and copy are supported (e.g. NFS).
@@ -1571,7 +1576,7 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
if (ret != -EOPNOTSUPP)
return ret;
}
-
+splice:
ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
len > MAX_RW_COUNT ? MAX_RW_COUNT : len, splice_flags);
return ret;
@@ -1608,10 +1613,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
(file_out->f_flags & O_APPEND))
return -EBADF;
- /* this could be relaxed once a method supports cross-fs copies */
- if (inode_in->i_sb != inode_out->i_sb)
- return -EXDEV;
-
file_start_write(file_out);
ret = do_copy_file_range(file_in, pos_in,
--
2.16.3
* [PATCH 3/4] copy_file_range: splice with holes
2018-05-08 21:24 [PATCH v1 0/5] Enable holes on copy_file_range() Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same Goldwyn Rodrigues
@ 2018-05-08 21:24 ` Goldwyn Rodrigues
2018-05-08 21:59 ` Florian Weimer
2018-05-09 5:47 ` Amir Goldstein
2018-05-08 21:24 ` [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data() Goldwyn Rodrigues
3 siblings, 2 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-08 21:24 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
copy_file_range calls do_splice_direct() if fs->clone_file_range
or fs->copy_file_range() is not available. However, do_splice_direct()
converts holes to zeros. Detect holes in the file_in range, and
create them in the corresponding file_out range.
If there is already data present at the offset in file_out, attempt
to punch a hole there. If the operation is not supported, fall
back to performing splice on the whole range.
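The punch-or-fall-back step described above can be illustrated from userspace. This is a hedged sketch, not the kernel code in the diff: zero_range() is a hypothetical helper, and treating ENOSYS like EOPNOTSUPP is an assumption made for portability of the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: make [off, off + len) read back as zeros.
 * Prefer punching a hole; if the filesystem cannot do that, write
 * explicit zeros instead, which is what copying the hole would produce.
 */
static int zero_range(int fd, off_t off, off_t len)
{
	static const char zeros[4096];

	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      off, len) == 0)
		return 0;

	/* Assumption: ENOSYS is handled the same way as EOPNOTSUPP. */
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	while (len > 0) {
		size_t want = len < (off_t)sizeof(zeros) ?
			      (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, want, off);
		if (n <= 0)
			return -1;
		off += n;
		len -= n;
	}
	return 0;
}
```

Note one difference between the two paths: FALLOC_FL_KEEP_SIZE never extends the file, whereas the zero-writing fallback will extend it if the range reaches past EOF.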
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/read_write.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 55 insertions(+), 5 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 2c9e7a5ea806..5df9d6e8ebee 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -20,6 +20,7 @@
#include <linux/compat.h>
#include <linux/mount.h>
#include <linux/fs.h>
+#include <linux/falloc.h>
#include "internal.h"
#include <linux/uaccess.h>
@@ -1541,14 +1542,15 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
}
#endif
-ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t len, unsigned int flags,
unsigned int splice_flags)
{
struct inode *inode_in = file_inode(file_in);
struct inode *inode_out = file_inode(file_out);
- ssize_t ret = 0;
+ ssize_t ret = 0, total = 0;
+ loff_t size, end;
if (flags != 0)
return -EINVAL;
@@ -1576,10 +1578,58 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
if (ret != -EOPNOTSUPP)
return ret;
}
+
splice:
- ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
- len > MAX_RW_COUNT ? MAX_RW_COUNT : len, splice_flags);
- return ret;
+ while (total < len) {
+ end = vfs_llseek(file_in, pos_in, SEEK_HOLE);
+
+ /* Starting position is already in a hole */
+ if (end == pos_in)
+ goto hole;
+ size = end - pos_in;
+do_splice:
+ ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+ size, 0);
+ if (ret < 0)
+ goto out;
+ total += ret;
+ if (total == len)
+ break;
+hole:
+ end = vfs_llseek(file_in, pos_in, SEEK_DATA);
+ if (end < 0) {
+ ret = end;
+ goto out;
+ }
+ size = end - pos_in;
+ /* Data on offset, punch holes */
+ if (i_size_read(file_out->f_inode) > pos_out) {
+ ret = vfs_fallocate(file_out,
+ FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+ pos_out, size);
+ if (ret < 0) {
+ /*
+ * The filesystem does not support punching
+ * holes. Perform splice on the remaining range.
+ */
+ if (ret == -EOPNOTSUPP) {
+ size = len - total;
+ goto do_splice;
+ }
+ goto out;
+ }
+ }
+ if (ret < 0) {
+ ret = end;
+ goto out;
+ }
+ pos_out += size;
+ pos_in = end;
+ total += size;
+ }
+
+out:
+ return total ? total : ret;
}
/*
--
2.16.3
* [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data()
2018-05-08 21:24 [PATCH v1 0/5] Enable holes on copy_file_range() Goldwyn Rodrigues
` (2 preceding siblings ...)
2018-05-08 21:24 ` [PATCH 3/4] copy_file_range: splice with holes Goldwyn Rodrigues
@ 2018-05-08 21:24 ` Goldwyn Rodrigues
2018-05-09 5:50 ` Amir Goldstein
3 siblings, 1 reply; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-08 21:24 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
This will preserve the holes and will clone(), if available.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/overlayfs/copy_up.c | 28 ++++++++--------------------
fs/read_write.c | 3 ++-
include/linux/fs.h | 3 +++
3 files changed, 13 insertions(+), 21 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 8bede0742619..1f89380873ce 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -138,8 +138,7 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
{
struct file *old_file;
struct file *new_file;
- loff_t old_pos = 0;
- loff_t new_pos = 0;
+ loff_t pos = 0;
int error = 0;
if (len == 0)
@@ -155,38 +154,27 @@ static int ovl_copy_up_data(struct path *old, struct path *new, loff_t len)
goto out_fput;
}
- /* Try to use clone_file_range to clone up within the same fs */
- error = vfs_clone_file_range(old_file, 0, new_file, 0, len);
- if (!error)
- goto out;
- /* Couldn't clone, so now we try to copy the data */
- error = 0;
-
- /* FIXME: copy up sparse files efficiently */
- while (len) {
+ while (pos < len) {
size_t this_len = OVL_COPY_UP_CHUNK_SIZE;
long bytes;
- if (len < this_len)
- this_len = len;
+ if (len - pos < this_len)
+ this_len = len - pos;
if (signal_pending_state(TASK_KILLABLE, current)) {
error = -EINTR;
break;
}
- bytes = do_splice_direct(old_file, &old_pos,
- new_file, &new_pos,
- this_len, SPLICE_F_MOVE);
+ bytes = do_copy_file_range(old_file, pos,
+ new_file, pos,
+ this_len, 0, SPLICE_F_MOVE);
if (bytes <= 0) {
error = bytes;
break;
}
- WARN_ON(old_pos != new_pos);
-
- len -= bytes;
+ pos += bytes;
}
-out:
if (!error)
error = vfs_fsync(new_file, 0);
fput(new_file);
diff --git a/fs/read_write.c b/fs/read_write.c
index 5df9d6e8ebee..57b5b74c982a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1542,7 +1542,7 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
}
#endif
-static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t len, unsigned int flags,
unsigned int splice_flags)
@@ -1631,6 +1631,7 @@ static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
out:
return total ? total : ret;
}
+EXPORT_SYMBOL(do_copy_file_range);
/*
* copy_file_range() differs from regular file read and write in that it
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 760d8da1b6c7..d5349b17fa10 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1799,6 +1799,9 @@ extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
unsigned long, loff_t *, rwf_t);
+extern ssize_t do_copy_file_range(struct file *, loff_t , struct file *,
+ loff_t , size_t, unsigned int,
+ unsigned int);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
--
2.16.3
* Re: [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same
2018-05-08 21:24 ` [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same Goldwyn Rodrigues
@ 2018-05-08 21:57 ` Florian Weimer
2018-05-09 19:08 ` Goldwyn Rodrigues
2018-05-09 5:44 ` Amir Goldstein
1 sibling, 1 reply; 17+ messages in thread
From: Florian Weimer @ 2018-05-08 21:57 UTC (permalink / raw)
To: Goldwyn Rodrigues, linux-fsdevel
Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
On 05/08/2018 11:24 PM, Goldwyn Rodrigues wrote:
> While performing copy_file_range(), if superblocks of file_in and
> file_out don't match, instead of returning -EXDEV, perform
> splice for a faster copy.
We have a userspace emulation in glibc which used to be quite faithful,
including the EXDEV error (which is not strictly necessary to produce).
Should we change glibc to perform a userspace copy if the system call
returns EXDEV due to an older kernel?
Thanks,
Florian
* Re: [PATCH 3/4] copy_file_range: splice with holes
2018-05-08 21:24 ` [PATCH 3/4] copy_file_range: splice with holes Goldwyn Rodrigues
@ 2018-05-08 21:59 ` Florian Weimer
2018-05-08 23:43 ` Dave Chinner
2018-05-09 5:47 ` Amir Goldstein
1 sibling, 1 reply; 17+ messages in thread
From: Florian Weimer @ 2018-05-08 21:59 UTC (permalink / raw)
To: Goldwyn Rodrigues, linux-fsdevel
Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
On 05/08/2018 11:24 PM, Goldwyn Rodrigues wrote:
> copy_file_range calls do_splice_direct() if fs->clone_file_range
> or fs->copy_file_range() is not available. However, do_splice_direct()
> converts holes to zeros. Detect holes in the file_in range, and
> create them in the corresponding file_out range.
Should this be restricted to calls to copy_file_range with a
(yet-to-be-defined) flag?
I was under the impression that the hole-filling nature of
copy_file_range was not a bug.
Thanks,
Florian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] copy_file_range: splice with holes
2018-05-08 21:59 ` Florian Weimer
@ 2018-05-08 23:43 ` Dave Chinner
0 siblings, 0 replies; 17+ messages in thread
From: Dave Chinner @ 2018-05-08 23:43 UTC (permalink / raw)
To: Florian Weimer
Cc: Goldwyn Rodrigues, linux-fsdevel, hch, smfrench, linux-unionfs,
Goldwyn Rodrigues
On Tue, May 08, 2018 at 11:59:45PM +0200, Florian Weimer wrote:
> On 05/08/2018 11:24 PM, Goldwyn Rodrigues wrote:
> >copy_file_range calls do_splice_direct() if fs->clone_file_range
> >or fs->copy_file_range() is not available. However, do_splice_direct()
> >converts holes to zeros. Detect holes in the file_in range, and
> >create them in the corresponding file_out range.
>
> Should this be restricted to calls to copy_file_range with a
> (yet-to-be-defined) flag?
>
> I was under the impression that the hole-filling nature of
> copy_file_range was not a bug.
It's not a bug, but it's currently undefined behaviour. i.e. some
filesystem implementations will hole fill while others won't, and
you cannot predict ahead of time what will happen.
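Since the outcome cannot be predicted, a caller that cares can only check after the copy. A minimal userspace sketch of such a check follows; file_has_hole() is a hypothetical helper, and the result depends on the filesystem actually reporting holes through SEEK_HOLE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if the file has a hole before EOF,
 * 0 if it does not, and -1 on error. The file offset is restored.
 */
static int file_has_hole(int fd)
{
	struct stat st;
	off_t saved, hole;

	if (fstat(fd, &st) < 0)
		return -1;
	if (st.st_size == 0)
		return 0;

	saved = lseek(fd, 0, SEEK_CUR);
	if (saved < 0)
		return -1;

	/* Every file has an implicit hole at EOF, so compare with the size. */
	hole = lseek(fd, 0, SEEK_HOLE);
	if (hole < 0)
		return -1;
	if (lseek(fd, saved, SEEK_SET) < 0)
		return -1;

	return hole < st.st_size;
}
```

A filesystem that does not track holes reports only the implicit hole at EOF, so a result of 0 means "no hole visible", not necessarily "fully allocated".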
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same
2018-05-08 21:24 ` [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same Goldwyn Rodrigues
2018-05-08 21:57 ` Florian Weimer
@ 2018-05-09 5:44 ` Amir Goldstein
1 sibling, 0 replies; 17+ messages in thread
From: Amir Goldstein @ 2018-05-09 5:44 UTC (permalink / raw)
To: Goldwyn Rodrigues
Cc: linux-fsdevel, Christoph Hellwig, Steve French, overlayfs,
Dave Chinner, Goldwyn Rodrigues
On Wed, May 9, 2018 at 12:24 AM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> While performing copy_file_range(), if superblocks of file_in and
> file_out don't match, instead of returning -EXDEV, perform
> splice for a faster copy.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> ---
> fs/read_write.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 7d9dfb62ba7d..2c9e7a5ea806 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1546,6 +1546,8 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> size_t len, unsigned int flags,
> unsigned int splice_flags)
> {
> + struct inode *inode_in = file_inode(file_in);
> + struct inode *inode_out = file_inode(file_out);
> ssize_t ret = 0;
>
> if (flags != 0)
> @@ -1554,6 +1556,9 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> if (len == 0)
> return 0;
>
> + if (inode_in->i_sb != inode_out->i_sb)
> + goto splice;
> +
> /*
> * Try cloning first, this is supported by more file systems, and
> * more efficient if both clone and copy are supported (e.g. NFS).
> @@ -1571,7 +1576,7 @@ ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> if (ret != -EOPNOTSUPP)
> return ret;
> }
> -
> +splice:
> ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> len > MAX_RW_COUNT ? MAX_RW_COUNT : len, splice_flags);
> return ret;
> @@ -1608,10 +1613,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> (file_out->f_flags & O_APPEND))
> return -EBADF;
>
> - /* this could be relaxed once a method supports cross-fs copies */
> - if (inode_in->i_sb != inode_out->i_sb)
> - return -EXDEV;
> -
> file_start_write(file_out);
>
> ret = do_copy_file_range(file_in, pos_in,
> --
> 2.16.3
>
* Re: [PATCH 3/4] copy_file_range: splice with holes
2018-05-08 21:24 ` [PATCH 3/4] copy_file_range: splice with holes Goldwyn Rodrigues
2018-05-08 21:59 ` Florian Weimer
@ 2018-05-09 5:47 ` Amir Goldstein
1 sibling, 0 replies; 17+ messages in thread
From: Amir Goldstein @ 2018-05-09 5:47 UTC (permalink / raw)
To: Goldwyn Rodrigues
Cc: linux-fsdevel, Christoph Hellwig, Steve French, overlayfs,
Dave Chinner, Goldwyn Rodrigues
On Wed, May 9, 2018 at 12:24 AM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> copy_file_range calls do_splice_direct() if fs->clone_file_range
> or fs->copy_file_range() is not available. However, do_splice_direct()
> converts holes to zeros. Detect holes in the file_in range, and
> create them in the corresponding file_out range.
>
> If there is already data present at the offset in file_out, attempt
> to punch a hole there. If the operation is not supported, fall
> back to performing splice on the whole range.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
but..
> ---
> fs/read_write.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 55 insertions(+), 5 deletions(-)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 2c9e7a5ea806..5df9d6e8ebee 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -20,6 +20,7 @@
> #include <linux/compat.h>
> #include <linux/mount.h>
> #include <linux/fs.h>
> +#include <linux/falloc.h>
> #include "internal.h"
>
> #include <linux/uaccess.h>
> @@ -1541,14 +1542,15 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
> }
> #endif
>
> -ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> +static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
Either make it static to begin with, or non-static and exported to begin with.
Thanks,
Amir.
* Re: [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data()
2018-05-08 21:24 ` [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data() Goldwyn Rodrigues
@ 2018-05-09 5:50 ` Amir Goldstein
2018-05-09 19:13 ` Goldwyn Rodrigues
0 siblings, 1 reply; 17+ messages in thread
From: Amir Goldstein @ 2018-05-09 5:50 UTC (permalink / raw)
To: Goldwyn Rodrigues
Cc: linux-fsdevel, Christoph Hellwig, Steve French, overlayfs,
Dave Chinner, Goldwyn Rodrigues
On Wed, May 9, 2018 at 12:24 AM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> This will preserve the holes and will clone(), if available.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Only please mention in the commit message that it changes behavior
slightly for a very large file (clone in chunks).
I see no problem with this change.
And please test with xfstests overlay/001, which copies up a large
sparse file. Test time should drop from ~30s to 0s.
If you like I can test that one for you.
I believe there are also generic copy_file_range tests in xfstests.
Thanks,
Amir.
* Re: [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same
2018-05-08 21:57 ` Florian Weimer
@ 2018-05-09 19:08 ` Goldwyn Rodrigues
0 siblings, 0 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-09 19:08 UTC (permalink / raw)
To: Florian Weimer, linux-fsdevel
Cc: hch, smfrench, linux-unionfs, david, Goldwyn Rodrigues
On 05/08/2018 04:57 PM, Florian Weimer wrote:
> On 05/08/2018 11:24 PM, Goldwyn Rodrigues wrote:
>> While performing copy_file_range(), if superblocks of file_in and
>> file_out don't match, instead of returning -EXDEV, perform
>> splice for a faster copy.
>
> We have a userspace emulation in glibc which used to be quite faithful,
> including the EXDEV error (which is not strictly necessary to produce).
>
> Should we change glibc to perform a userspace copy if the system call
> returns EXDEV due to an older kernel?
>
I don't see any purpose. The user would have to perform a copy anyway
if it receives -EXDEV.
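The fallback being discussed, where userspace copies the data itself once the system call refuses, might look like the sketch below. It is an illustration only and not glibc's emulation: copy_with_fallback() is a hypothetical name, it assumes a glibc that provides the copy_file_range() wrapper, and it deliberately retries with read()/write() on any failure (for example EXDEV or ENOSYS) instead of matching specific errno values.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from the current offset of
 * fd_in to the current offset of fd_out. Returns the number of bytes
 * copied, or -1 on error.
 */
static ssize_t copy_with_fallback(int fd_in, int fd_out, size_t len)
{
	static char buf[65536];
	size_t done = 0;
	int try_syscall = 1;

	while (done < len) {
		ssize_t n;

		if (try_syscall) {
			n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    len - done, 0);
			if (n < 0) {
				/* e.g. EXDEV or ENOSYS: use read/write from here on */
				try_syscall = 0;
				continue;
			}
		} else {
			size_t want = len - done;
			if (want > sizeof(buf))
				want = sizeof(buf);
			n = read(fd_in, buf, want);
			if (n < 0)
				return -1;
			/* write() may be short, so loop until the chunk is out. */
			for (ssize_t off = 0; off < n; ) {
				ssize_t w = write(fd_out, buf + off,
						  (size_t)(n - off));
				if (w <= 0)
					return -1;
				off += w;
			}
		}
		if (n == 0)	/* source hit EOF early */
			break;
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

Because both paths use the descriptors' own file offsets, a partial copy_file_range() followed by a failure continues from where the system call stopped.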
--
Goldwyn
* Re: [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data()
2018-05-09 5:50 ` Amir Goldstein
@ 2018-05-09 19:13 ` Goldwyn Rodrigues
2018-05-10 4:52 ` Amir Goldstein
0 siblings, 1 reply; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-09 19:13 UTC (permalink / raw)
To: Amir Goldstein
Cc: linux-fsdevel, Christoph Hellwig, Steve French, overlayfs,
Dave Chinner, Goldwyn Rodrigues
On 05/09/2018 12:50 AM, Amir Goldstein wrote:
> On Wed, May 9, 2018 at 12:24 AM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>
>> This will preserve the holes and will clone(), if available.
>>
>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>
> Only please mention in the commit message that it changes behavior
> slightly for a very large file (clone in chunks).
Change behavior? The only difference is that the copy will have holes.
It will still respect length. Actually, I found a bug where it would not
respect length if the offset is farther than length, which I have fixed.
> I see no problem with this change.
>
> And please test with xfstests overlay/001, which copies up a large
> sparse file. Test time should drop from ~30s to 0s.
Yup, it passes in 1s on my VM :)
> If you like I can test that one for you.
> I believe there are also generic copy_file_range tests in xfstests.
>
Thanks for the review
--
Goldwyn
* Re: [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data()
2018-05-09 19:13 ` Goldwyn Rodrigues
@ 2018-05-10 4:52 ` Amir Goldstein
0 siblings, 0 replies; 17+ messages in thread
From: Amir Goldstein @ 2018-05-10 4:52 UTC (permalink / raw)
To: Goldwyn Rodrigues
Cc: linux-fsdevel, Christoph Hellwig, Steve French, overlayfs,
Dave Chinner, Goldwyn Rodrigues
On Wed, May 9, 2018 at 10:13 PM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>
>
> On 05/09/2018 12:50 AM, Amir Goldstein wrote:
>> On Wed, May 9, 2018 at 12:24 AM, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>>> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>>
>>> This will preserve the holes and will clone(), if available.
>>>
>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>
>> Only please mention in the commit message that it changes behavior
>> slightly for a very large file (clone in chunks).
>
> Change behavior? The only difference is that the copy will have holes.
> It will still respect length. Actually, I found a bug where it would not
> respect length if the offset is farther than length, which I have fixed.
What I meant is the change of behavior when the underlying fs supports
clone.
Your patch changes the behavior for a very large file from a single call
to vfs_clone_file_range() on the entire length to several calls in a loop.
Never mind. It's too insignificant for anyone to care.
If overlayfs ever supports NFS as upper layer, we may want to rethink
this.
Thanks,
Amir.
* [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range
2018-06-14 15:12 [PATCH RESEND v3 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
@ 2018-06-14 15:12 ` Goldwyn Rodrigues
0 siblings, 0 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-06-14 15:12 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
Preparatory patch to carve out do_copy_file_range() from
vfs_copy_file_range().
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
fs/read_write.c | 60 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 34 insertions(+), 26 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index c4eabbfc90df..525f2a67e15a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1541,6 +1541,38 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
}
#endif
+static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ size_t len, unsigned int flags)
+{
+ ssize_t ret = 0;
+
+ if (len == 0)
+ return 0;
+
+ /*
+ * Try cloning first, this is supported by more file systems, and
+ * more efficient if both clone and copy are supported (e.g. NFS).
+ */
+ if (file_in->f_op->clone_file_range) {
+ ret = file_in->f_op->clone_file_range(file_in, pos_in,
+ file_out, pos_out, len);
+ if (ret == 0)
+ return len;
+ }
+
+ if (file_out->f_op->copy_file_range) {
+ ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
+ pos_out, len, flags);
+ if (ret != -EOPNOTSUPP)
+ return ret;
+ }
+
+ ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+ len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+ return ret;
+}
+
/*
* copy_file_range() differs from regular file read and write in that it
* specifically allows return partial success. When it does so is up to
@@ -1579,35 +1611,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (inode_in->i_sb != inode_out->i_sb)
return -EXDEV;
- if (len == 0)
- return 0;
-
file_start_write(file_out);
- /*
- * Try cloning first, this is supported by more file systems, and
- * more efficient if both clone and copy are supported (e.g. NFS).
- */
- if (file_in->f_op->clone_file_range) {
- ret = file_in->f_op->clone_file_range(file_in, pos_in,
- file_out, pos_out, len);
- if (ret == 0) {
- ret = len;
- goto done;
- }
- }
-
- if (file_out->f_op->copy_file_range) {
- ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
- pos_out, len, flags);
- if (ret != -EOPNOTSUPP)
- goto done;
- }
-
- ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
- len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+ ret = do_copy_file_range(file_in, pos_in,
+ file_out, pos_out, len, flags);
-done:
if (ret > 0) {
fsnotify_access(file_in);
add_rchar(current, ret);
--
2.16.3
* [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range
2018-05-14 14:56 [PATCH v3 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
@ 2018-05-14 14:56 ` Goldwyn Rodrigues
0 siblings, 0 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-14 14:56 UTC (permalink / raw)
To: linux-fsdevel; +Cc: hch, linux-unionfs, david, amir73il, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
Preparatory patch to carve out do_copy_file_range() from
vfs_copy_file_range().
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
---
fs/read_write.c | 60 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 34 insertions(+), 26 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index c4eabbfc90df..525f2a67e15a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1541,6 +1541,38 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
 }
 #endif
+static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+				  struct file *file_out, loff_t pos_out,
+				  size_t len, unsigned int flags)
+{
+	ssize_t ret = 0;
+
+	if (len == 0)
+		return 0;
+
+	/*
+	 * Try cloning first, this is supported by more file systems, and
+	 * more efficient if both clone and copy are supported (e.g. NFS).
+	 */
+	if (file_in->f_op->clone_file_range) {
+		ret = file_in->f_op->clone_file_range(file_in, pos_in,
+				file_out, pos_out, len);
+		if (ret == 0)
+			return len;
+	}
+
+	if (file_out->f_op->copy_file_range) {
+		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
+						      pos_out, len, flags);
+		if (ret != -EOPNOTSUPP)
+			return ret;
+	}
+
+	ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+			len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+	return ret;
+}
+
 /*
  * copy_file_range() differs from regular file read and write in that it
  * specifically allows return partial success. When it does so is up to
@@ -1579,35 +1611,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	if (inode_in->i_sb != inode_out->i_sb)
 		return -EXDEV;
-	if (len == 0)
-		return 0;
-
 	file_start_write(file_out);
-	/*
-	 * Try cloning first, this is supported by more file systems, and
-	 * more efficient if both clone and copy are supported (e.g. NFS).
-	 */
-	if (file_in->f_op->clone_file_range) {
-		ret = file_in->f_op->clone_file_range(file_in, pos_in,
-				file_out, pos_out, len);
-		if (ret == 0) {
-			ret = len;
-			goto done;
-		}
-	}
-
-	if (file_out->f_op->copy_file_range) {
-		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
-						      pos_out, len, flags);
-		if (ret != -EOPNOTSUPP)
-			goto done;
-	}
-
-	ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
-			len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+	ret = do_copy_file_range(file_in, pos_in,
+				 file_out, pos_out, len, flags);
-done:
 	if (ret > 0) {
 		fsnotify_access(file_in);
 		add_rchar(current, ret);
--
2.16.3
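
For readers following the control flow outside the kernel tree, the
fallback order in do_copy_file_range() (clone hook first, then the
destination's copy hook, then a generic capped copy) can be modelled in
plain userspace C. This is only a sketch under stated assumptions:
struct copy_ops, sketch_copy(), generic_copy(), the stub hooks and
SKETCH_MAX_RW are hypothetical names, not kernel API, and the stubs take
only a length instead of files and offsets.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumption: stand-in for the kernel's MAX_RW_COUNT cap. */
#define SKETCH_MAX_RW ((size_t)1 << 20)

/* Hypothetical model of the two f_op hooks consulted by the patch. */
struct copy_ops {
	int (*clone_range)(size_t len);    /* 0 on success, -errno otherwise */
	ssize_t (*copy_range)(size_t len); /* bytes copied, or -errno */
};

/* Stub hooks for exercising the different paths. */
static int clone_ok(size_t len) { (void)len; return 0; }
static int clone_fails(size_t len) { (void)len; return -EINVAL; }
static ssize_t copy_unsupported(size_t len) { (void)len; return -EOPNOTSUPP; }
static ssize_t copy_half(size_t len) { return (ssize_t)(len / 2); }

/* Generic last resort: handles at most SKETCH_MAX_RW bytes per call. */
static ssize_t generic_copy(size_t len)
{
	return (ssize_t)(len > SKETCH_MAX_RW ? SKETCH_MAX_RW : len);
}

/* Same decision order as the helper in the patch, minus the real I/O. */
static ssize_t sketch_copy(const struct copy_ops *in,
			   const struct copy_ops *out, size_t len)
{
	ssize_t ret;

	assert(in != NULL && out != NULL);
	if (len == 0)
		return 0;

	/* 1. Clone hook of the source: success covers the whole range. */
	if (in->clone_range && in->clone_range(len) == 0)
		return (ssize_t)len;

	/* 2. Copy hook of the destination: only -EOPNOTSUPP falls through. */
	if (out->copy_range) {
		ret = out->copy_range(len);
		if (ret != -EOPNOTSUPP)
			return ret;
	}

	/* 3. Generic copy, capped per call. */
	return generic_copy(len);
}
```

Pairing a source that uses clone_fails with a destination that uses
copy_half returns half the requested length, because a short count from
the copy hook is treated as a final result rather than a reason to fall
back.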
* [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range
2018-05-10 1:58 [PATCH v2 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
@ 2018-05-10 1:58 ` Goldwyn Rodrigues
0 siblings, 0 replies; 17+ messages in thread
From: Goldwyn Rodrigues @ 2018-05-10 1:58 UTC
To: linux-fsdevel; +Cc: hch, linux-unionfs, david, viro, Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
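Preparatory patch to carve out do_copy_file_range() from
vfs_copy_file_range(). The helper holds the clone, copy and splice
fallback sequence; the -EXDEV check, file_start_write() and the
accounting stay in the caller.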
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/read_write.c | 60 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 34 insertions(+), 26 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index c4eabbfc90df..525f2a67e15a 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1541,6 +1541,38 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
 }
 #endif
+static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+				  struct file *file_out, loff_t pos_out,
+				  size_t len, unsigned int flags)
+{
+	ssize_t ret = 0;
+
+	if (len == 0)
+		return 0;
+
+	/*
+	 * Try cloning first, this is supported by more file systems, and
+	 * more efficient if both clone and copy are supported (e.g. NFS).
+	 */
+	if (file_in->f_op->clone_file_range) {
+		ret = file_in->f_op->clone_file_range(file_in, pos_in,
+				file_out, pos_out, len);
+		if (ret == 0)
+			return len;
+	}
+
+	if (file_out->f_op->copy_file_range) {
+		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
+						      pos_out, len, flags);
+		if (ret != -EOPNOTSUPP)
+			return ret;
+	}
+
+	ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
+			len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+	return ret;
+}
+
 /*
  * copy_file_range() differs from regular file read and write in that it
  * specifically allows return partial success. When it does so is up to
@@ -1579,35 +1611,11 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 	if (inode_in->i_sb != inode_out->i_sb)
 		return -EXDEV;
-	if (len == 0)
-		return 0;
-
 	file_start_write(file_out);
-	/*
-	 * Try cloning first, this is supported by more file systems, and
-	 * more efficient if both clone and copy are supported (e.g. NFS).
-	 */
-	if (file_in->f_op->clone_file_range) {
-		ret = file_in->f_op->clone_file_range(file_in, pos_in,
-				file_out, pos_out, len);
-		if (ret == 0) {
-			ret = len;
-			goto done;
-		}
-	}
-
-	if (file_out->f_op->copy_file_range) {
-		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
-						      pos_out, len, flags);
-		if (ret != -EOPNOTSUPP)
-			goto done;
-	}
-
-	ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
-			len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
+	ret = do_copy_file_range(file_in, pos_in,
+				 file_out, pos_out, len, flags);
-done:
 	if (ret > 0) {
 		fsnotify_access(file_in);
 		add_rchar(current, ret);
--
2.16.3
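
Because the splice fallback clamps each call to MAX_RW_COUNT and
copy_file_range() is allowed to return partial success, a caller that
needs the whole range generally has to loop. The following is a minimal
userspace sketch of such a loop, not code from this series:
SKETCH_MAX_RW_COUNT, sketch_copy_once() and sketch_copy_all() are
hypothetical names, and the per-call copy is simulated rather than
performed on real files.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumption: a deliberately small stand-in for MAX_RW_COUNT. */
#define SKETCH_MAX_RW_COUNT ((size_t)4096)

/*
 * Hypothetical single call: handles at most SKETCH_MAX_RW_COUNT bytes,
 * mirroring the clamp applied before do_splice_direct() in the patch.
 */
static ssize_t sketch_copy_once(size_t len)
{
	return (ssize_t)(len > SKETCH_MAX_RW_COUNT ? SKETCH_MAX_RW_COUNT : len);
}

/*
 * Caller-side loop: repeat until the full length is copied, an error
 * is returned, or a call makes no progress. Returns the total copied,
 * or the negative error if nothing was copied at all. *calls receives
 * the number of single calls that were issued.
 */
static ssize_t sketch_copy_all(size_t len, unsigned int *calls)
{
	size_t done = 0;

	assert(calls != NULL);
	*calls = 0;
	while (done < len) {
		ssize_t n = sketch_copy_once(len - done);

		(*calls)++;
		if (n < 0)
			return done ? (ssize_t)done : n;
		if (n == 0)
			break; /* no progress, e.g. end of input */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

With the 4096-byte cap assumed here, a 10000-byte request takes three
calls (4096 + 4096 + 1808). Reporting the bytes already copied instead
of a late error follows the partial-success convention described in the
comment above vfs_copy_file_range().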
Thread overview: 17+ messages
2018-05-08 21:24 [PATCH v1 0/5] Enable holes on copy_file_range() Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
2018-05-08 21:24 ` [PATCH 2/4] copy_file_range: Perform splice if in/out SB are not same Goldwyn Rodrigues
2018-05-08 21:57 ` Florian Weimer
2018-05-09 19:08 ` Goldwyn Rodrigues
2018-05-09 5:44 ` Amir Goldstein
2018-05-08 21:24 ` [PATCH 3/4] copy_file_range: splice with holes Goldwyn Rodrigues
2018-05-08 21:59 ` Florian Weimer
2018-05-08 23:43 ` Dave Chinner
2018-05-09 5:47 ` Amir Goldstein
2018-05-08 21:24 ` [PATCH 4/4] ovl: Use do_copy_file_range() in copy_up_data() Goldwyn Rodrigues
2018-05-09 5:50 ` Amir Goldstein
2018-05-09 19:13 ` Goldwyn Rodrigues
2018-05-10 4:52 ` Amir Goldstein
2018-05-10 1:58 [PATCH v2 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
2018-05-10 1:58 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
2018-05-14 14:56 [PATCH v3 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
2018-05-14 14:56 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues
2018-06-14 15:12 [PATCH RESEND v3 0/4] Enable holes in copy_file_range() Goldwyn Rodrigues
2018-06-14 15:12 ` [PATCH 1/4] copy_file_range: refactor vfs_copy_file_range Goldwyn Rodrigues