* [PATCH RFC 0/3] simple copy offloading system call
@ 2015-04-10 22:00 Zach Brown
  2015-04-10 22:00 ` [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper Zach Brown
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-10 22:00 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs, linux-scsi

Hello everyone!

Here's my current attempt at the most basic system call interface for
offloading copying between files.  The system call and vfs function
are relatively light wrappers around the file_operation method that
does the heavy lifting.

There was interest at LSF in getting the basic infrastructure merged
before worrying about adding behavioural flags and more complicated
implementations.  This series only offers a refactoring of the btrfs
clone ioctl as an example of an implementation of the file
copy_file_range method.

I've added support for copy_file_range() to xfs_io in xfsprogs and
have the start of an xfstest that tests the system call.  I'll send
those to fstests@.

So how does this look?

Do we want to merge this and let the NFS and block XCOPY patches add
their changes when they're ready?

- z



* [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-10 22:00 [PATCH RFC 0/3] simple copy offloading system call Zach Brown
@ 2015-04-10 22:00 ` Zach Brown
  2015-04-10 22:36     ` Trond Myklebust
  2015-04-10 23:01   ` Andreas Dilger
  2015-04-10 22:00 ` [PATCH RFC 2/3] x86: add sys_copy_file_range to syscall tables Zach Brown
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-10 22:00 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs, linux-scsi

Add a copy_file_range() system call for offloading copies between
regular files.

This gives an interface to underlying layers of the storage stack which
can copy without reading and writing all the data.  There are a few
candidates that should support copy offloading in the nearer term:

- btrfs shares extent references with its clone ioctl
- NFS has patches to add a COPY command which copies on the server
- SCSI has a family of XCOPY commands which copy in the device

This system call avoids the complexity of also accelerating the creation
of the destination file by operating on an existing destination file
descriptor, not a path.

Currently the high level vfs entry point limits copy offloading to files
on the same mount and super (and not in the same file).  This can be
relaxed if we get implementations which can copy between file systems
safely.

Signed-off-by: Zach Brown <zab@redhat.com>
---
 fs/read_write.c                   | 129 ++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h                |   3 +
 include/uapi/asm-generic/unistd.h |   4 +-
 kernel/sys_ni.c                   |   1 +
 4 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8e1b687..c65ce1d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -17,6 +17,7 @@
 #include <linux/pagemap.h>
 #include <linux/splice.h>
 #include <linux/compat.h>
+#include <linux/mount.h>
 #include "internal.h"
 
 #include <asm/uaccess.h>
@@ -1424,3 +1425,131 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
 	return do_sendfile(out_fd, in_fd, NULL, count, 0);
 }
 #endif
+
+/*
+ * copy_file_range() differs from regular file read and write in that it
+ * specifically allows return partial success.  When it does so is up to
+ * the copy_file_range method.
+ */
+ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
+			    struct file *file_out, loff_t pos_out,
+			    size_t len, int flags)
+{
+	struct inode *inode_in;
+	struct inode *inode_out;
+	ssize_t ret;
+
+	if (flags)
+		return -EINVAL;
+
+	if (len == 0)
+		return 0;
+
+	/* copy_file_range allows full ssize_t len, ignoring MAX_RW_COUNT  */
+	ret = rw_verify_area(READ, file_in, &pos_in, len);
+	if (ret >= 0)
+		ret = rw_verify_area(WRITE, file_out, &pos_out, len);
+	if (ret < 0)
+		return ret;
+
+	if (!(file_in->f_mode & FMODE_READ) ||
+	    !(file_out->f_mode & FMODE_WRITE) ||
+	    (file_out->f_flags & O_APPEND) ||
+	    !file_in->f_op || !file_in->f_op->copy_file_range)
+		return -EINVAL;
+
+	inode_in = file_inode(file_in);
+	inode_out = file_inode(file_out);
+
+	/* make sure offsets don't wrap and the input is inside i_size */
+	if (pos_in + len < pos_in || pos_out + len < pos_out ||
+	    pos_in + len > i_size_read(inode_in))
+		return -EINVAL;
+
+	/* this could be relaxed once a method supports cross-fs copies */
+	if (inode_in->i_sb != inode_out->i_sb ||
+	    file_in->f_path.mnt != file_out->f_path.mnt)
+		return -EXDEV;
+
+	/* forbid ranges in the same file */
+	if (inode_in == inode_out)
+		return -EINVAL;
+
+	ret = mnt_want_write_file(file_out);
+	if (ret)
+		return ret;
+
+	ret = file_in->f_op->copy_file_range(file_in, pos_in, file_out, pos_out,
+					     len, flags);
+	if (ret > 0) {
+		fsnotify_access(file_in);
+		add_rchar(current, ret);
+		fsnotify_modify(file_out);
+		add_wchar(current, ret);
+	}
+	inc_syscr(current);
+	inc_syscw(current);
+
+	mnt_drop_write_file(file_out);
+
+	return ret;
+}
+EXPORT_SYMBOL(vfs_copy_file_range);
+
+SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
+		int, fd_out, loff_t __user *, off_out,
+		size_t, len, unsigned int, flags)
+{
+	loff_t pos_in;
+	loff_t pos_out;
+	struct fd f_in;
+	struct fd f_out;
+	ssize_t ret;
+
+	f_in = fdget(fd_in);
+	f_out = fdget(fd_out);
+	if (!f_in.file || !f_out.file) {
+		ret = -EBADF;
+		goto out;
+	}
+
+	ret = -EFAULT;
+	if (off_in) {
+		if (copy_from_user(&pos_in, off_in, sizeof(loff_t)))
+			goto out;
+	} else {
+		pos_in = f_in.file->f_pos;
+	}
+
+	if (off_out) {
+		if (copy_from_user(&pos_out, off_out, sizeof(loff_t)))
+			goto out;
+	} else {
+		pos_out = f_out.file->f_pos;
+	}
+
+	ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len,
+				  flags);
+	if (ret > 0) {
+		pos_in += ret;
+		pos_out += ret;
+
+		if (off_in) {
+			if (copy_to_user(off_in, &pos_in, sizeof(loff_t)))
+				ret = -EFAULT;
+		} else {
+			f_in.file->f_pos = pos_in;
+		}
+
+		if (off_out) {
+			if (copy_to_user(off_out, &pos_out, sizeof(loff_t)))
+				ret = -EFAULT;
+		} else {
+			f_out.file->f_pos = pos_out;
+		}
+	}
+out:
+	fdput(f_in);
+	fdput(f_out);
+	return ret;
+}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f4131e8..43a66d45 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1570,6 +1570,7 @@ struct file_operations {
 #ifndef CONFIG_MMU
 	unsigned (*mmap_capabilities)(struct file *);
 #endif
+	ssize_t (*copy_file_range)(struct file *, loff_t, struct file *, loff_t, size_t, int);
 };
 
 struct inode_operations {
@@ -1623,6 +1624,8 @@ extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
 		unsigned long, loff_t *);
 extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
 		unsigned long, loff_t *);
+extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
+				   loff_t, size_t, int);
 
 struct super_operations {
    	struct inode *(*alloc_inode)(struct super_block *sb);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index e016bd9..2b60f0c 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 281
 __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
+#define __NR_copy_file_range 282
+__SYSCALL(__NR_copy_file_range, sys_copy_file_range)
 
 #undef __NR_syscalls
-#define __NR_syscalls 282
+#define __NR_syscalls 283
 
 /*
  * All syscalls below here should go away really,
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 5adcb0a..07f4585 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -159,6 +159,7 @@ cond_syscall(sys_uselib);
 cond_syscall(sys_fadvise64);
 cond_syscall(sys_fadvise64_64);
 cond_syscall(sys_madvise);
+cond_syscall(sys_copy_file_range);
 
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
-- 
2.1.0
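
Editorial sketch, not part of the patch: the wrap and i_size checks in vfs_copy_file_range() add a length to a signed loff_t, and signed overflow is undefined in ISO C. A minimal user-space version of the same validation, written so that no signed addition can overflow, might look like this. The name range_in_bounds, the limit convention, and the use of int64_t in place of loff_t are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when [pos, pos + len) does not run
 * past INT64_MAX and, if limit >= 0, also ends at or before limit.
 * Pass limit = -1 to skip the upper bound (as for the output file,
 * which may grow).
 */
static bool range_in_bounds(int64_t pos, uint64_t len, int64_t limit)
{
	if (pos < 0)
		return false;
	/* Would pos + len exceed INT64_MAX? Compare without adding. */
	if (len > (uint64_t)INT64_MAX - (uint64_t)pos)
		return false;
	/* From here on the unsigned sum is known to fit in int64_t. */
	assert((uint64_t)pos + len <= (uint64_t)INT64_MAX);
	if (limit >= 0 && (uint64_t)pos + len > (uint64_t)limit)
		return false;
	return true;
}
```

With these assumptions, the input range would be checked against the source file size, and the output range with limit = -1.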



* [PATCH RFC 2/3] x86: add sys_copy_file_range to syscall tables
  2015-04-10 22:00 [PATCH RFC 0/3] simple copy offloading system call Zach Brown
  2015-04-10 22:00 ` [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper Zach Brown
@ 2015-04-10 22:00 ` Zach Brown
  2015-04-10 22:00 ` [PATCH RFC 3/3] btrfs: add .copy_file_range file operation Zach Brown
  2015-05-06  6:15   ` Michael Kerrisk
  3 siblings, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-10 22:00 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs, linux-scsi

Add sys_copy_file_range to the x86 syscall tables.

Signed-off-by: Zach Brown <zab@redhat.com>
---
 arch/x86/syscalls/syscall_32.tbl | 1 +
 arch/x86/syscalls/syscall_64.tbl | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index b3560ec..88d0025 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -365,3 +365,4 @@
 356	i386	memfd_create		sys_memfd_create
 357	i386	bpf			sys_bpf
 358	i386	execveat		sys_execveat			stub32_execveat
+359	i386	copy_file_range		sys_copy_file_range
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index 8d656fb..81802c5 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -329,6 +329,7 @@
 320	common	kexec_file_load		sys_kexec_file_load
 321	common	bpf			sys_bpf
 322	64	execveat		stub_execveat
+323	common	copy_file_range		sys_copy_file_range
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
-- 
2.1.0



* [PATCH RFC 3/3] btrfs: add .copy_file_range file operation
  2015-04-10 22:00 [PATCH RFC 0/3] simple copy offloading system call Zach Brown
  2015-04-10 22:00 ` [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper Zach Brown
  2015-04-10 22:00 ` [PATCH RFC 2/3] x86: add sys_copy_file_range to syscall tables Zach Brown
@ 2015-04-10 22:00 ` Zach Brown
  2015-04-14 17:08     ` Chris Mason
  2015-05-06  6:15   ` Michael Kerrisk
  3 siblings, 1 reply; 33+ messages in thread
From: Zach Brown @ 2015-04-10 22:00 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs, linux-scsi

This rearranges the existing CLONE_RANGE ioctl implementation so that the
.copy_file_range file operation can call the core loop that copies file
data extent items.

The extent copying loop is lifted up into its own function.  It retains
the core btrfs error checks that should be shared.

Signed-off-by: Zach Brown <zab@redhat.com>
---
 fs/btrfs/ctree.h |  3 ++
 fs/btrfs/file.c  |  1 +
 fs/btrfs/ioctl.c | 91 ++++++++++++++++++++++++++++++++------------------------
 3 files changed, 56 insertions(+), 39 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index f9c89ca..f7cfa26 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3958,6 +3958,9 @@ int btrfs_dirty_pages(struct btrfs_root *root, struct inode *inode,
 		      loff_t pos, size_t write_bytes,
 		      struct extent_state **cached);
 int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end);
+ssize_t btrfs_copy_file_range(struct file *file_in, loff_t pos_in,
+			      struct file *file_out, loff_t pos_out,
+			      size_t len, int flags);
 
 /* tree-defrag.c */
 int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 30982bb..49989899 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2820,6 +2820,7 @@ const struct file_operations btrfs_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= btrfs_ioctl,
 #endif
+	.copy_file_range = btrfs_copy_file_range,
 };
 
 void btrfs_auto_defrag_exit(void)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 74609b9..0eb008e 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3537,17 +3537,16 @@ out:
 	return ret;
 }
 
-static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
-				       u64 off, u64 olen, u64 destoff)
+static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
+					u64 off, u64 olen, u64 destoff)
 {
 	struct inode *inode = file_inode(file);
+	struct inode *src = file_inode(file_src);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct fd src_file;
-	struct inode *src;
 	int ret;
 	u64 len = olen;
 	u64 bs = root->fs_info->sb->s_blocksize;
-	int same_inode = 0;
+	int same_inode = src == inode;
 
 	/*
 	 * TODO:
@@ -3560,49 +3559,20 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 	 *   be either compressed or non-compressed.
 	 */
 
-	/* the destination must be opened for writing */
-	if (!(file->f_mode & FMODE_WRITE) || (file->f_flags & O_APPEND))
-		return -EINVAL;
-
 	if (btrfs_root_readonly(root))
 		return -EROFS;
 
-	ret = mnt_want_write_file(file);
-	if (ret)
-		return ret;
-
-	src_file = fdget(srcfd);
-	if (!src_file.file) {
-		ret = -EBADF;
-		goto out_drop_write;
-	}
-
-	ret = -EXDEV;
-	if (src_file.file->f_path.mnt != file->f_path.mnt)
-		goto out_fput;
-
-	src = file_inode(src_file.file);
-
-	ret = -EINVAL;
-	if (src == inode)
-		same_inode = 1;
-
-	/* the src must be open for reading */
-	if (!(src_file.file->f_mode & FMODE_READ))
-		goto out_fput;
+	if (file_src->f_path.mnt != file->f_path.mnt ||
+	    src->i_sb != inode->i_sb)
+		return -EXDEV;
 
 	/* don't make the dst file partly checksummed */
 	if ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) !=
 	    (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM))
-		goto out_fput;
+		return -EINVAL;
 
-	ret = -EISDIR;
 	if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
-		goto out_fput;
-
-	ret = -EXDEV;
-	if (src->i_sb != inode->i_sb)
-		goto out_fput;
+		return -EISDIR;
 
 	if (!same_inode) {
 		if (inode < src) {
@@ -3690,6 +3660,49 @@ out_unlock:
 	} else {
 		mutex_unlock(&src->i_mutex);
 	}
+	return ret;
+}
+
+ssize_t btrfs_copy_file_range(struct file *file_in, loff_t pos_in,
+			      struct file *file_out, loff_t pos_out,
+			      size_t len, int flags)
+{
+	ssize_t ret;
+
+	ret = btrfs_clone_files(file_out, file_in, pos_in, len, pos_out);
+	if (ret == 0)
+		ret = len;
+	return ret;
+}
+
+static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
+				       u64 off, u64 olen, u64 destoff)
+{
+	struct fd src_file;
+	int ret;
+
+	/* the destination must be opened for writing */
+	if (!(file->f_mode & FMODE_WRITE) || (file->f_flags & O_APPEND))
+		return -EINVAL;
+
+	ret = mnt_want_write_file(file);
+	if (ret)
+		return ret;
+
+	src_file = fdget(srcfd);
+	if (!src_file.file) {
+		ret = -EBADF;
+		goto out_drop_write;
+	}
+
+	/* the src must be open for reading */
+	if (!(src_file.file->f_mode & FMODE_READ)) {
+		ret = -EINVAL;
+		goto out_fput;
+	}
+
+	ret = btrfs_clone_files(file, src_file.file, off, olen, destoff);
+
 out_fput:
 	fdput(src_file);
 out_drop_write:
-- 
2.1.0
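
Editorial sketch, not part of the patch: btrfs_copy_file_range() adapts an all-or-nothing clone, which returns 0 on success, to an interface that reports bytes copied. The adapter pattern can be shown stand-alone as follows. The callback type clone_fn, the stubs clone_ok and clone_fail, and the explicit length guard are illustrative assumptions; in the patch itself, oversized lengths appear to be rejected earlier by rw_verify_area() in the VFS helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical all-or-nothing primitive: 0 on success, -errno on failure. */
typedef int (*clone_fn)(uint64_t src_off, uint64_t len, uint64_t dst_off);

/* Stub that always succeeds, standing in for a real clone. */
static int clone_ok(uint64_t src_off, uint64_t len, uint64_t dst_off)
{
	(void)src_off; (void)len; (void)dst_off;
	return 0;
}

/* Stub that always fails, as a cross-filesystem clone might. */
static int clone_fail(uint64_t src_off, uint64_t len, uint64_t dst_off)
{
	(void)src_off; (void)len; (void)dst_off;
	return -EXDEV;
}

/* Report len on success; pass a negative error through unchanged. */
static ssize_t clone_as_copy(clone_fn clone, uint64_t src_off,
			     uint64_t dst_off, size_t len)
{
	int ret;

	assert(clone != NULL);
	/* SIZE_MAX / 2 equals SSIZE_MAX on common ABIs. */
	if (len > SIZE_MAX / 2)
		return -EINVAL;	/* len would not fit the ssize_t return */
	ret = clone(src_off, len, dst_off);
	return ret == 0 ? (ssize_t)len : ret;
}
```

Because the underlying operation is all-or-nothing, this adapter never reports partial progress: the result is either the full length or an error.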



* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-10 22:36     ` Trond Myklebust
  0 siblings, 0 replies; 33+ messages in thread
From: Trond Myklebust @ 2015-04-10 22:36 UTC (permalink / raw)
  To: Zach Brown
  Cc: Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

Hi Zach,

On Fri, Apr 10, 2015 at 6:00 PM, Zach Brown <zab@redhat.com> wrote:
> Add a copy_file_range() system call for offloading copies between
> regular files.
>
> This gives an interface to underlying layers of the storage stack which
> can copy without reading and writing all the data.  There are a few
> candidates that should support copy offloading in the nearer term:
>
> - btrfs shares extent references with its clone ioctl
> - NFS has patches to add a COPY command which copies on the server
> - SCSI has a family of XCOPY commands which copy in the device
>
> This system call avoids the complexity of also accelerating the creation
> of the destination file by operating on an existing destination file
> descriptor, not a path.
>
> Currently the high level vfs entry point limits copy offloading to files
> on the same mount and super (and not in the same file).  This can be
> relaxed if we get implementations which can copy between file systems
> safely.
>
> Signed-off-by: Zach Brown <zab@redhat.com>
> ---
>  fs/read_write.c                   | 129 ++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h                |   3 +
>  include/uapi/asm-generic/unistd.h |   4 +-
>  kernel/sys_ni.c                   |   1 +
>  4 files changed, 136 insertions(+), 1 deletion(-)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 8e1b687..c65ce1d 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -17,6 +17,7 @@
>  #include <linux/pagemap.h>
>  #include <linux/splice.h>
>  #include <linux/compat.h>
> +#include <linux/mount.h>
>  #include "internal.h"
>
>  #include <asm/uaccess.h>
> @@ -1424,3 +1425,131 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
>         return do_sendfile(out_fd, in_fd, NULL, count, 0);
>  }
>  #endif
> +
> +/*
> + * copy_file_range() differs from regular file read and write in that it
> + * specifically allows return partial success.  When it does so is up to
> + * the copy_file_range method.
> + */
> +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> +                           struct file *file_out, loff_t pos_out,
> +                           size_t len, int flags)

I'm going to repeat a gripe with this interface. I really don't think
we should treat copy_file_range() as taking a size_t length, since
that is not sufficient to do a full file copy on 32-bit systems w/ LFS
support.

Could we perhaps, instead of a length, define a 'pos_in_start' and a
'pos_in_end' offset (with the latter being -1 for a full-file copy)
and then return an 'loff_t' value stating where the copy ended?

Note that both btrfs and NFSv4.2 allow for 64-bit lengths, so this
interface would be closer to what is already in use anyway.

Cheers
  Trond
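
Editorial sketch, to put a number on the 32-bit concern: if the byte count comes back as a 32-bit ssize_t, one call can report at most 2^31 - 1 bytes, so a large file needs several calls. The arithmetic is shown below; calls_needed and MAX_REPORT_32 are names made up for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest byte count a 32-bit signed return value can report. */
#define MAX_REPORT_32 ((uint64_t)INT32_MAX)

/* Minimum number of calls to cover file_size bytes at max_per_call each. */
static uint64_t calls_needed(uint64_t file_size, uint64_t max_per_call)
{
	assert(max_per_call > 0);
	/* Divide first, then round up, so the sum cannot overflow. */
	return file_size / max_per_call + (file_size % max_per_call != 0);
}
```

Under these assumptions, an 8 GiB file needs calls_needed(8ULL << 30, MAX_REPORT_32) = 5 calls, where a 64-bit length would need one.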

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-10 22:00 ` [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper Zach Brown
  2015-04-10 22:36     ` Trond Myklebust
@ 2015-04-10 23:01   ` Andreas Dilger
  1 sibling, 0 replies; 33+ messages in thread
From: Andreas Dilger @ 2015-04-10 23:01 UTC (permalink / raw)
  To: Zach Brown
  Cc: linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs, linux-scsi

On Apr 10, 2015, at 4:00 PM, Zach Brown <zab@redhat.com> wrote:
> 
> Add a copy_file_range() system call for offloading copies between
> regular files.
> 
> This gives an interface to underlying layers of the storage stack which
> can copy without reading and writing all the data.  There are a few
> candidates that should support copy offloading in the nearer term:
> 
> - btrfs shares extent references with its clone ioctl
> - NFS has patches to add a COPY command which copies on the server
> - SCSI has a family of XCOPY commands which copy in the device
> 
> This system call avoids the complexity of also accelerating the creation
> of the destination file by operating on an existing destination file
> descriptor, not a path.
> 
> Currently the high level vfs entry point limits copy offloading to files
> on the same mount and super (and not in the same file).  This can be
> relaxed if we get implementations which can copy between file systems
> safely.
> 
> Signed-off-by: Zach Brown <zab@redhat.com>
> ---
> fs/read_write.c                   | 129 ++++++++++++++++++++++++++++++++++++++
> include/linux/fs.h                |   3 +
> include/uapi/asm-generic/unistd.h |   4 +-
> kernel/sys_ni.c                   |   1 +
> 4 files changed, 136 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 8e1b687..c65ce1d 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -17,6 +17,7 @@
> #include <linux/pagemap.h>
> #include <linux/splice.h>
> #include <linux/compat.h>
> +#include <linux/mount.h>
> #include "internal.h"
> 
> #include <asm/uaccess.h>
> @@ -1424,3 +1425,131 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
> 	return do_sendfile(out_fd, in_fd, NULL, count, 0);
> }
> #endif
> +
> +/*
> + * copy_file_range() differs from regular file read and write in that it
> + * specifically allows return partial success.  When it does so is up to
> + * the copy_file_range method.
> + */
> +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> +			    struct file *file_out, loff_t pos_out,
> +			    size_t len, int flags)

Minor nit - flags should be unsigned int to match the syscall.

> +{
> +	struct inode *inode_in;
> +	struct inode *inode_out;
> +	ssize_t ret;
> +
> +	if (flags)
> +		return -EINVAL;
> +
> +	if (len == 0)
> +		return 0;
> +
> +	/* copy_file_range allows full ssize_t len, ignoring MAX_RW_COUNT  */

This says "ssize_t", but the len parameter is "size_t"...

> +	ret = rw_verify_area(READ, file_in, &pos_in, len);
> +	if (ret >= 0)
> +		ret = rw_verify_area(WRITE, file_out, &pos_out, len);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (!(file_in->f_mode & FMODE_READ) ||
> +	    !(file_out->f_mode & FMODE_WRITE) ||
> +	    (file_out->f_flags & O_APPEND) ||
> +	    !file_in->f_op || !file_in->f_op->copy_file_range)
> +		return -EINVAL;
> +
> +	inode_in = file_inode(file_in);
> +	inode_out = file_inode(file_out);
> +
> +	/* make sure offsets don't wrap and the input is inside i_size */
> +	if (pos_in + len < pos_in || pos_out + len < pos_out ||
> +	    pos_in + len > i_size_read(inode_in))
> +		return -EINVAL;
> +
> +	/* this could be relaxed once a method supports cross-fs copies */
> +	if (inode_in->i_sb != inode_out->i_sb ||
> +	    file_in->f_path.mnt != file_out->f_path.mnt)
> +		return -EXDEV;
> +
> +	/* forbid ranges in the same file */
> +	if (inode_in == inode_out)
> +		return -EINVAL;
> +
> +	ret = mnt_want_write_file(file_out);
> +	if (ret)
> +		return ret;
> +
> +	ret = file_in->f_op->copy_file_range(file_in, pos_in, file_out, pos_out,
> +					     len, flags);
> +	if (ret > 0) {
> +		fsnotify_access(file_in);
> +		add_rchar(current, ret);
> +		fsnotify_modify(file_out);
> +		add_wchar(current, ret);
> +	}
> +	inc_syscr(current);
> +	inc_syscw(current);
> +
> +	mnt_drop_write_file(file_out);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vfs_copy_file_range);
> +
> +SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
> +		int, fd_out, loff_t __user *, off_out,
> +		size_t, len, unsigned int, flags)
> +{
> +	loff_t pos_in;
> +	loff_t pos_out;
> +	struct fd f_in;
> +	struct fd f_out;
> +	ssize_t ret;
> +
> +	f_in = fdget(fd_in);
> +	f_out = fdget(fd_out);
> +	if (!f_in.file || !f_out.file) {
> +		ret = -EBADF;
> +		goto out;
> +	}
> +
> +	ret = -EFAULT;
> +	if (off_in) {
> +		if (copy_from_user(&pos_in, off_in, sizeof(loff_t)))
> +			goto out;
> +	} else {
> +		pos_in = f_in.file->f_pos;
> +	}
> +
> +	if (off_out) {
> +		if (copy_from_user(&pos_out, off_out, sizeof(loff_t)))
> +			goto out;
> +	} else {
> +		pos_out = f_out.file->f_pos;
> +	}
> +
> +	ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len,
> +				  flags);
> +	if (ret > 0) {
> +		pos_in += ret;
> +		pos_out += ret;
> +
> +		if (off_in) {
> +			if (copy_to_user(off_in, &pos_in, sizeof(loff_t)))
> +				ret = -EFAULT;
> +		} else {
> +			f_in.file->f_pos = pos_in;
> +		}
> +
> +		if (off_out) {
> +			if (copy_to_user(off_out, &pos_out, sizeof(loff_t)))
> +				ret = -EFAULT;
> +		} else {
> +			f_out.file->f_pos = pos_out;
> +		}
> +	}
> +out:
> +	fdput(f_in);
> +	fdput(f_out);
> +	return ret;
> +}
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f4131e8..43a66d45 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1570,6 +1570,7 @@ struct file_operations {
> #ifndef CONFIG_MMU
> 	unsigned (*mmap_capabilities)(struct file *);
> #endif
> +	ssize_t (*copy_file_range)(struct file *, loff_t, struct file *, loff_t, size_t, int);

This should also be unsigned int for the flags parameter.

> };
> 
> struct inode_operations {
> @@ -1623,6 +1624,8 @@ extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
> 		unsigned long, loff_t *);
> extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
> 		unsigned long, loff_t *);
> +extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
> +				   loff_t, size_t, int);
> 
> struct super_operations {
>    	struct inode *(*alloc_inode)(struct super_block *sb);
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index e016bd9..2b60f0c 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
> __SYSCALL(__NR_bpf, sys_bpf)
> #define __NR_execveat 281
> __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
> +#define __NR_copy_file_range 282
> +__SYSCALL(__NR_copy_file_range, sys_copy_file_range)
> 
> #undef __NR_syscalls
> -#define __NR_syscalls 282
> +#define __NR_syscalls 283
> 
> /*
>  * All syscalls below here should go away really,
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 5adcb0a..07f4585 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -159,6 +159,7 @@ cond_syscall(sys_uselib);
> cond_syscall(sys_fadvise64);
> cond_syscall(sys_fadvise64_64);
> cond_syscall(sys_madvise);
> +cond_syscall(sys_copy_file_range);
> 
> /* arch-specific weak syscall entries */
> cond_syscall(sys_pciconfig_read);
> -- 
> 2.1.0
> 


Cheers, Andreas

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-10 22:36     ` Trond Myklebust
  (?)
@ 2015-04-11  0:02     ` Zach Brown
  2015-04-11  0:24       ` Trond Myklebust
  -1 siblings, 1 reply; 33+ messages in thread
From: Zach Brown @ 2015-04-11  0:02 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Fri, Apr 10, 2015 at 06:36:41PM -0400, Trond Myklebust wrote:
> On Fri, Apr 10, 2015 at 6:00 PM, Zach Brown <zab@redhat.com> wrote:

> > +
> > +/*
> > + * copy_file_range() differs from regular file read and write in that it
> > + * specifically allows return partial success.  When it does so is up to
> > + * the copy_file_range method.
> > + */
> > +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> > +                           struct file *file_out, loff_t pos_out,
> > +                           size_t len, int flags)
> 
> I'm going to repeat a gripe with this interface. I really don't think
> we should treat copy_file_range() as taking a size_t length, since
> that is not sufficient to do a full file copy on 32-bit systems w/ LFS
> support.

*nod*.  The length type is limited by the syscall return type and the
arbitrary desire to mimic read/write.

I sympathize with wanting to copy giant files with operations that don't
scale with file size because files can be enormous but sparse.

> Could we perhaps instead of a length, define a 'pos_in_start' and a
> 'pos_in_end' offset (with the latter being -1 for a full-file copy)
> and then return an 'loff_t' value stating where the copy ended?

Well, the resulting offset will be set if the caller provided it.  So
they could already be getting the copied length from that.  But they
might not specify the offsets.  Maybe they're just using the results to
total up a completion indicator.

Maybe we could make the length a pointer like the offsets that's set to
the copied length on return.

This all seems pretty gross.  Does anyone else have a vote?

(And I'll argue strongly against creating magical offset values that
change behaviour.  If we want to ignore arguments and get the length
from the source file we'd add a flag to do so.)

> Note that both btrfs and NFSv4.2 allow for 64-bit lengths, so this
> interface would be closer to what is already in use anyway.

Yeah, btrfs doesn't allow partial progress.  It returns 0 on success.
We could also do that but people have expressed an interest in returning
partial progress.

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-11  0:02     ` Zach Brown
@ 2015-04-11  0:24       ` Trond Myklebust
  2015-04-11 13:04         ` Jeff Layton
  0 siblings, 1 reply; 33+ messages in thread
From: Trond Myklebust @ 2015-04-11  0:24 UTC (permalink / raw)
  To: Zach Brown
  Cc: Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Fri, Apr 10, 2015 at 8:02 PM, Zach Brown <zab@redhat.com> wrote:
> On Fri, Apr 10, 2015 at 06:36:41PM -0400, Trond Myklebust wrote:
>> On Fri, Apr 10, 2015 at 6:00 PM, Zach Brown <zab@redhat.com> wrote:
>
>> > +
>> > +/*
>> > + * copy_file_range() differs from regular file read and write in that it
>> > + * specifically allows return partial success.  When it does so is up to
>> > + * the copy_file_range method.
>> > + */
>> > +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>> > +                           struct file *file_out, loff_t pos_out,
>> > +                           size_t len, int flags)
>>
>> I'm going to repeat a gripe with this interface. I really don't think
>> we should treat copy_file_range() as taking a size_t length, since
>> that is not sufficient to do a full file copy on 32-bit systems w/ LFS
>> support.
>
> *nod*.  The length type is limited by the syscall return type and the
> arbitrary desire to mimic read/write.
>
> I sympathize with wanting to copy giant files with operations that don't
> scale with file size because files can be enormous but sparse.

The other argument against using a size_t is that there is no memory
buffer involved here. size_t is, after all, a type describing
in-memory objects, not files.

>> Could we perhaps instead of a length, define a 'pos_in_start' and a
>> 'pos_in_end' offset (with the latter being -1 for a full-file copy)
>> and then return an 'loff_t' value stating where the copy ended?
>
> Well, the resulting offset will be set if the caller provided it.  So
> they could already be getting the copied length from that.  But they
> might not specify the offsets.  Maybe they're just using the results to
> total up a completion indicator.
>
> Maybe we could make the length a pointer like the offsets that's set to
> the copied length on return.

That works, but why do we care so much about the difference between a
length and an offset as a return value?

To be fair, the NFS copy offload also allows the copy to proceed out
of order, in which case the range of copied data could be
non-contiguous in the case of a failure. However neither the length
nor the offset case will give you the full story in that case. Any
return value can at best be considered to define an offset range whose
contents need to be checked for success/failure.

> This all seems pretty gross.  Does anyone else have a vote?
>
> (And I'll argue strongly against creating magical offset values that
> change behaviour.  If we want to ignore arguments and get the length
> from the source file we'd add a flag to do so.)

The '-1' was not intended to be a special/magical value: as far as I'm
concerned any end offset that covers the full range of supported file
lengths would be OK.

>> Note that both btrfs and NFSv4.2 allow for 64-bit lengths, so this
>> interface would be closer to what is already in use anyway.
>
> Yeah, btrfs doesn't allow partial progress.  It returns 0 on success.
> We could also do that but people have expressed an interest in returning
> partial progress.

Returning an end offset would satisfy the partial progress requirement
(with the caveat mentioned above).

Cheers
  Trond

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-11  0:24       ` Trond Myklebust
@ 2015-04-11 13:04         ` Jeff Layton
  2015-04-13 16:32           ` Zach Brown
  2015-04-14 16:53           ` Christoph Hellwig
  0 siblings, 2 replies; 33+ messages in thread
From: Jeff Layton @ 2015-04-11 13:04 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Zach Brown, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

On Fri, 10 Apr 2015 20:24:06 -0400
Trond Myklebust <trond.myklebust@primarydata.com> wrote:

> On Fri, Apr 10, 2015 at 8:02 PM, Zach Brown <zab@redhat.com> wrote:
> > On Fri, Apr 10, 2015 at 06:36:41PM -0400, Trond Myklebust wrote:
> >> On Fri, Apr 10, 2015 at 6:00 PM, Zach Brown <zab@redhat.com> wrote:
> >
> >> > +
> >> > +/*
> >> > + * copy_file_range() differs from regular file read and write in that it
> >> > + * specifically allows return partial success.  When it does so is up to
> >> > + * the copy_file_range method.
> >> > + */
> >> > +ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> >> > +                           struct file *file_out, loff_t pos_out,
> >> > +                           size_t len, int flags)
> >>
> >> I'm going to repeat a gripe with this interface. I really don't think
> >> we should treat copy_file_range() as taking a size_t length, since
> >> that is not sufficient to do a full file copy on 32-bit systems w/ LFS
> >> support.
> >
> > *nod*.  The length type is limited by the syscall return type and the
> > arbitrary desire to mimic read/write.
> >
> > I sympathize with wanting to copy giant files with operations that don't
> > scale with file size because files can be enormous but sparse.
> 
> The other argument against using a size_t is that there is no memory
> buffer involved here. size_t is, after all, a type describing
> in-memory objects, not files.
> 
> >> Could we perhaps instead of a length, define a 'pos_in_start' and a
> >> 'pos_in_end' offset (with the latter being -1 for a full-file copy)
> >> and then return an 'loff_t' value stating where the copy ended?
> >
> > Well, the resulting offset will be set if the caller provided it.  So
> > they could already be getting the copied length from that.  But they
> > might not specify the offsets.  Maybe they're just using the results to
> > total up a completion indicator.
> >
> > Maybe we could make the length a pointer like the offsets that's set to
> > the copied length on return.
> 
> That works, but why do we care so much about the difference between a
> length and an offset as a return value?
> 

I think it just comes down to potential confusion for users. What's
more useful, the number of bytes actually copied, or the offset into the
file where the copy ended?

I tend to think an offset is more useful for someone trying to
copy a file in chunks, particularly if the file is sparse. That gives
them a clear place to continue the copy.

So, I think I agree with Trond that phrasing this interface in terms of
file offsets seems like it might be more useful. That also neatly
sidesteps the size_t limitations on 32-bit platforms.
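
A minimal sketch of the resume loop this would allow, assuming the
offset-based interface Trond proposed.  sim_copy_range() is a hypothetical
in-memory stand-in that caps each call at max_step bytes to force partial
progress; it is not a real kernel or libc call, and only the shape of the
caller loop is the point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an offset-based copy call: copies src[start, end)
 * to the same offsets in dst, at most max_step bytes per call, and returns
 * the offset where the copy stopped.
 */
static int64_t sim_copy_range(const char *src, char *dst,
			      int64_t start, int64_t end, int64_t max_step)
{
	int64_t n = end - start;

	assert(start >= 0 && max_step >= 0);
	if (n <= 0 || max_step == 0)
		return start;
	if (n > max_step)
		n = max_step;
	memcpy(dst + start, src + start, (size_t)n);
	return start + n;
}

/*
 * Caller loop: the returned offset is directly the place to resume from.
 * Returns the final offset reached, which equals end on full success.
 */
static int64_t copy_all(const char *src, char *dst,
			int64_t start, int64_t end, int64_t max_step)
{
	int64_t pos = start;

	while (pos < end) {
		int64_t next = sim_copy_range(src, dst, pos, end, max_step);

		if (next <= pos)	/* no progress: stop, caller falls back */
			break;
		pos = next;
	}
	return pos;
}
```

No length arithmetic is needed in the loop, and every value involved is a
64-bit offset regardless of the width of size_t.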

> To be fair, the NFS copy offload also allows the copy to proceed out
> of order, in which case the range of copied data could be
> non-contiguous in the case of a failure. However neither the length
> nor the offset case will give you the full story in that case. Any
> return value can at best be considered to define an offset range whose
> contents need to be checked for success/failure.
> 

Yuck! How the heck do you clean up the mess if that happens? I guess
you're just stuck redoing the copy with normal READ/WRITE?

Maybe we need to have the interface return a hard error in that
case and not try to give back any sort of offset?

> > This all seems pretty gross.  Does anyone else have a vote?
> >
> > (And I'll argue strongly against creating magical offset values that
> > change behaviour.  If we want to ignore arguments and get the length
> > from the source file we'd add a flag to do so.)
> 
> The '-1' was not intended to be a special/magical value: as far as I'm
> concerned any end offset that covers the full range of supported file
> lengths would be OK.
> 

Agreed. A "whole file" flag might be useful too, but I'd leave
that for after the initial implementation is merged, just in the
interest of having _something_ that works in the near term.

> >> Note that both btrfs and NFSv4.2 allow for 64-bit lengths, so this
> >> interface would be closer to what is already in use anyway.
> >
> > Yeah, btrfs doesn't allow partial progress.  It returns 0 on success.
> > We could also do that but people have expressed an interest in returning
> > partial progress.
> 
> Returning an end offset would satisfy the partial progress requirement
> (with the caveat mentioned above).
> 

-- 
Jeff Layton <jlayton@poochiereds.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-11 13:04         ` Jeff Layton
@ 2015-04-13 16:32           ` Zach Brown
  2015-04-14 16:53           ` Christoph Hellwig
  1 sibling, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-13 16:32 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Trond Myklebust, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

> > >> Could we perhaps instead of a length, define a 'pos_in_start' and a
> > >> 'pos_in_end' offset (with the latter being -1 for a full-file copy)
> > >> and then return an 'loff_t' value stating where the copy ended?
> > >
> > > Well, the resulting offset will be set if the caller provided it.  So
> > > they could already be getting the copied length from that.  But they
> > > might not specify the offsets.  Maybe they're just using the results to
> > > total up a completion indicator.
> > >
> > > Maybe we could make the length a pointer like the offsets that's set to
> > > the copied length on return.
> > 
> > That works, but why do we care so much about the difference between a
> > length and an offset as a return value?
> > 
> 
> I think it just comes down to potential confusion for users. What's
> more useful, the number of bytes actually copied, or the offset into the
> file where the copy ended?
> 
> I tend to think an offset is more useful for someone trying to
> copy a file in chunks, particularly if the file is sparse. That gives
> them a clear place to continue the copy.
> 
> So, I think I agree with Trond that phrasing this interface in terms of
> file offsets seems like it might be more useful. That also neatly
> sidesteps the size_t limitations on 32-bit platforms.

Yeah, fair enough.  I'll rework it.

> > To be fair, the NFS copy offload also allows the copy to proceed out
> > of order, in which case the range of copied data could be
> > non-contiguous in the case of a failure. However neither the length
> > nor the offset case will give you the full story in that case. Any
> > return value can at best be considered to define an offset range whose
> > contents need to be checked for success/failure.
> > 
> 
> Yuck! How the heck do you clean up the mess if that happens? I guess
> you're just stuck redoing the copy with normal READ/WRITE?

I don't think anyone will worry about checking file contents.

Yes, technically you can get fragmented completion past the initial
contiguous region that the interface told you is done.   You can get
that with O_DIRECT today.

But it's a rare case that is not worth worrying about.  You'll retry at
the contiguous offset until it doesn't make progress and then fall back
to read/write.
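
Roughly, that caller-side policy could look like the sketch below.  It is
only an illustration under stated assumptions: offload_fn is a hypothetical
hook standing in for whatever offload call ends up being merged, and
fake_offload() and demo() exist purely to exercise both the retry path and
the read/write fallback.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hook: bytes copied, 0 for no progress, -1 on error. */
typedef ssize_t (*offload_fn)(int fd_in, off_t pos_in,
			      int fd_out, off_t pos_out, size_t len);

/* Plain pread/pwrite fallback.  Returns bytes copied, or -1 on error. */
static ssize_t rw_copy(int fd_in, off_t pos_in, int fd_out, off_t pos_out,
		       size_t len)
{
	char buf[4096];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t r = pread(fd_in, buf, want, pos_in + (off_t)done);
		ssize_t w;

		if (r < 0)
			return -1;
		if (r == 0)	/* EOF on the source */
			break;
		for (ssize_t off = 0; off < r; off += w) {
			w = pwrite(fd_out, buf + off, (size_t)(r - off),
				   pos_out + (off_t)done + off);
			if (w <= 0)
				return -1;
		}
		done += (size_t)r;
	}
	return (ssize_t)done;
}

/* Retry the offload at the contiguous offset, then fall back to rw_copy(). */
static ssize_t copy_with_fallback(offload_fn offload, int fd_in, off_t pos_in,
				  int fd_out, off_t pos_out, size_t len)
{
	size_t done = 0;
	ssize_t r;

	assert(offload != NULL);
	while (done < len) {
		ssize_t n = offload(fd_in, pos_in + (off_t)done,
				    fd_out, pos_out + (off_t)done, len - done);
		if (n <= 0)	/* error or stalled: stop offloading */
			break;
		done += (size_t)n;
	}
	if (done == len)
		return (ssize_t)done;
	r = rw_copy(fd_in, pos_in + (off_t)done, fd_out,
		    pos_out + (off_t)done, len - done);
	return r < 0 ? -1 : (ssize_t)(done + (size_t)r);
}

/* Demo-only fake: moves at most 4 bytes per call and stalls at offset 8. */
static ssize_t fake_offload(int fd_in, off_t pos_in, int fd_out,
			    off_t pos_out, size_t len)
{
	char buf[4];
	ssize_t r;

	if (pos_in >= 8)
		return 0;
	if (len > sizeof(buf))
		len = sizeof(buf);
	r = pread(fd_in, buf, len, pos_in);
	return r <= 0 ? r : pwrite(fd_out, buf, (size_t)r, pos_out);
}

/* Usage example: returns 0 if 16 bytes survive the offload + fallback mix. */
static int demo(void)
{
	const char data[] = "0123456789abcdef";
	char back[16];
	FILE *in = tmpfile(), *out = tmpfile();
	int ok = 0;

	if (in && out && pwrite(fileno(in), data, 16, 0) == 16)
		ok = copy_with_fallback(fake_offload, fileno(in), 0,
					fileno(out), 0, 16) == 16 &&
		     pread(fileno(out), back, 16, 0) == 16 &&
		     memcmp(data, back, 16) == 0;
	if (in)
		fclose(in);
	if (out)
		fclose(out);
	return ok ? 0 : -1;
}
```

The fallback starts at the last contiguous offset, so it never depends on
any fragmented completion beyond that point.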

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-11 13:04         ` Jeff Layton
  2015-04-13 16:32           ` Zach Brown
@ 2015-04-14 16:53           ` Christoph Hellwig
  2015-04-14 16:58             ` Christoph Hellwig
  2015-04-14 17:16               ` Anna Schumaker
  1 sibling, 2 replies; 33+ messages in thread
From: Christoph Hellwig @ 2015-04-14 16:53 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Trond Myklebust, Zach Brown, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> Yuck! How the heck do you clean up the mess if that happens? I guess
> you're just stuck redoing the copy with normal READ/WRITE?
> 
> Maybe we need to have the interface return a hard error in that
> case and not try to give back any sort of offset?

The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
expect us to simply ignore it and only implement my new CLONE operation
with sane semantics.  That is unless someone can show some real life
use case for the inter server copy, in which case we'll have to deal
with that mess.  But getting that one right at the VFS level will
be a nightmare anyway.

Make this a vote from me to not support partial copies and just return
an error in that case.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
  2015-04-14 16:53           ` Christoph Hellwig
@ 2015-04-14 16:58             ` Christoph Hellwig
  2015-04-14 17:16               ` Anna Schumaker
  1 sibling, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2015-04-14 16:58 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Trond Myklebust, Zach Brown, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

On Tue, Apr 14, 2015 at 09:53:44AM -0700, Christoph Hellwig wrote:
> On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> > Yuck! How the heck do you clean up the mess if that happens? I guess
> > you're just stuck redoing the copy with normal READ/WRITE?
> > 
> > Maybe we need to have the interface return a hard error in that
> > case and not try to give back any sort of offset?
> 
> The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> expect us to simply ignore it and only implement my new CLONE operation
> with sane semantics.  That is unless someone can show some real life
> use case for the inter server copy, in which case we'll have to deal
> with that mess.  But getting that one right at the VFS level will
> be a nightmare anyway.

Btw, in case someone cares about the NFS CLONE implementation here is
my prototype based on Anna's older COPY prototype.  It's simple enough that
it might be worth adding to the copy_file_range patch set.

http://git.infradead.org/users/hch/pnfs.git/shortlog/refs/heads/clone

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 3/3] btrfs: add .copy_file_range file operation
  2015-04-10 22:00 ` [PATCH RFC 3/3] btrfs: add .copy_file_range file operation Zach Brown
@ 2015-04-14 17:08     ` Chris Mason
  0 siblings, 0 replies; 33+ messages in thread
From: Chris Mason @ 2015-04-14 17:08 UTC (permalink / raw)
  To: Zach Brown, linux-kernel, linux-fsdevel, linux-btrfs, linux-nfs,
	linux-scsi

On 04/10/2015 06:00 PM, Zach Brown wrote:
> This rearranges the existing COPY_RANGE ioctl implementation so that the
> .copy_file_range file operation can call the core loop that copies file
> data extent items.
> 
> The extent copying loop is lifted up into its own function.  It retains
> the core btrfs error checks that should be shared.
> 

Thanks Zach, the btrfs bits look reasonable

Signed-off-by: Chris Mason <clm@fb.com>

-chris


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 17:16               ` Anna Schumaker
  0 siblings, 0 replies; 33+ messages in thread
From: Anna Schumaker @ 2015-04-14 17:16 UTC (permalink / raw)
  To: Christoph Hellwig, Jeff Layton
  Cc: Trond Myklebust, Zach Brown, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
>> Yuck! How the heck do you clean up the mess if that happens? I guess
>> you're just stuck redoing the copy with normal READ/WRITE?
>>
>> Maybe we need to have the interface return a hard error in that
>> case and not try to give back any sort of offset?
> 
> The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> expect us to simply ignore it and only implement my new CLONE operation
> with sane semantics.  That is unless someone can show some real life
> use case for the inter server copy, in which case we'll have to deal
> with that mess.  But getting that one right at the VFS level will
> be a nightmare anyway.
> 
> Make this a vote from me to not support partial copies and just return
> an error in that case.

Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
ca_synchronous flags that let the client state whether the copy should be
done consecutively or synchronously.  I expected to always set
consecutive to "true" for the Linux client.

Anna

> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 18:19                 ` J. Bruce Fields
  0 siblings, 0 replies; 33+ messages in thread
From: J. Bruce Fields @ 2015-04-14 18:19 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Christoph Hellwig, Jeff Layton, Trond Myklebust, Zach Brown,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> >> Yuck! How the heck do you clean up the mess if that happens? I
> >> guess you're just stuck redoing the copy with normal READ/WRITE?
> >>
> >> Maybe we need to have the interface return a hard error in that
> >> case and not try to give back any sort of offset?
> > 
> > The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> > expect us to simply ignore it and only implement my new CLONE
> > operation with sane semantics.  That is unless someone can show some
> > real life use case for the inter server copy, in which case we'll
> > have to deal with that mess.  But getting that one right at the VFS
> > level will be a nightmare anyway.
> > 
> > Make this a vote from me to not support partial copies and just
> > return an error in that case.
> 
> Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
> ca_synchronous flags that let the client state if the copy should be
> done consecutively or synchronously.  I expected to always set
> consecutive to "true" for the Linux client.

That's supposed to mean results are well-defined in the partial-copy
case, but I think Christoph's suggesting eliminating the partial-copy
case entirely?

Which would be fine with me.

It might actually have been me advocating for partial copies.  But that
was only because a partial-copy-handling-loop seemed simpler to me than
progress callbacks if we were going to support long-running copies.

I'm happy enough not to have it at all.

--b.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 18:22                   ` Zach Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-14 18:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Anna Schumaker, Christoph Hellwig, Jeff Layton, Trond Myklebust,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
> On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> > On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> > >> Yuck! How the heck do you clean up the mess if that happens? I
> > >> guess you're just stuck redoing the copy with normal READ/WRITE?
> > >>
> > >> Maybe we need to have the interface return a hard error in that
> > >> case and not try to give back any sort of offset?
> > > 
> > > The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> > > expect us to simply ignore it and only implement my new CLONE
> > > operation with sane semantics.  That is unless someone can show some
> > > real life use case for the inter server copy, in which case we'll
> > > have to deal with that mess.  But getting that one right at the VFS
> > > level will be a nightmare anyway.
> > > 
> > > Make this a vote from me to not support partial copies and just
> > > return an error in that case.
> > 
> > Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
> > ca_synchronous flags that let the client state if the copy should be
> > done consecutively or synchronously.  I expected to always set
> > consecutive to "true" for the Linux client.
> 
> That's supposed to mean results are well-defined in the partial-copy
> case, but I think Christoph's suggesting eliminating the partial-copy
> case entirely?
> 
> Which would be fine with me.
> 
> It might actually have been me advocating for partial copies.  But that
> was only because a partial-copy-handling-loop seemed simpler to me than
> progress callbacks if we were going to support long-running copies.
> 
> I'm happy enough not to have it at all.

Ah, OK, that's great news.

I thought at one point we were worried about very long running RPCs on
the server.  Are we not worried about that now?

Is the client expected to cut the work up into arbitrarily manageable
chunks?  Is the server expected to fail COPY/CLONE requests that it
thinks would take way too long?  Something else?

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 18:22                   ` Zach Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-14 18:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Anna Schumaker, Christoph Hellwig, Jeff Layton, Trond Myklebust,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
> On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> > On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> > >> Yuck! How the heck do you clean up the mess if that happens? I
> > >> guess you're just stuck redoing the copy with normal READ/WRITE?
> > >>
> > >> Maybe we need to have the interface return a hard error in that
> > >> case and not try to give back any sort of offset?
> > > 
> > > The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> > > expect us to simply ignore it and only implement my new CLONE
> > > operation with sane semantics.  That is unless someone can show some
> > > real life use case for the inter server copy, in which case we'll
> > > have to deal with that mess.  But getting that one right at the VFS
> > > level will be a nightmare anyway.
> > > 
> > > Make this a vote from me to not support partial copies and just
> > > return and error in that case.
> > 
> > Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and a
> > ca_synchronous flags that let the client state if the copy should be
> > done consecutively or synchronously.  I expected to always set
> > consecutive to "true" for the Linux client.
> 
> That's supposed to mean results are well-defined in the partial-copy
> case, but I think Christoph's suggesting eliminating the partial-copy
> case entirely?
> 
> Which would be fine with me.
> 
> It might actually have been me advocating for partial copies.  But that
> was only because a partial-copy-handling-loop seemed simpler to me than
> progress callbacks if we were going to support long-running copies.
> 
> I'm happy enough not to have it at all.

Ah, OK, that's great news.

I thought at one point we were worried about very long running RPCs on
the server.  Are we not worried about that now?

Is the client expected to cut the work up into arbitrarily manageable
chunks?  Is the server expected to fail COPY/CLONE requests that it
thinks would take way too long?  Something else?
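
One way to picture the client-side chunking asked about here is a
userspace loop that issues bounded requests and resumes after short
copies.  The sketch below is only an illustration: it assumes the
copy_file_range() interface as it was later merged and exposed by glibc
(Linux 4.5, glibc 2.27), which may differ from the RFC in this series,
and copy_range_chunked() and its parameters are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of this series: copy `len` bytes from
 * fd_in at off_in to fd_out at off_out, issuing requests of at most
 * `chunk` bytes and resuming after short copies.
 *
 * Returns the number of bytes copied (short only if the source ends
 * early).  On error it returns -1 with errno set; in both cases the
 * progress made so far is stored in *copied.
 */
static ssize_t copy_range_chunked(int fd_in, off64_t off_in,
                                  int fd_out, off64_t off_out,
                                  size_t len, size_t chunk, size_t *copied)
{
    size_t done = 0;

    assert(chunk > 0);
    while (done < len) {
        size_t want = len - done < chunk ? len - done : chunk;
        /* Explicit offsets, so the file positions are left untouched. */
        off64_t in = off_in + (off64_t)done;
        off64_t out = off_out + (off64_t)done;
        ssize_t n = copy_file_range(fd_in, &in, fd_out, &out, want, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same chunk */
            *copied = done;     /* report partial progress to the caller */
            return -1;
        }
        if (n == 0)             /* reached the end of the source file */
            break;
        done += (size_t)n;      /* short copy: loop resumes after it */
    }
    *copied = done;
    return (ssize_t)done;
}
```

A caller would pick `chunk` to bound how long any single request can
run.  When the call fails, *copied says where a read()/write() fallback
would have to resume.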

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 18:29                     ` J. Bruce Fields
  0 siblings, 0 replies; 33+ messages in thread
From: J. Bruce Fields @ 2015-04-14 18:29 UTC (permalink / raw)
  To: Zach Brown
  Cc: Anna Schumaker, Christoph Hellwig, Jeff Layton, Trond Myklebust,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Tue, Apr 14, 2015 at 11:22:41AM -0700, Zach Brown wrote:
> On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
> > On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> > > On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > > > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> > > >> Yuck! How the heck do you clean up the mess if that happens? I
> > > >> guess you're just stuck redoing the copy with normal READ/WRITE?
> > > >>
> > > >> Maybe we need to have the interface return a hard error in that
> > > >> case and not try to give back any sort of offset?
> > > > 
> > > > The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> > > > expect us to simply ignore it and only implement my new CLONE
> > > > operation with sane semantics.  That is unless someone can show some
> > > > real life use case for the inter server copy, in which case we'll
> > > > have to deal with that mess.  But getting that one right at the VFS
> > > > level will be a nightmare anyway.
> > > > 
> > > > Make this a vote from me to not support partial copies and just
> > > > return an error in that case.
> > > 
> > > Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
> > > ca_synchronous flags that let the client state if the copy should be
> > > done consecutively or synchronously.  I expected to always set
> > > consecutive to "true" for the Linux client.
> > 
> > That's supposed to mean results are well-defined in the partial-copy
> > case, but I think Christoph's suggesting eliminating the partial-copy
> > case entirely?
> > 
> > Which would be fine with me.
> > 
> > It might actually have been me advocating for partial copies.  But that
> > was only because a partial-copy-handling-loop seemed simpler to me than
> > progress callbacks if we were going to support long-running copies.
> > 
> > I'm happy enough not to have it at all.
> 
> Ah, OK, that's great news.
> 
> I thought at one point we were worried about very long running RPCs on
> the server.  Are we not worried about that now?
> 
> Is the client expected to cut the work up into arbitrarily manageable
> chunks?  Is the server expected to fail COPY/CLONE requests that it
> thinks would take way too long?  Something else?

Christoph is proposing a CLONE rpc that's required to be atomic:

	https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-35#section-15.13
	"The CLONE operation is atomic, that is either all changes or no
	changes are seen by the client or other clients."

So that couldn't be really long-running (or the server is nuts).

So that'd mean Anna would rip out the server-side copy loop and we'd
initially just support btrfs or whatever.

I mean the server-side copy loop may also be useful but I'm all for
wiring up the obvious case first.

--b.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 18:54                       ` Zach Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-14 18:54 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Anna Schumaker, Christoph Hellwig, Jeff Layton, Trond Myklebust,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Tue, Apr 14, 2015 at 02:29:06PM -0400, J. Bruce Fields wrote:
> On Tue, Apr 14, 2015 at 11:22:41AM -0700, Zach Brown wrote:
> > On Tue, Apr 14, 2015 at 02:19:11PM -0400, J. Bruce Fields wrote:
> > > On Tue, Apr 14, 2015 at 01:16:13PM -0400, Anna Schumaker wrote:
> > > > On 04/14/2015 12:53 PM, Christoph Hellwig wrote:
> > > > > On Sat, Apr 11, 2015 at 09:04:02AM -0400, Jeff Layton wrote:
> > > > >> Yuck! How the heck do you clean up the mess if that happens? I
> > > > >> guess you're just stuck redoing the copy with normal READ/WRITE?
> > > > >>
> > > > >> Maybe we need to have the interface return a hard error in that
> > > > >> case and not try to give back any sort of offset?
> > > > > 
> > > > > The NFSv4.2 COPY interface is a train wreck.  At least for Linux I'd
> > > > > expect us to simply ignore it and only implement my new CLONE
> > > > > operation with sane semantics.  That is unless someone can show some
> > > > > real life use case for the inter server copy, in which case we'll
> > > > > have to deal with that mess.  But getting that one right at the VFS
> > > > > level will be a nightmare anyway.
> > > > > 
> > > > > Make this a vote from me to not support partial copies and just
> > > > > return an error in that case.
> > > > 
> > > > Agreed.  Looking at the v4.2 spec, COPY does take ca_consecutive and
> > > > ca_synchronous flags that let the client state if the copy should be
> > > > done consecutively or synchronously.  I expected to always set
> > > > consecutive to "true" for the Linux client.
> > > 
> > > That's supposed to mean results are well-defined in the partial-copy
> > > case, but I think Christoph's suggesting eliminating the partial-copy
> > > case entirely?
> > > 
> > > Which would be fine with me.
> > > 
> > > It might actually have been me advocating for partial copies.  But that
> > > was only because a partial-copy-handling-loop seemed simpler to me than
> > > progress callbacks if we were going to support long-running copies.
> > > 
> > > I'm happy enough not to have it at all.
> > 
> > Ah, OK, that's great news.
> > 
> > I thought at one point we were worried about very long running RPCs on
> > the server.  Are we not worried about that now?
> > 
> > Is the client expected to cut the work up into arbitrarily manageable
> > chunks?  Is the server expected to fail COPY/CLONE requests that it
> > thinks would take way too long?  Something else?
> 
> Christoph is proposing a CLONE rpc that's required to be atomic:
> 
> 	https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-35#section-15.13
> 	"The CLONE operation is atomic, that is either all changes or no
> 	changes are seen by the client or other clients."
> 
> So that couldn't be really long-running (or the server is nuts).
> 
> So that'd mean Anna would rip out the server-side copy loop and we'd
> initially just support btrfs or whatever.

Is this relying on btrfs range cloning being atomic?  It certainly
doesn't look atomic.  It can modify items across an arbitrarily large
number of leaf blocks.  It can make the changes across multiple
transactions which could introduce partial modification on reboot after
crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
destination partially modified.
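
To make the concern concrete from the caller's side: a clone request
reports only success or failure, so a failed call tells the caller
nothing about how much of the destination changed.  The sketch below
assumes the generic FICLONERANGE ioctl from <linux/fs.h>, the later
filesystem-independent form of the btrfs clone-range ioctl.
clone_range() is a hypothetical wrapper name, and the filesystem must
support cloning (otherwise the call fails, e.g. with EOPNOTSUPP).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/*
 * Hypothetical wrapper: ask the filesystem to share `len` bytes of
 * extents from src_fd at src_off into dst_fd at dst_off.  Offsets and
 * length generally have to be aligned to the filesystem block size.
 *
 * Returns 0 if the whole range was cloned, -1 with errno set otherwise.
 * There is no byte count: after a -1 the caller cannot tell from the
 * return value whether the destination range is untouched or, as the
 * message above worries for btrfs, partially modified.
 */
static int clone_range(int src_fd, off_t src_off,
                       int dst_fd, off_t dst_off, size_t len)
{
    struct file_clone_range arg;

    memset(&arg, 0, sizeof(arg));
    arg.src_fd = src_fd;
    arg.src_offset = (__u64)src_off;
    arg.src_length = (__u64)len;
    arg.dest_offset = (__u64)dst_off;

    /* The ioctl is issued on the destination file descriptor. */
    return ioctl(dst_fd, FICLONERANGE, &arg);
}
```

A caller that needs a known-good destination after a failure would have
to rewrite or re-verify the whole range rather than trust any part of it.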

> I mean the server-side copy loop may also be useful but I'm all for
> wiring up the obvious case first.

Sure, I'm all for wiring up the simple version that doesn't return
partial progress.  If that'll work for you guys.

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 19:23                         ` Christoph Hellwig
  0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2015-04-14 19:23 UTC (permalink / raw)
  To: Zach Brown
  Cc: J. Bruce Fields, Anna Schumaker, Christoph Hellwig, Jeff Layton,
	Trond Myklebust, Linux Kernel Mailing List,
	Linux FS-devel Mailing List, linux-btrfs, Linux NFS Mailing List,
	linux-scsi

On Tue, Apr 14, 2015 at 11:54:08AM -0700, Zach Brown wrote:
> Is this relying on btrfs range cloning being atomic?  It certainly
> doesn't look atomic.  It can modify items across an arbitrarily large
> number of leaf blocks.  It can make the changes across multiple
> transactions which could introduce partial modification on reboot after
> crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
> destination partially modified.

I didn't mean atomic in the failure atomic sense, but in the sense of
being atomic vs other writes, similar to how POSIX specifies it for
writes vs other writes.  Guess I need to express this intent better.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper
@ 2015-04-14 20:04                           ` Zach Brown
  0 siblings, 0 replies; 33+ messages in thread
From: Zach Brown @ 2015-04-14 20:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Anna Schumaker, Jeff Layton, Trond Myklebust,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	linux-btrfs, Linux NFS Mailing List, linux-scsi

On Tue, Apr 14, 2015 at 12:23:25PM -0700, Christoph Hellwig wrote:
> On Tue, Apr 14, 2015 at 11:54:08AM -0700, Zach Brown wrote:
> > Is this relying on btrfs range cloning being atomic?  It certainly
> > doesn't look atomic.  It can modify items across an arbitrarily large
> > number of leaf blocks.  It can make the changes across multiple
> > transactions which could introduce partial modification on reboot after
> > crashes.  It can fail (the dynamic duo: enomem, eio) and leave the
> > destination partially modified.
> 
> I didn't mean atomic in the failure atomic sense, but in the sense of
> being atomic vs other writes, similar to how POSIX specifies it for
> writes vs other writes.  Guess I need to express this intent better.

Ah, right, OK.

- z

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 0/3] simple copy offloading system call
@ 2015-05-06  6:15   ` Michael Kerrisk
  0 siblings, 0 replies; 33+ messages in thread
From: Michael Kerrisk @ 2015-05-06  6:15 UTC (permalink / raw)
  To: Zach Brown
  Cc: Linux Kernel, Linux-Fsdevel, Linux btrfs Developers List,
	linux-nfs, linux-scsi, Linux API

[CC += linux-api@vger.kernel.org]

Zach,

Since this is a kernel-user-space API change, please CC linux-api@.
The kernel source file Documentation/SubmitChecklist notes that all
Linux kernel patches that change userspace interfaces should be CCed
to linux-api@vger.kernel.org, so that the various parties who are
interested in API changes are informed. For further information, see
https://www.kernel.org/doc/man-pages/linux-api-ml.html

Thanks,

Michael




On Sat, Apr 11, 2015 at 12:00 AM, Zach Brown <zab@redhat.com> wrote:
> Hello everyone!
>
> Here's my current attempt at the most basic system call interface for
> offloading copying between files.  The system call and vfs function
> are relatively light wrappers around the file_operation method that
> does the heavy lifting.
>
> There was interest at LSF in getting the basic infrastructure merged
> before worrying about adding behavioural flags and more complicated
> implementations.  This series only offers a refactoring of the btrfs
> clone ioctl as an example of an implementation of the file
> copy_file_range method.
>
> I've added support for copy_file_range() to xfs_io in xfsprogs and
> have the start of an xfstest that tests the system call.  I'll send
> those to fstests@.
>
> So how does this look?
>
> Do we want to merge this and let the NFS and block XCOPY patches add
> their changes when they're ready?
>
> - z



-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH RFC 0/3] simple copy offloading system call
  2015-05-06  6:15   ` Michael Kerrisk
  (?)
@ 2015-05-07  2:52   ` Andy Lutomirski
  -1 siblings, 0 replies; 33+ messages in thread
From: Andy Lutomirski @ 2015-05-07  2:52 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Linux FS Devel, linux-kernel, Linux API, Linux SCSI List,
	linux-nfs, Linux btrfs Developers List, Zach Brown

On May 6, 2015 11:45 AM, "Michael Kerrisk" <mtk.manpages@gmail.com> wrote:
>
> [CC += linux-api@vger.kernel.org]
>
> Zach,
>
> Since this is a kernel-user-space API change, please CC linux-api@.
> The kernel source file Documentation/SubmitChecklist notes that all
> Linux kernel patches that change userspace interfaces should be CCed
> to linux-api@vger.kernel.org, so that the various parties who are
> interested in API changes are informed. For further information, see
> https://www.kernel.org/doc/man-pages/linux-api-ml.html
>
> Thanks,
>
> Michael
>
>
>
>
> On Sat, Apr 11, 2015 at 12:00 AM, Zach Brown <zab@redhat.com> wrote:
> > Hello everyone!
> >
> > Here's my current attempt at the most basic system call interface for
> > offloading copying between files.  The system call and vfs function
> > are relatively light wrappers around the file_operation method that
> > does the heavy lifting.
> >
> > There was interest at LSF in getting the basic infrastructure merged
> > before worrying about adding behavioural flags and more complicated
> > implementations.  This series only offers a refactoring of the btrfs
> > clone ioctl as an example of an implementation of the file
> > copy_file_range method.
> >
> > I've added support for copy_file_range() to xfs_io in xfsprogs and
> > have the start of an xfstest that tests the system call.  I'll send
> > those to fstests@.
> >
> > So how does this look?
> >
> > Do we want to merge this and let the NFS and block XCOPY patches add
> > their changes when they're ready?

This sounds enough like splice that I'm wondering why the API isn't splice.

--Andy

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-05-07  2:52 UTC | newest]

Thread overview: 33+ messages
2015-04-10 22:00 [PATCH RFC 0/3] simple copy offloading system call Zach Brown
2015-04-10 22:00 ` [PATCH RFC 1/3] vfs: add copy_file_range syscall and vfs helper Zach Brown
2015-04-10 22:36   ` Trond Myklebust
2015-04-10 22:36     ` Trond Myklebust
2015-04-11  0:02     ` Zach Brown
2015-04-11  0:24       ` Trond Myklebust
2015-04-11 13:04         ` Jeff Layton
2015-04-13 16:32           ` Zach Brown
2015-04-14 16:53           ` Christoph Hellwig
2015-04-14 16:58             ` Christoph Hellwig
2015-04-14 17:16             ` Anna Schumaker
2015-04-14 17:16               ` Anna Schumaker
2015-04-14 17:16               ` Anna Schumaker
2015-04-14 18:19               ` J. Bruce Fields
2015-04-14 18:19                 ` J. Bruce Fields
2015-04-14 18:22                 ` Zach Brown
2015-04-14 18:22                   ` Zach Brown
2015-04-14 18:29                   ` J. Bruce Fields
2015-04-14 18:29                     ` J. Bruce Fields
2015-04-14 18:54                     ` Zach Brown
2015-04-14 18:54                       ` Zach Brown
2015-04-14 19:23                       ` Christoph Hellwig
2015-04-14 19:23                         ` Christoph Hellwig
2015-04-14 20:04                         ` Zach Brown
2015-04-14 20:04                           ` Zach Brown
2015-04-10 23:01   ` Andreas Dilger
2015-04-10 22:00 ` [PATCH RFC 2/3] x86: add sys_copy_file_range to syscall tables Zach Brown
2015-04-10 22:00 ` [PATCH RFC 3/3] btrfs: add .copy_file_range file operation Zach Brown
2015-04-14 17:08   ` Chris Mason
2015-04-14 17:08     ` Chris Mason
2015-05-06  6:15 ` [PATCH RFC 0/3] simple copy offloading system call Michael Kerrisk
2015-05-06  6:15   ` Michael Kerrisk
2015-05-07  2:52   ` Andy Lutomirski
