linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] xfs: faster unaligned copy_file_range
@ 2020-12-01  3:37 Darrick J. Wong
  2020-12-01  3:37 ` [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2020-12-01  3:37 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

Here's a quick patch to speed up copy_file_range on unaligned ranges by
using reflink when possible to reduce the amount of pagecache copying.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=copy-range-faster-5.11
---
 fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-01  3:37 [PATCH 0/1] xfs: faster unaligned copy_file_range Darrick J. Wong
@ 2020-12-01  3:37 ` Darrick J. Wong
  2020-12-01 10:02   ` Christoph Hellwig
  2020-12-01 15:25   ` Brian Foster
  0 siblings, 2 replies; 8+ messages in thread
From: Darrick J. Wong @ 2020-12-01  3:37 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Add a copy_file_range handler to XFS so that we can accelerate file
copies with reflink when the source and destination ranges are not
block-aligned.  We'll use the generic pagecache copy to handle the
unaligned edges and attempt to reflink the middle.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..9d1bb0dc30e2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1119,6 +1119,104 @@ xfs_file_remap_range(
 	return remapped > 0 ? remapped : ret;
 }
 
+/*
+ * Decide if we want to use reflink to accelerate a copy_file_range request.
+ *
+ * We need to use the generic pagecache copy routine if there's no reflink; if
+ * the two files are on different filesystems; if the two files are on
+ * different devices; or if the two offsets are not at the same offset within
+ * an fs block.  Studies on the author's computer show that reflink doesn't
+ * speed up copies smaller than 32k, so use the page cache for those.
+ */
+static inline bool
+xfs_want_reflink_copy_range(
+	struct xfs_inode	*src,
+	unsigned int		src_off,
+	struct xfs_inode	*dst,
+	unsigned int		dst_off,
+	size_t			len)
+{
+	struct xfs_mount	*mp = src->i_mount;
+
+	if (len < 32768)
+		return false;
+	if (mp != dst->i_mount)
+		return false;
+	if (!xfs_sb_version_hasreflink(&mp->m_sb))
+		return false;
+	if (XFS_IS_REALTIME_INODE(src) != XFS_IS_REALTIME_INODE(dst))
+		return false;
+	return (src_off & mp->m_blockmask) == (dst_off & mp->m_blockmask);
+}
+
+STATIC ssize_t
+xfs_file_copy_range(
+	struct file		*src_file,
+	loff_t			src_off,
+	struct file		*dst_file,
+	loff_t			dst_off,
+	size_t			len,
+	unsigned int		flags)
+{
+	struct inode		*inode_src = file_inode(src_file);
+	struct xfs_inode	*src = XFS_I(inode_src);
+	struct inode		*inode_dst = file_inode(dst_file);
+	struct xfs_inode	*dst = XFS_I(inode_dst);
+	struct xfs_mount	*mp = src->i_mount;
+	loff_t			copy_ret;
+	loff_t			next_block;
+	size_t			copy_len;
+	ssize_t			total_copied = 0;
+
+	/* Bypass all this if no copy acceleration is possible. */
+	if (!xfs_want_reflink_copy_range(src, src_off, dst, dst_off, len))
+		goto use_generic;
+
+	/* Use the regular copy until we're block aligned at the start. */
+	next_block = round_up(src_off + 1, mp->m_sb.sb_blocksize);
+	copy_len = min_t(size_t, len, next_block - src_off);
+	if (copy_len > 0) {
+		copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
+					dst_off, copy_len, flags);
+		if (copy_ret < 0)
+			return copy_ret;
+
+		src_off += copy_ret;
+		dst_off += copy_ret;
+		len -= copy_ret;
+		total_copied += copy_ret;
+		if (copy_ret < copy_len || len == 0)
+			return total_copied;
+	}
+
+	/*
+	 * Now try to reflink as many full blocks as we can.  If the end of the
+	 * copy request wasn't block-aligned or the reflink fails, we'll just
+	 * fall into the generic copy to do the rest.
+	 */
+	copy_len = round_down(len, mp->m_sb.sb_blocksize);
+	if (copy_len > 0) {
+		copy_ret = xfs_file_remap_range(src_file, src_off, dst_file,
+				dst_off, copy_len, REMAP_FILE_CAN_SHORTEN);
+		if (copy_ret >= 0) {
+			src_off += copy_ret;
+			dst_off += copy_ret;
+			len -= copy_ret;
+			total_copied += copy_ret;
+			if (copy_ret < copy_len || len == 0)
+				return total_copied;
+		}
+	}
+
+use_generic:
+	/* Use the regular copy to deal with leftover bytes. */
+	copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
+			dst_off, len, flags);
+	if (copy_ret < 0)
+		return copy_ret;
+	return total_copied + copy_ret;
+}
+
 STATIC int
 xfs_file_open(
 	struct inode	*inode,
@@ -1381,6 +1479,7 @@ const struct file_operations xfs_file_operations = {
 	.get_unmapped_area = thp_get_unmapped_area,
 	.fallocate	= xfs_file_fallocate,
 	.fadvise	= xfs_file_fadvise,
+	.copy_file_range = xfs_file_copy_range,
 	.remap_file_range = xfs_file_remap_range,
 };
 


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-01  3:37 ` [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls Darrick J. Wong
@ 2020-12-01 10:02   ` Christoph Hellwig
  2020-12-06 23:21     ` Darrick J. Wong
  2020-12-01 15:25   ` Brian Foster
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2020-12-01 10:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Add a copy_file_range handler to XFS so that we can accelerate file
> copies with reflink when the source and destination ranges are not
> block-aligned.  We'll use the generic pagecache copy to handle the
> unaligned edges and attempt to reflink the middle.

Isn't this something we could better handle in the VFS (or a generic
helper) so that all file systems that support reflink could benefit?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-01  3:37 ` [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls Darrick J. Wong
  2020-12-01 10:02   ` Christoph Hellwig
@ 2020-12-01 15:25   ` Brian Foster
  2020-12-06 23:24     ` Darrick J. Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Brian Foster @ 2020-12-01 15:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Add a copy_file_range handler to XFS so that we can accelerate file
> copies with reflink when the source and destination ranges are not
> block-aligned.  We'll use the generic pagecache copy to handle the
> unaligned edges and attempt to reflink the middle.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 99 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5b0f93f73837..9d1bb0dc30e2 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1119,6 +1119,104 @@ xfs_file_remap_range(
>  	return remapped > 0 ? remapped : ret;
>  }
>  
...
> +STATIC ssize_t
> +xfs_file_copy_range(
> +	struct file		*src_file,
> +	loff_t			src_off,
> +	struct file		*dst_file,
> +	loff_t			dst_off,
> +	size_t			len,
> +	unsigned int		flags)
> +{
> +	struct inode		*inode_src = file_inode(src_file);
> +	struct xfs_inode	*src = XFS_I(inode_src);
> +	struct inode		*inode_dst = file_inode(dst_file);
> +	struct xfs_inode	*dst = XFS_I(inode_dst);
> +	struct xfs_mount	*mp = src->i_mount;
> +	loff_t			copy_ret;
> +	loff_t			next_block;
> +	size_t			copy_len;
> +	ssize_t			total_copied = 0;
> +
> +	/* Bypass all this if no copy acceleration is possible. */
> +	if (!xfs_want_reflink_copy_range(src, src_off, dst, dst_off, len))
> +		goto use_generic;
> +
> +	/* Use the regular copy until we're block aligned at the start. */
> +	next_block = round_up(src_off + 1, mp->m_sb.sb_blocksize);

Why the +1? AFAICT this means we manually copy the first block if
src_off does happen to be block aligned. Is this an assumption based on
the caller attempting ->remap_file_range() first?

BTW, if we do happen to be called in some (theoretical) corner case
where remap doesn't work unrelated to alignment, it seems this would
unconditionally break the manual copy into multiple parts (first block +
the rest). It's not immediately clear to me if that's significant from a
performance perspective, but I wonder if it would be nicer here to
filter that out more explicitly. For example, run the remap checks on
the block aligned offset/len first, or skip the remap if the caller has
provided a block aligned start (i.e. hinting that remap failed for other
reasons), or perhaps even implement this so it conditionally performs a
short manual copy so the next retry would fall into ->remap_file_range()
with aligned offsets, etc. Thoughts?

> +	copy_len = min_t(size_t, len, next_block - src_off);
> +	if (copy_len > 0) {
> +		copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> +					dst_off, copy_len, flags);
> +		if (copy_ret < 0)
> +			return copy_ret;
> +
> +		src_off += copy_ret;
> +		dst_off += copy_ret;
> +		len -= copy_ret;
> +		total_copied += copy_ret;
> +		if (copy_ret < copy_len || len == 0)
> +			return total_copied;
> +	}
> +
> +	/*
> +	 * Now try to reflink as many full blocks as we can.  If the end of the
> +	 * copy request wasn't block-aligned or the reflink fails, we'll just
> +	 * fall into the generic copy to do the rest.
> +	 */
> +	copy_len = round_down(len, mp->m_sb.sb_blocksize);
> +	if (copy_len > 0) {
> +		copy_ret = xfs_file_remap_range(src_file, src_off, dst_file,
> +				dst_off, copy_len, REMAP_FILE_CAN_SHORTEN);
> +		if (copy_ret >= 0) {
> +			src_off += copy_ret;
> +			dst_off += copy_ret;
> +			len -= copy_ret;
> +			total_copied += copy_ret;
> +			if (copy_ret < copy_len || len == 0)
> +				return total_copied;

Any reason we return a potential short copy here, but fall into the
manual copy if the reflink outright fails?

> +		}
> +	}
> +
> +use_generic:
> +	/* Use the regular copy to deal with leftover bytes. */
> +	copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> +			dst_off, len, flags);
> +	if (copy_ret < 0)
> +		return copy_ret;

Perhaps this should also check/return total_copied in the event we've
already done some work..?

Brian

> +	return total_copied + copy_ret;
> +}
> +
>  STATIC int
>  xfs_file_open(
>  	struct inode	*inode,
> @@ -1381,6 +1479,7 @@ const struct file_operations xfs_file_operations = {
>  	.get_unmapped_area = thp_get_unmapped_area,
>  	.fallocate	= xfs_file_fallocate,
>  	.fadvise	= xfs_file_fadvise,
> +	.copy_file_range = xfs_file_copy_range,
>  	.remap_file_range = xfs_file_remap_range,
>  };
>  
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-01 10:02   ` Christoph Hellwig
@ 2020-12-06 23:21     ` Darrick J. Wong
  2020-12-07 14:20       ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2020-12-06 23:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Tue, Dec 01, 2020 at 10:02:06AM +0000, Christoph Hellwig wrote:
> On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Add a copy_file_range handler to XFS so that we can accelerate file
> > copies with reflink when the source and destination ranges are not
> > block-aligned.  We'll use the generic pagecache copy to handle the
> > unaligned edges and attempt to reflink the middle.
> 
> Isn't this something we could better handle in the VFS (or a generic
> helper) so that all file systems that support reflink could benefit?

Maybe.  I don't know if it's universally true that all filesystems
should fall back to reflinking the middle range and pagecache copying
the unaligned start/end.

The other thing is that xfs can easily support reflink on rtextsize > 1,
but that adds the requirement that we set i_blocksize to a larger value
than we do now... or find some other way to convey allocation unit size
to a generic version of the fallback.  OTOH that's pretty easy to do
from xfs_copy_file_range.

--D

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-01 15:25   ` Brian Foster
@ 2020-12-06 23:24     ` Darrick J. Wong
  2020-12-07 14:01       ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2020-12-06 23:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Dec 01, 2020 at 10:25:48AM -0500, Brian Foster wrote:
> On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Add a copy_file_range handler to XFS so that we can accelerate file
> > copies with reflink when the source and destination ranges are not
> > block-aligned.  We'll use the generic pagecache copy to handle the
> > unaligned edges and attempt to reflink the middle.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 99 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 5b0f93f73837..9d1bb0dc30e2 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1119,6 +1119,104 @@ xfs_file_remap_range(
> >  	return remapped > 0 ? remapped : ret;
> >  }
> >  
> ...
> > +STATIC ssize_t
> > +xfs_file_copy_range(
> > +	struct file		*src_file,
> > +	loff_t			src_off,
> > +	struct file		*dst_file,
> > +	loff_t			dst_off,
> > +	size_t			len,
> > +	unsigned int		flags)
> > +{
> > +	struct inode		*inode_src = file_inode(src_file);
> > +	struct xfs_inode	*src = XFS_I(inode_src);
> > +	struct inode		*inode_dst = file_inode(dst_file);
> > +	struct xfs_inode	*dst = XFS_I(inode_dst);
> > +	struct xfs_mount	*mp = src->i_mount;
> > +	loff_t			copy_ret;
> > +	loff_t			next_block;
> > +	size_t			copy_len;
> > +	ssize_t			total_copied = 0;
> > +
> > +	/* Bypass all this if no copy acceleration is possible. */
> > +	if (!xfs_want_reflink_copy_range(src, src_off, dst, dst_off, len))
> > +		goto use_generic;
> > +
> > +	/* Use the regular copy until we're block aligned at the start. */
> > +	next_block = round_up(src_off + 1, mp->m_sb.sb_blocksize);
> 
> Why the +1? AFAICT this means we manually copy the first block if
> src_off does happen to be block aligned. Is this an assumption based on
> the caller attempting ->remap_file_range() first?

Yes.  The VFS always tries that first.

> BTW, if we do happen to be called in some (theoretical) corner case
> where remap doesn't work unrelated to alignment, it seems this would
> unconditionally break the manual copy into multiple parts (first block +
> the rest). It's not immediately clear to me if that's significant from a
> performance perspective,

I doubt it, since that's usually just copying around the pagecache.

> but I wonder if it would be nicer here to
> filter that out more explicitly. For example, run the remap checks on
> the block aligned offset/len first, or skip the remap if the caller has
> provided a block aligned start (i.e. hinting that remap failed for other
> reasons),

Yes, checking the block alignment is a good suggestion.  Will fix.

> or perhaps even implement this so it conditionally performs a
> short manual copy so the next retry would fall into ->remap_file_range()
> with aligned offsets, etc.

Hm.  That could be a thing too, though my opinion is that we should make
as much progress as we can before exiting the kernel.

--D

> Thoughts?
> 
> > +	copy_len = min_t(size_t, len, next_block - src_off);
> > +	if (copy_len > 0) {
> > +		copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > +					dst_off, copy_len, flags);
> > +		if (copy_ret < 0)
> > +			return copy_ret;
> > +
> > +		src_off += copy_ret;
> > +		dst_off += copy_ret;
> > +		len -= copy_ret;
> > +		total_copied += copy_ret;
> > +		if (copy_ret < copy_len || len == 0)
> > +			return total_copied;
> > +	}
> > +
> > +	/*
> > +	 * Now try to reflink as many full blocks as we can.  If the end of the
> > +	 * copy request wasn't block-aligned or the reflink fails, we'll just
> > +	 * fall into the generic copy to do the rest.
> > +	 */
> > +	copy_len = round_down(len, mp->m_sb.sb_blocksize);
> > +	if (copy_len > 0) {
> > +		copy_ret = xfs_file_remap_range(src_file, src_off, dst_file,
> > +				dst_off, copy_len, REMAP_FILE_CAN_SHORTEN);
> > +		if (copy_ret >= 0) {
> > +			src_off += copy_ret;
> > +			dst_off += copy_ret;
> > +			len -= copy_ret;
> > +			total_copied += copy_ret;
> > +			if (copy_ret < copy_len || len == 0)
> > +				return total_copied;
> 
> Any reason we return a potential short copy here, but fall into the
> manual copy if the reflink outright fails?
> 
> > +		}
> > +	}
> > +
> > +use_generic:
> > +	/* Use the regular copy to deal with leftover bytes. */
> > +	copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > +			dst_off, len, flags);
> > +	if (copy_ret < 0)
> > +		return copy_ret;
> 
> Perhaps this should also check/return total_copied in the event we've
> already done some work..?
> 
> Brian
> 
> > +	return total_copied + copy_ret;
> > +}
> > +
> >  STATIC int
> >  xfs_file_open(
> >  	struct inode	*inode,
> > @@ -1381,6 +1479,7 @@ const struct file_operations xfs_file_operations = {
> >  	.get_unmapped_area = thp_get_unmapped_area,
> >  	.fallocate	= xfs_file_fallocate,
> >  	.fadvise	= xfs_file_fadvise,
> > +	.copy_file_range = xfs_file_copy_range,
> >  	.remap_file_range = xfs_file_remap_range,
> >  };
> >  
> > 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-06 23:24     ` Darrick J. Wong
@ 2020-12-07 14:01       ` Brian Foster
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2020-12-07 14:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Dec 06, 2020 at 03:24:54PM -0800, Darrick J. Wong wrote:
> On Tue, Dec 01, 2020 at 10:25:48AM -0500, Brian Foster wrote:
> > On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Add a copy_file_range handler to XFS so that we can accelerate file
> > > copies with reflink when the source and destination ranges are not
> > > block-aligned.  We'll use the generic pagecache copy to handle the
> > > unaligned edges and attempt to reflink the middle.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 99 insertions(+)
> > > 
> > > 
> > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > > index 5b0f93f73837..9d1bb0dc30e2 100644
> > > --- a/fs/xfs/xfs_file.c
> > > +++ b/fs/xfs/xfs_file.c
> > > @@ -1119,6 +1119,104 @@ xfs_file_remap_range(
> > >  	return remapped > 0 ? remapped : ret;
> > >  }
> > >  
> > ...
> > > +STATIC ssize_t
> > > +xfs_file_copy_range(
> > > +	struct file		*src_file,
> > > +	loff_t			src_off,
> > > +	struct file		*dst_file,
> > > +	loff_t			dst_off,
> > > +	size_t			len,
> > > +	unsigned int		flags)
> > > +{
> > > +	struct inode		*inode_src = file_inode(src_file);
> > > +	struct xfs_inode	*src = XFS_I(inode_src);
> > > +	struct inode		*inode_dst = file_inode(dst_file);
> > > +	struct xfs_inode	*dst = XFS_I(inode_dst);
> > > +	struct xfs_mount	*mp = src->i_mount;
> > > +	loff_t			copy_ret;
> > > +	loff_t			next_block;
> > > +	size_t			copy_len;
> > > +	ssize_t			total_copied = 0;
> > > +
> > > +	/* Bypass all this if no copy acceleration is possible. */
> > > +	if (!xfs_want_reflink_copy_range(src, src_off, dst, dst_off, len))
> > > +		goto use_generic;
> > > +
> > > +	/* Use the regular copy until we're block aligned at the start. */
> > > +	next_block = round_up(src_off + 1, mp->m_sb.sb_blocksize);
> > 
> > Why the +1? AFAICT this means we manually copy the first block if
> > src_off does happen to be block aligned. Is this an assumption based on
> > the caller attempting ->remap_file_range() first?
> 
> Yes.  The VFS always tries that first.
> 
> > BTW, if we do happen to be called in some (theoretical) corner case
> > where remap doesn't work unrelated to alignment, it seems this would
> > unconditionally break the manual copy into multiple parts (first block +
> > the rest). It's not immediately clear to me if that's significant from a
> > performance perspective,
> 
> I doubt it, since that's usually just copying around the pagecache.
> 

Ok, comment please.

> > but I wonder if it would be nicer here to
> > filter that out more explicitly. For example, run the remap checks on
> > the block aligned offset/len first, or skip the remap if the caller has
> > provided a block aligned start (i.e. hinting that remap failed for other
> > reasons),
> 
> Yes, checking the block alignment is a good suggestion.  Will fix.
> 
> > or perhaps even implement this so it conditionally performs a
> > short manual copy so the next retry would fall into ->remap_file_range()
> > with aligned offsets, etc.
> 
> Hm.  That could be a thing too, though my opinion is that we should make
> as much progress as we can before exiting the kernel.
> 

Yeah, the more I thought about this the more it seemed like a hack and
not really sane for a production system.

Brian

> --D
> 
> > Thoughts?
> > 
> > > +	copy_len = min_t(size_t, len, next_block - src_off);
> > > +	if (copy_len > 0) {
> > > +		copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > > +					dst_off, copy_len, flags);
> > > +		if (copy_ret < 0)
> > > +			return copy_ret;
> > > +
> > > +		src_off += copy_ret;
> > > +		dst_off += copy_ret;
> > > +		len -= copy_ret;
> > > +		total_copied += copy_ret;
> > > +		if (copy_ret < copy_len || len == 0)
> > > +			return total_copied;
> > > +	}
> > > +
> > > +	/*
> > > +	 * Now try to reflink as many full blocks as we can.  If the end of the
> > > +	 * copy request wasn't block-aligned or the reflink fails, we'll just
> > > +	 * fall into the generic copy to do the rest.
> > > +	 */
> > > +	copy_len = round_down(len, mp->m_sb.sb_blocksize);
> > > +	if (copy_len > 0) {
> > > +		copy_ret = xfs_file_remap_range(src_file, src_off, dst_file,
> > > +				dst_off, copy_len, REMAP_FILE_CAN_SHORTEN);
> > > +		if (copy_ret >= 0) {
> > > +			src_off += copy_ret;
> > > +			dst_off += copy_ret;
> > > +			len -= copy_ret;
> > > +			total_copied += copy_ret;
> > > +			if (copy_ret < copy_len || len == 0)
> > > +				return total_copied;
> > 
> > Any reason we return a potential short copy here, but fall into the
> > manual copy if the reflink outright fails?
> > 
> > > +		}
> > > +	}
> > > +
> > > +use_generic:
> > > +	/* Use the regular copy to deal with leftover bytes. */
> > > +	copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > > +			dst_off, len, flags);
> > > +	if (copy_ret < 0)
> > > +		return copy_ret;
> > 
> > Perhaps this should also check/return total_copied in the event we've
> > already done some work..?
> > 
> > Brian
> > 
> > > +	return total_copied + copy_ret;
> > > +}
> > > +
> > >  STATIC int
> > >  xfs_file_open(
> > >  	struct inode	*inode,
> > > @@ -1381,6 +1479,7 @@ const struct file_operations xfs_file_operations = {
> > >  	.get_unmapped_area = thp_get_unmapped_area,
> > >  	.fallocate	= xfs_file_fallocate,
> > >  	.fadvise	= xfs_file_fadvise,
> > > +	.copy_file_range = xfs_file_copy_range,
> > >  	.remap_file_range = xfs_file_remap_range,
> > >  };
> > >  
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
  2020-12-06 23:21     ` Darrick J. Wong
@ 2020-12-07 14:20       ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2020-12-07 14:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Sun, Dec 06, 2020 at 03:21:54PM -0800, Darrick J. Wong wrote:
> On Tue, Dec 01, 2020 at 10:02:06AM +0000, Christoph Hellwig wrote:
> > On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Add a copy_file_range handler to XFS so that we can accelerate file
> > > copies with reflink when the source and destination ranges are not
> > > block-aligned.  We'll use the generic pagecache copy to handle the
> > > unaligned edges and attempt to reflink the middle.
> > 
> > Isn't this something we could better handle in the VFS (or a generic
> > helper) so that all file systems that support reflink could benefit?
> 
> Maybe.  I don't know if it's universally true that all filesystems
> should fall back to reflinking the middle range and pagecache copying
> the unaligned start/end.
> 
> The other thing is that xfs can easily support reflink on rtextsize > 1,
> but that adds the requirement that we set i_blocksize to a larger value
> than we do now... or find some other way to convey allocation unit size
> to a generic version of the fallback.  OTOH that's pretty easy to do
> from xfs_copy_file_range.

I think you can basically turn xfs_want_reflink_copy_range into a
callback supplied by the fs for the generic helper, and to deal with
the rtextsize problem just return the relevant block size from the
helper.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-12-07 14:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-01  3:37 [PATCH 0/1] xfs: faster unaligned copy_file_range Darrick J. Wong
2020-12-01  3:37 ` [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls Darrick J. Wong
2020-12-01 10:02   ` Christoph Hellwig
2020-12-06 23:21     ` Darrick J. Wong
2020-12-07 14:20       ` Christoph Hellwig
2020-12-01 15:25   ` Brian Foster
2020-12-06 23:24     ` Darrick J. Wong
2020-12-07 14:01       ` Brian Foster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).